Inactive array – mdadm: Cannot get array info for /dev/md126

Replacing a disk may sometimes be challenging, especially with software RAID. If your software RAID1 has gone inactive, this article might be for you!
After booting from a live CD or a rescue PXE system, all RAID devices may come up inactive despite the loaded personalities. We have a similar article on the subject – Recovering MD array and mdadm: Cannot get array info for /dev/md0

livecd ~ # cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear] [multipath] 
md125 : inactive sdb3[1](S)
      1047552 blocks super 1.2
       
md126 : inactive sdb1[1](S)
      52427776 blocks super 1.2
       
md127 : inactive sdb2[1](S)
      16515072 blocks super 1.2
       
unused devices: <none>

The personalities line shows that the kernel modules were loaded successfully – “[raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear] [multipath]”. Still, something went wrong: the devices were not assembled properly and are stuck in inactive state.
A device in inactive state cannot recover and no disks can be added to it:

livecd ~ # mdadm --add /dev/md125 /dev/sda3
mdadm: Cannot get array info for /dev/md125

In general, to recover a RAID in inactive state:

  1. Check whether the kernel modules are loaded. If the RAID setup uses RAID1, the “Personalities” line in /proc/mdstat should include “[raid1]”.
  2. Try to run the device with “mdadm --run”.
  3. Add the missing device to the RAID device with “mdadm --add” once the status of the RAID device changes to “active (auto-read-only)” or just “active”.
  4. Wait for the RAID device to recover (a condensed command sketch follows this list).
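
A condensed sketch of the whole procedure could look like the lines below, using the device names from this article (adjust /dev/md126 and /dev/sda1 to your own layout):

# load the missing personality (RAID1 in this case)
modprobe raid1
# start the degraded array
mdadm --run /dev/md126
# add the replacement partition and check the rebuild status
mdadm --add /dev/md126 /dev/sda1
cat /proc/mdstat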


Here are the steps and the RAID status and its changes:

STEP 1) Check if the kernel modules are loaded.

Just cat the /proc/mdstat and search for the “Personalities” line:

livecd ~ # cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear] [multipath] 
md125 : inactive sdb3[1](S)
      1047552 blocks super 1.2
       
md126 : inactive sdb1[1](S)
      52427776 blocks super 1.2
       
md127 : inactive sdb2[1](S)
      16515072 blocks super 1.2
       
unused devices: <none>

The above example shows that all software RAID modules are loaded successfully. If one of them is missing, it is simple to load. For example, to load the RAID1 module execute:

modprobe raid1
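
To confirm the module really ended up loaded, a quick check (assuming a standard live environment) is:

lsmod | grep raid1
grep Personalities /proc/mdstat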

If you do not know the type of the inactive RAID, you can always check the metadata of one of the partitions listed in /proc/mdstat:

livecd ~ # mdadm -E /dev/sdb3
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 474d4d5b:4d995cb5:a51a8287:28fb4f1a
           Name : srv.example.com:boot
  Creation Time : Fri Oct 25 12:28:25 2019
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 2095104 (1023.00 MiB 1072.69 MB)
     Array Size : 1047552 (1023.00 MiB 1072.69 MB)
    Data Offset : 4096 sectors
   Super Offset : 8 sectors
   Unused Space : before=4016 sectors, after=0 sectors
          State : clean
    Device UUID : afd7785e:4f987f6d:0e66b02a:43071feb

Internal Bitmap : 8 sectors from superblock
    Update Time : Fri Apr 10 19:00:59 2020
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 79709c4e - correct
         Events : 47


   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)

The Raid Level is raid1 and the device is in clean state (which means this is not the faulty disk/partition/device), so if the kernel RAID1 module is missing, just load it.
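
If you only need the level and state lines, the mdadm -E output can be filtered, for example:

mdadm -E /dev/sdb3 | grep -E 'Raid Level|State'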

STEP 2) Try to run the device with “mdadm --run”

Run the array:

livecd ~ # mdadm --run /dev/md126
mdadm: started array /dev/md/srv.example.com:root
livecd ~ # cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear] [multipath] 
md125 : inactive sdb3[1](S)
      1047552 blocks super 1.2
       
md126 : active (auto-read-only) raid1 sdb1[1]
      52427776 blocks super 1.2 [2/1] [_U]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : inactive sdb2[1](S)
      16515072 blocks super 1.2
       
unused devices: <none>

The RAID device md126 has been identified and its state has changed to active. It is shown as “active (auto-read-only)”, which means it will switch to plain active on the first write (for example, when it is mounted read-write). One disk is missing: “[_U]”.
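The auto-read-only flag clears on its own at the first write, but if you prefer to switch the array to read-write explicitly, mdadm provides a --readwrite option (optional here):

mdadm --readwrite /dev/md126
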
Execute the run command for the other two RAID devices:

livecd ~ # mdadm --run /dev/md125
mdadm: started array /dev/md/srv.example.com:boot
livecd ~ # mdadm --run /dev/md127
mdadm: started array /dev/md/srv.example.com:swap
livecd ~ # cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear] [multipath] 
md125 : active (auto-read-only) raid1 sdb3[1]
      1047552 blocks super 1.2 [2/1] [_U]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : active (auto-read-only) raid1 sdb1[1]
      52427776 blocks super 1.2 [2/1] [_U]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : active (auto-read-only) raid1 sdb2[1]
      16515072 blocks super 1.2 [2/1] [_U]
      
unused devices: <none>
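
At this point you can optionally inspect each array with mdadm --detail; on a degraded RAID1 it typically reports the state as “clean, degraded” with one slot marked as removed:

mdadm --detail /dev/md126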

STEP 3) Add the missing device to the RAID device

If you have not already done so, copy the partition layout from the remaining disk to the new one:

sgdisk /dev/sdb -R /dev/sda
sgdisk -G /dev/sda
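
To verify the layout was replicated correctly, the partition tables of both disks can be compared, for example with sgdisk's print option:

sgdisk -p /dev/sdb
sgdisk -p /dev/sda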

Add the missing partitions to the RAID devices and wait for the recovery to finish.

livecd ~ # mdadm --add /dev/md126 /dev/sda1
mdadm: added /dev/sda1
livecd ~ # mdadm --add /dev/md125 /dev/sda3
mdadm: added /dev/sda3
livecd ~ # mdadm --add /dev/md127 /dev/sda2
mdadm: added /dev/sda2
livecd ~ # cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear] [multipath] 
md125 : active raid1 sda3[2] sdb3[1]
      1047552 blocks super 1.2 [2/1] [_U]
        resync=DELAYED
      bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : active raid1 sda1[2] sdb1[1]
      52427776 blocks super 1.2 [2/1] [_U]
      [=>...................]  recovery =  6.0% (3170496/52427776) finish=6.2min speed=132104K/sec
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : active raid1 sda2[2] sdb2[1]
      16515072 blocks super 1.2 [2/1] [_U]
        resync=DELAYED
      
unused devices: <none>
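
The rebuild can be monitored until all three arrays report [2/2] [UU], for example with:

watch -n 5 cat /proc/mdstat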

Reinstall GRUB?

In most cases, a GRUB installation should be performed before restarting the server. Here is how to do it in BIOS (Legacy) mode:

livecd ~ # mkdir /mnt/recover/
livecd ~ # mount /dev/md126 /mnt/recover/
livecd ~ # mount -o bind /dev /mnt/recover/dev
livecd ~ # mount -o bind /proc /mnt/recover/proc
livecd ~ # mount -o bind /sys /mnt/recover/sys
livecd ~ # chroot /mnt/recover/
[root@livecd (srv) /]# . /etc/profile
[root@livecd (srv) /]# grub2-install /dev/sda
Installing for i386-pc platform.
grub2-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
grub2-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
Installation finished. No error reported.
[root@livecd (srv) /]# nano /etc/fstab 
[root@livecd (srv) /]# exit
livecd ~ # umount /mnt/recover/dev
livecd ~ # umount /mnt/recover/proc/
livecd ~ # umount /mnt/recover/sys
livecd ~ # umount /mnt/recover
livecd ~ # reboot
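
A common extra step (not shown above) is to install GRUB on the other member of the mirror as well, while still inside the chroot, so the server can boot from either disk. Assuming the same layout as above:

grub2-install /dev/sdb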
