megacli – FW error description: The current operation is not allowed … offline or missing virtual drives

Probably everyone who has ever touched LSI controllers and the megacli tool has had this error or he is going to have it for sure! It’s only a matter of time when you receive it when creating or replacing a disk! No this is not another article for this error!!!

FW error description: The current operation is not allowed because the controller has data in cache for offline or missing virtual drives.

Probably you want to replace a failed disk and you have inserted the new one and you cannot add it into the RAID array or something similar. In our case, the drive failed and even stopped working, the slot showed “missing drive” status. We replace the hard drive and it was in “Unconfigured(Good), Spun Up”

You have to use exactly the name of the Virtual Drive with the zero leading if any and even you may use double quotes

root@srv ~ # megacli -DiscardPreservedCache -L"08" -a0
                                     
Adapter #0

Virtual Drive(Target ID 08): Preserved Cache Data Cleared.

Exit Code: 0x00

As you can see: “Target ID 08“, the ID is 08, NOT 8!

Here are some unsuccessful tries (even with 08 was unsuccessful – it might be from our cli version or shell, but it did not work!). Note the last command to get the status for all devices:

root@srv ~ # megacli -DiscardPreservedCache -L8 -a0
                                     
Adapter 0: No Virtual Drive has Preserved Cache Data.

Exit Code: 0x00
root@srv ~ # megacli -DiscardPreservedCache -L08 -a0
                                     
Adapter 0: No Virtual Drive has Preserved Cache Data.

Exit Code: 0x00

root@srv ~ # megacli -GetPreservedCacheList -a0
                                     
Adapter #0

Virtual Drive(Target ID 08): Missing.

Exit Code: 0x00

It reports there is no drive with preserved cache on the very ID, YOU TYPED, but then getting the Preserved cached list there is a drive in the list because 8 is different from 08.

For farther problems getting your new disk in the array you can check our tested replace procedure: megacli – restart a rebuild with a disk in failed state

LSI MegaRAID 2108 freezes with abort command and all processes hang up in disk sleep

It happened to one of our old LSI MegaRAID 2108 controllers (AOC-USAS2LP-H8iR (smc2108) with 36 disk, 32x2T and 4x8T) to freeze and most of the processes hang up with Disk sleep. The server was up, the network was working, but no login could be successful. A hard reset was executed with the IPMI KVM. The server started up, the MegaRAID controller booted with a warning that it was shutdown unexpectedly so there could be possible loss of data and to accept it by pressing any key or “C” to boot in the WebBIOS of the controller.

To summarize it up: the LSI controller hangs up when is in the following modes:

  1. Background Initialization
  2. Check Consistency

Aborting and disabling the modes above let out controller to work till replacement. If you experience any kind of strange disk hangs or freezes you can try our solution here! Check below to see how to do it yourself.

Keep on reading!

megacli – restart a rebuild with a disk in failed state

Sometimes we need to start a rebuild with a disk in failed state when using a LSI hardware controller, but if we just return the good state of the failed disk, it will return immediately in the array and our filesystem will be broken for sure! In addition it happens that when we replace a disk the new disk to be in failed state, too.

So here are simple and tested steps for proper resetting a failed state of a disk to a good state and starting a rebuild. In the example below the disk in failed state is [32:1], replace with the proper [enclosure_id:slot_id] in your case.

  1. Make “Failed State” in “Unconfigured(BAD)”
    megacli -pdmarkmissing -physdrv[32:1] -aAll
    
  2. Prepare for removal (this command could fail, not a critical one)
    megacli -pdprprmv -physdrv[32:1] -a0
    
  3. Make the state of the disk “Unconfigured(Good), Spun Up”
    megacli -PDMakeGood -PhysDrv[32:1] -a0
    
  4. Start rebuild (this command could fail) – if the command fails continue with the next step, if not, the rebuild is restarted successfully.
    megacli -PDRbld -Start -PhysDrv[32:1] -a0
    

    Or

    megacli -pdlocate -start -physdrv[32:1] -a0
    

    One of the two commands will probably start the rebuild, but if the two fail then continue to the next step.

  5. Start rebuild, first clean the foreign configuration and then make the device hot spare (only if 4 the above command failed)
    megacli -CfgForeign -Clear -aALL
    #set global hostspare
    megacli -PDHSP -Set -PhysDrv [32:1] -a0
    

* If you need to unset/remove a global hotspare:

megacli -PDHSP -Rmv -PhysDrv [32:1] -aN