PHP posix_kill missing in CentOS7

So you think you PHP code is running ok and your script for killing bad guys is perfect. Let’s assume you made a script to kill processes according to some criteria, put you script on your good all CentOS 7 server, but suddenly you encounter the error:

PHP Fatal error: Call to undefined function posix_kill() in ./kill-process.php on line 64
Killed process id: 21899
PHP Fatal error: Call to undefined function posix_kill() in ./kill-process.php on line 64
Killed process id: 21899
PHP Fatal error: Call to undefined function posix_kill() in ./kill-process.php on line 64

So you just missed a php plugin and there it is: “php-process”. Just install it with yum!

yum install -y php-process

And the error will disappear and the next time you want to kill some bad process you’ll sure do it!

bacula fatal error – Unable to connect to Storage daemon

Bacula is an open software enterprise backup system! Check out the official site here
Complex but useful software, which could automate the whole backup process of all your servers.
Some errors are easy to track some are not, so here is one error with a misleading error message if you do not know or forget the details of how the daemons works.

Here is the error extracted from the logs:

01-Sep 00:45 backup01-de-dir JobId 8789: No prior Full backup Job record found.
01-Sep 00:45 backup01-de-dir JobId 8789: No prior or suitable Full backup found in catalog. Doing FULL backup.
01-Sep 00:45 backup01-de-dir JobId 8789: Job srv123us.2017-09-01_00.45.28_34 waiting 103 seconds for scheduled start time.
01-Sep 00:47 backup01-de-dir JobId 8789: Start Backup JobId 8789, Job=srv123us.2017-09-01_00.45.28_34
01-Sep 00:47 backup01-de-dir JobId 8789: Using Device "web" to write.
01-Sep 00:51 srv123us-fd JobId 8789: Warning: bsock.c:112 Could not connect to Storage daemon on 1.1.1.1:9103. ERR=Connection timed out
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: bsock.c:118 Unable to connect to Storage daemon on 1.1.1.1:9103. ERR=Interrupted system call
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: job.c:1893 Failed to connect to Storage daemon: 1.1.1.1:9103
01-Sep 01:17 backup01-de-dir JobId 8789: Fatal error: Bad response to Storage command: wanted 2000 OK storage
01-Sep 01:17 backup01-de-dir JobId 8789: Error: Bacula backup01-de-dir 7.0.5 (28Jul14):
 Build OS:               x86_64-pc-linux-gnu ubuntu 16.04
  JobId:                  8789
  Job:                    srv123us.2017-09-01_00.45.28_34
  Backup Level:           Full (upgraded from Incremental)
  Client:                 "srv123us" 7.0.5 (28Jul14) x86_64-pc-linux-gnu,ubuntu,16.04
  FileSet:                "web" 2017-11-07 17:19:45
  Pool:                   "web-full" (From Job FullPool override)
  Catalog:                "ucdn" (From Client resource)
  Storage:                "web" (From Job resource)
  Scheduled time:         01-Sep-2018 00:47:11
  Start time:             01-Sep-2018 00:47:11
  End time:               01-Sep-2018 01:17:23
  Elapsed time:           30 mins 12 secs
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       0 (0 B)
  Rate:                   0.0 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               no
  Volume name(s):         
  Volume Session Id:      4719
  Volume Session Time:    1510075534
  Last Volume Bytes:      0 (0 B)
  Non-fatal FD errors:    2
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  Waiting on FD
  Termination:            *** Backup Error ***

But when we check the status of client from “bconsole” (Bacula’s management Console), everything seems OK, the backup server (Director daemon = bacula-dir) connects and get the report from the client daemon (Bacula File service = bacula-fd) in the server, even when you run a backup job, the status report is OK, the backup is running on the client, here is the output:

srv@local ~ # bconsole
Connecting to Director localhost:9101
1000 OK: 1 backup01-de-dir Version: 7.0.5 (28 July 2014)
Enter a period to cancel a command.
*status
Status available for:
     1: Director
     2: Storage
     3: Client
     4: Scheduled
     5: All
Select daemon type for status (1-5): 3
The defined Client resources are:
     1: srv1us
     2: srv2us
     3: srv123us
Select Client (File daemon) resource (1-3): 3
Connecting to Client srv123us at 108.61.250.36:9102
srv123us-fd Version: 7.0.5 (28 July 2014)  x86_64-pc-linux-gnu ubuntu 16.04
Daemon started 23-Feb-17 00:43. Jobs: run=1 running=0.
 Heap: heap=98,304 smbytes=571,344 max_bytes=571,361 bufs=97 max_bufs=97
 Sizes: boffset_t=8 size_t=8 debug=0 trace=0 mode=0,0 bwlimit=0kB/s
 Plugin: bpipe-fd.so 

Running Jobs:
JobId 8789 Job srv123us.2017-09-01_00.45.28_34 is running.
    Incremental Backup Job started: 01-Sep-17 00:45
    Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0 Errors=0
    Bwlimit=0
    Files: Examined=5 Backed up=0
    SDReadSeqNo=6 fd=5
Director connected at: 01-Sep-17 01:10
====

Terminated Jobs:
====

As you can see, everything seems OK of the status, there was a running job in the client server and it seemed the backup process had been running without errors for more then 20 minutes, but then suddenly got Fatal error (the first log):

01-Sep 00:51 srv123us-fd JobId 8789: Warning: bsock.c:112 Could not connect to Storage daemon on 1.1.1.1:9103. ERR=Connection timed out
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: bsock.c:118 Unable to connect to Storage daemon on 1.1.1.1:9103. ERR=Interrupted system call
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: job.c:1893 Failed to connect to Storage daemon: 1.1.1.1:9103
01-Sep 01:17 backup01-de-dir JobId 8789: Fatal error: Bad response to Storage command: wanted 2000 OK storage

And the problem is that, the Director (backup server) connects to the File Service of the client (the daemon on the client), but the opposite connection is not possible! When the backup is ready, the client daemon bacula file service connects to the bacula storage service (which could be on the same server with the director, but it could be on another server) to send the backup files and here is the problem! Client could not connect to the storage! So always check the two way connections: backup server -> client server-port:9102 and backup server-port:9103 (or storage server) <- client.
In the world of bacula:

bacula-dir -> bacula-fd:9102

bacula-sd:9103 -> bacula-fd

Misleading error on causal look it seems like bacula-sd is returning error to bacula-fd (which would mean that bacula-fd could connect to bacula-sd after all), but in reality bacula-dir received and logged that bacula-fd did not connect to bacula-sd resulting in Fatal error.

In our situation the firewall of the backup server was denying the connections from the client, but it could be a DNS resolve issue or another network problem. Most common problems are firewall or DNS resolve issues. The solution – just add accept rule for the IP of the client to connect to port 9103 of the backup (storage) server.

Busybox ash, Debian dash and simulating bash arrays

Busybox ash (Almquist shell) shell and Debian dash (Debian Almquist shell) are lightweight Unix shell and they are a variant of System V.4 variant of the Bourne shell. Ash/dash shell is known to be very small and is used mainly in embedded (ash) devices and installation scripts (Debian/Ubuntu setup).
Unfortunately they do not support arrays, which could be really a problem in many cases. But we can simulate the arrays with eval function.
So if you need to write a ash/dash script let’s say for an installation script of Ubuntu or Debian or a script for an embedded device, which uses busybox or even you do not want to use arrays in bash, you can follow the consepts below – create variable with a “name” concatenated with a number.

  • 1) Set a variable

    It can be done with two ways:

    1. for myi in 0 1 2 ; do
          setvar mvar$myi "Payload: $myi"
      done
      
    2. for myi in 0 1 2 ; do
          eval mvar$myi=\"Payload: $myi\"
      done
      

    This will create variables with names:

    mvar1, mvar2, mvar3

    and they can be used in any place of your script after the creation of the variables using “eval” or accessing them with the names.

    * bash shell do not support the command “setvar”, so for bash scripts use only eval version.

  • 2) Use a variable

    1. using “eval”
      for myi in 0 1 2 ; do
          eval echo \$mvar$myi
      done
      
      myi=1
      eval newvar="\$mvar$myi"
      echo $newvar
      
    2. direct access
      echo $mvar2
      $mvar2="Payload 20"
      echo $mvar2
      

tmpfs mount on /dev/shm in LXC container or chroot environment

When using LXC containers booting the lxc container would not populate it as the normal boot process. Or when you create a chroot jail /dev is not mounted or just some devices are created.
There is an option to populate (when using LXC containers) it with minimal required devices:

lxc.autodev = 1

which will create a tmpfs mount under /dev and create some basic devices, it will ensure /dev/shm to be mounted on with tmpfs!
If you omit this option, the /dev directory won’t be populated and will stay with the devices you made or copied when you made the LXC container (or the chroot jail) and /dev/shm will not be mounted using tmps, which could create numerous bad issues.
If you get errors like

 * configure has detected that the sem_open function is broken.
 * Please ensure that /dev/shm is mounted as a tmpfs with mode 1777.

You could mount the /dev/shm of the LXC container or the chroot jail (usually you can tune the size half of the server’s RAM) with

mkdir -p /dev/shm
mount -t tmpfs -o nodev,nosuid,noexec,mode=1777,size=6144m tmpfs /dev/shm

Or reboot your LXC container with a new configuration (probably in the “/var/lxc/[lxc_name]/config”) adding the following line:

lxc.mount.entry = none dev/shm tmpfs nodev,nosuid,noexec,mode=1777,create=dir 0 0

Thus you ensure the /dev/shm to be mounted on tmpfs and all semaphore functions to work properly.

* Real output of Gentoo failed compilation of python package:
 * configure has detected that the sem_open function is broken.
 * Please ensure that /dev/shm is mounted as a tmpfs with mode 1777.
 * ERROR: dev-lang/python-3.3.4-r1::gentoo failed (configure phase):
 *   Broken sem_open function (bug 496328)
 * 
 * Call stack:
 *     ebuild.sh, line 124:  Called src_configure
 *   environment, line 3542:  Called die
 * The specific snippet of code:
 *           die "Broken sem_open function (bug 496328)";
 * 
 * If you need support, post the output of `emerge --info '=dev-lang/python-3.3.4-r1::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=dev-lang/python-3.3.4-r1::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/dev-lang/python-3.3.4-r1/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/dev-lang/python-3.3.4-r1/temp/environment'.
 * Working directory: '/var/tmp/portage/dev-lang/python-3.3.4-r1/work/x86_64-pc-linux-gnu'
 * S: '/var/tmp/portage/dev-lang/python-3.3.4-r1/work/Python-3.3.4'

>>> Failed to emerge dev-lang/python-3.3.4-r1, Log file:

SUPERMICRO IPMI/KVM module tips – reset the unit and the admin password

After the previous howto “SUPERMICRO IPMI to use one of the one interfaces or dedicated LAN port” (in the howto is showed how to install the needed tool for managing the IPMI/KVM unit under console) of setting the network configuration there are a couple of interesting and important tips when working with the IPMI/KVM module. Here are they are:

  1. Reset IPMI/KVM module – sometimes it happen the keyboard or mouse not to work when the Console Redirection is loaded, it is easy to reset the unit from the web interface, but there are case when the web interface is not working – so ssh to your server and try one of the following commands:
    * warm reset – it’s like a reboot, inform the IPMI/KVM to reboot itself.

    ipmitool -I open bmc reset warm
    

    It does not work in all situations! So try a cold reset
    * cold reset – resets the IPMI/KVM, it’s like unplug and plug the power to the unit.

    ipmitool -I open bmc reset cold
    
  2. Reset the configuration of an IPMI/KVM module to factory defaults. It is useful when something goes wrong when upgrading the firmware of the unit and the old configuration is not supported or it says it is, but at the end the unit does not work properly. In rare cases it might help when the KVM (Keyboard, Video, Monitor part aka Console redirection does not work)
    Here is the command for resetting to factory defaults:

    ipmitool -I open raw 0x3c 0x40
    
  3. Reset admin password – reset the password for the administrator login of the IPMI/KVM unit. It’s trivial losing the password so with the help of the local console to the server you can reset the password to a simple one and then change it from the web interface.
    ipmitool -I open user set password 2 ADMIN
    

    The number “2” is the ID of the user, check it with:

    [root@srv0 ~]# ipmitool -I open user list
    ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
    1                    true    false      false      Unknown (0x00)
    2   ADMIN            true    false      false      Unknown (0x00)
    3                    true    false      false      Unknown (0x00)
    4                    true    false      false      Unknown (0x00)
    5                    true    false      false      Unknown (0x00)
    6                    true    false      false      Unknown (0x00)
    7                    true    false      false      Unknown (0x00)
    8                    true    false      false      Unknown (0x00)
    9                    true    false      false      Unknown (0x00)
    10                   true    false      false      Unknown (0x00)
    

    Sometimes if a hacker got to your IPMI/KVM you could see the user table with the above command. There was a serious bug aka backdoor in some of these units, the ID of the ADMIN user or even the username could be changed, so you should use the list command to list the current user table.
    Use set name to set the username of the user.

    ipmitool -I open user set name 2 ADMIN
    
  4. Set a new network configuration. It’s worth mentioning again the howto for this purpose – “SUPERMICRO IPMI to use one of the one interfaces or dedicated LAN port

All commands using the network option of the ipmitool

ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN bmc reset warm
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN bmc reset cold
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN raw 0x3c 0x40
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN user set password 2 ADMIN
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN user list

The IP 192.168.7.150 is the IP of your IPMI/KVM module, which you want to change with the above commands.

Tunneling the IPMI/KVM ports over ssh (supermicro ipmi ports)

The best security for the remote management unit in your server such as IPMI/KVM is to have local IP. All IPMI/KVM IP should be switched to a separated switch and a local sub-network used for the LAN Settings. So to be able to connect to the IPMI/KVM module you need a VPN connection to gain access to the local sub-network used for your servers’ management modules. However, sometimes the VPN cannot be used or it just happened the server is down, or you are at a place restricting unknown ports (or ports above 1024), which your VPN uses (that’s why the VPN server should use only one port from the most popular – 80, 443, but that’s a thing for another howto…) and so on. So you end with no ability to connect to the VPN server or you think you do not need at all a VPN server, because you always could use

openssh

to do the trick of tunneling ports from your computer to the IPMI/KVM module of your server through a server, which has an access to the local sub-network of the IPMI/KVM modules.

So here is what you need to get to the remote management of your server just using ssh for tunneling:

STEP 1) A server, which has access to the IP network of the IPMI/KVM modules.

Let’s say you set to all your servers’ IPMI/KVM modules IPs from network 192.168.7.0/24, so your server must have an IP from 192.168.7.0/24, for example 192.168.7.1, add it as an alias or to a dedicated LAN connected to the switch, in which of all your IPMI/KVM modules are plugged in. This server will be used as a transfer point to a selected IPMI/KVM IP.

STEP 2) Tunnel local selected ports using ssh to the server from STEP 1)

Use this command:

ssh -N -L 127.0.0.1:80:[IPMI-IP]:80 -L 127.0.0.1:443:[IPMI-IP]:443 -L 127.0.0.1:5900:[IPMI-IP]:5900 -L 127.0.0.1:623:[IPMI-IP]:623 root@[SERVER-IP]

For example using 192.168.7.150 for an IPMI/KVM IP:

[root@srv0 ~]# ssh -N -L 127.0.0.1:80:192.168.7.150:80 -L 127.0.0.1:443:192.168.7.150:443 -L 127.0.0.1:5900:192.168.7.150:5900 -L 127.0.0.1:623:192.168.7.150:623 root@example-server.com

With the above command you can use the web interface (https://127.0.0.1/, you could replace 127.0.0.1 with a local IP or a local IP alias of your machine), the java web start “Console Redirection” (the KVM – Keyboard, Video and Mouse) and you can mount Virtual Media from your computer to your server’s virtual CD/DVD device. Unfortunately to use properly the Virtual CD/DVD you must tunnel the UDP on port 623 (not only TCP 623), which is a little bit tricky. To tunnel the UDP packets

socat – Multipurpose relay (SOcket CAT)

program must be used.

STEP 3) Tunnel local selected ports using ssh to the server from STEP 1) and UDP port using socat

[root@srv0 ~]# socat -T15 udp4-recvfrom:623,reuseaddr,fork tcp:localhost:8000
[root@srv0 ~]# ssh -L8000:localhost:8000 -L 127.0.0.1:80:192.168.7.150:80 -L 127.0.0.1:443:192.168.7.150:443 -L 127.0.0.1:5900:192.168.7.150:5900 -L 127.0.0.1:623:192.168.7.150:623 root@example-server.com socat tcp4-listen:8000,reuseaddr,fork UDP:192.168.7.150:623

This will start a UDP listening socket on localhost port 8000. Every packet will be relayed using TCP to localhost 8000, which will be tunneled using ssh command to the remote server, where there is a started another socat listening TCP socket on port 8000, which will relay every packet to the UDP port 623 of IP 192.168.7.150. Replace the IP 192.168.7.150 with your IPMI/KVM IP.

* Here are the required ports for SUPERMICRO IPMI functionality in X9 and X10 motherboards

  • X9-motherboards, the ports are

    TCP Ports
    HTTP: 80
    HTTPS: 443
    SSH: 22
    WSMAN: 5985
    Video: 5901
    KVM: 5900
    CD/USB: 5120
    Floppy: 5123
    Virtual Media: 623
    SNMP: 161

    UDP ports:
    IPMI: 623

  • For X10-motherboards, the ports are

    TCP Ports
    HTTP: 80
    HTTPS: 443
    SSH: 22
    WSMAN: 5985
    Video: 5901
    KVM: 5900 , 3520
    CD/USB: 5120
    Floppy: 5123
    Virtual Media: 623
    SNMP: 161

    UDP ports:
    IPMI: 623

You could add the required port to the ssh command above if you need it!

Virtual Device mounted successfully

Successful mount in Console Redirection with Virtual Media:

main menu
Virtual Storage

if you are logged in the server and mount an ISO with the Virtual Device you’ll probably have this in “dmesg”:

[46683751.661063] usb 2-1.3.2: new high-speed USB device number 8 using ehci-pci
[46683751.795048] usb 2-1.3.2: New USB device found, idVendor=0ea0, idProduct=1111
[46683751.795051] usb 2-1.3.2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[46683751.795365] usb-storage 2-1.3.2:1.0: USB Mass Storage device detected
[46683751.795553] scsi6 : usb-storage 2-1.3.2:1.0
[46683752.795730] scsi 6:0:0:0: CD-ROM            ATEN     Virtual CDROM    YS0J PQ: 0 ANSI: 0 CCS
[46683752.806839] sr0: scsi3-mmc drive: 40x/40x cd/rw xa/form2 cdda tray
[46683752.806842] cdrom: Uniform CD-ROM driver Revision: 3.20
[46683752.806933] sr 6:0:0:0: Attached scsi CD-ROM sr0
[46683752.806971] sr 6:0:0:0: Attached scsi generic sg1 type 5

SUPERMICRO IPMI to use one of the LAN interfaces or dedicated LAN port

If you happen to have a Supermicro server and you want to change the default behavior of the IPMI LAN interface, which is

Failover – on boot check whether the dedicated LAN port is connected if so use the it, otherwise use the shared LAN1

So if change it there are some magic commands to change this default behavior:

  • Always use dedicated LAN:
    within the server under console:

    ipmitool -I open raw 0x30 0x70 0x0c 1 0
    

    from remote using the network:

    ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN raw 0x30 0x70 0x0c 1 0
    

    Sometimes the output of the last command (that using the lanplus) will output:

    Unable to send RAW command (channel=0x0 netfn=0x30 lun=0x0 cmd=0x70)
    

    But it sets the value despite the error output “Unable to send”. You could check it with the read command (the last example).

  • Always use shared LAN1:
    within the server under console:

    ipmitool -I open raw 0x30 0x70 0xc 1 1 
    

    from remote using the network:

    ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN raw 0x30 0x70 0x0c 1 1
    

    Sometimes the output of the last command (that using the lanplus) will output:

    Unable to send RAW command (channel=0x0 netfn=0x30 lun=0x0 cmd=0x70)
    

    But it sets the value despite the error output “Unable to send”. You could check it with the read command (the last example).

  • Always use failover (factory default):
    within the server under console:

    ipmitool -I open raw 0x30 0x70 0xc 1 2
    

    from remote using the network:

    ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN raw 0x30 0x70 0x0c 1 2
    
  • Sometimes the output of the last command (that using the lanplus) will output:

    Unable to send RAW command (channel=0x0 netfn=0x30 lun=0x0 cmd=0x70)
    

    But it sets the value despite the error output “Unable to send”. You could check it with the read command (the last example).

Get the current value with:

[root@srv0 ~]# ipmitool -I open raw 0x30 0x70 0x0c 0
 02
[root@srv0 ~]#

Default (failover): you will see 02
Onboard LAN: you will see 01
Dedicated LAN: you will see 00

The 192.168.7.157 is the IP of the IPMI KVM module and the -U ADMIN and -P ADMIN are username and the password login details to the module (ADMIN/ADMIN are just default settings for the Supermicro IPMI/KVM)

* Here you can set the LAN IP configuration – “Set IP to the IPMI/KVM server module with ipmitool

Set IP to the IPMI/KVM server module with ipmitool

IPMI/KVM module is a pretty useful add-on module to every server. In fact, every server should have IPMI module installed for fast management of the server in critical cases!
Here are the commands to set a static IP to the IPMI/KVM module with ipmitool using a console to the server:

ipmitool -I open lan set 1 ipsrc static
ipmitool -I open lan set 1 ipaddr [IPADDR]
ipmitool -I open lan set 1 netmask [NETMASK]
ipmitool -I open lan set 1 defgw ipaddr [GW IPADDR]
ipmitool -I open lan set 1 access on
  • [IPADDR] – the IP address of the IPMI/KVM
  • [NETMASK] – the netmask of the network
  • [GW IPADDR] – the gateway of the network

Here is a real world example of setting properly the LAN settings of the IPMI module.

[root@srv0 ~]# ipmitool -I open lan set 1 ipsrc static
[root@srv0 ~]# ipmitool -I open lan set 1 ipaddr 192.168.6.45
Setting LAN IP Address to 192.168.6.45
[root@srv0 ~]# ipmitool -I open lan set 1 netmask 255.255.255.0
Setting LAN Subnet Mask to 255.255.255.0
[root@srv0 ~]# ipmitool -I open lan set 1 defgw ipaddr 192.168.6.1
Setting LAN Default Gateway IP to 192.168.6.1
[root@srv0 ~]# ipmitool -I open lan set 1 access on
Set Channel Access for channel 1 was successful.
[root@srv0 ~]#

To see the current settings use:

[root@srv0 ~]# ipmitool -I open lan print
Set in Progress         : Set Complete
Auth Type Support       : NONE MD2 MD5 PASSWORD 
Auth Type Enable        : Callback : MD2 MD5 PASSWORD 
                        : User     : MD2 MD5 PASSWORD 
                        : Operator : MD2 MD5 PASSWORD 
                        : Admin    : MD2 MD5 PASSWORD 
                        : OEM      : MD2 MD5 PASSWORD 
IP Address Source       : Static Address
IP Address              : 192.168.6.45
Subnet Mask             : 255.255.255.0
MAC Address             : 00:25:90:18:8b:c9
SNMP Community String   : public
IP Header               : TTL=0x00 Flags=0x00 Precedence=0x00 TOS=0x00
BMC ARP Control         : ARP Responses Enabled, Gratuitous ARP Disabled
Default Gateway IP      : 192.168.6.1
Default Gateway MAC     : 00:00:00:00:00:00
Backup Gateway IP       : 0.0.0.0
Backup Gateway MAC      : 00:00:00:00:00:00
802.1q VLAN ID          : Disabled
802.1q VLAN Priority    : 0
RMCP+ Cipher Suites     : 1,2,3,6,7,8,11,12
Cipher Suite Priv Max   : aaaaXXaaaXXaaXX
                        :     X=Cipher Suite Unused
                        :     c=CALLBACK
                        :     u=USER
                        :     o=OPERATOR
                        :     a=ADMIN
                        :     O=OEM
Bad Password Threshold  : Not Available

*Dependencies

Installation of ipmitool:

  • CentOS 7
    yum -y install ipmitool
    
  • Ubuntu 16+
  • apt-get install ipmitool
    
  • Gentoo
    emerge -vu sys-apps/ipmitool
    

*Troubleshooting

If you receive errors when you execute ipmitool:

[root@srv0 ~]# ipmitool -I open lan set 1 ipaddr 192.168.6.45
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
[root@srv0 ~]# ipmitool -I open lan set 1 netmask 255.255.255.0
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
[root@srv0 ~]# ipmitool -I open lan set 1 defgw ipaddr 192.168.6.1
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory

The kernel module for the IPMI/KVM is not loaded by the system, so just execute:

[root@srv0 ~]# modprobe ipmi_si
[root@srv0 ~]# modprobe ipmi_devintf

And then you could use ipmitool commands above to set the network configuration of the IPMI/KVM add-on module.

megacli – restart a rebuild with a disk in failed state

Sometimes we need to start a rebuild with a disk in failed state when using a LSI hardware controller, but if we just return the good state of the failed disk, it will return immediately in the array and our filesystem will be broken for sure! In addition it happens that when we replace a disk the new disk to be in failed state, too.

So here are simple and tested steps for proper resetting a failed state of a disk to a good state and starting a rebuild. In the example below the disk in failed state is [32:1], replace with the proper [enclosure_id:slot_id] in your case.

  1. Make “Failed State” in “Unconfigured(BAD)”
    megacli -pdmarkmissing -physdrv[32:1] -aAll
    
  2. Prepare for removal (this command could fail, not a critical one)
    megacli -pdprprmv -physdrv[32:1] -a0
    
  3. Make the state of the disk “Unconfigured(Good), Spun Up”
    megacli -PDMakeGood -PhysDrv[32:1] -a0
    
  4. Start rebuild (this command could fail) – if the command fails continue with the next step, if not, the rebuild is restarted successfully.
    megacli -PDRbld -Start -PhysDrv[32:1] -a0
    

    Or

    megacli -pdlocate -start -physdrv[32:1] -a0
    

    One of the two commands will probably start the rebuild, but if the two fail then continue to the next step.

  5. Start rebuild, first clean the foreign configuration and then make the device hot spare (only if 4 the above command failed)
    megacli -CfgForeign -Clear -aALL
    #set global hostspare
    megacli -PDHSP -Set -PhysDrv [32:1] -a0
    

* If you need to unset/remove a global hotspare:

megacli -PDHSP -Rmv -PhysDrv [32:1] -aN

How to enable linux bonding without ifenslave

ifenslave is no more needed when configuring bonding under Linux. There are situations when we could have no network link without bonding, because of specific switch configuration and we do not have ifenslave package installed. We can configure bonding manually via Sysfs.
Here are the steps to configure bond0 in adaptive load balancing with two network cards in slave mode:

modprobe bonding
echo balance-alb > /sys/class/net/bond0/bonding/mode
echo +eth0 > /sys/class/net/bond0/bonding/slaves
echo +eth1 > /sys/class/net/bond0/bonding/slaves
ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up

The adaptive load balancing does not require any special network setup. On the contrary, the mode “802.3ad” could be used only if you enable bonding of the interfaces of your server to have a network link.

echo 802.3ad > /sys/class/net/bond0/bonding/mode

For more detailed explanation:

https://www.kernel.org/doc/Documentation/networking/bonding.txt

If the “/sys/class/net/bond0” is missing you should do:

echo +bond0 > /sys/class/net/bonding_masters