Install CUDA and NVIDIA video driver under CentOS 7

Nvidia CUDA Toolkit supports CentOS 7 and it is relatively simple to install it. Nvidia provides three types of installation – a big setup file, a big rpm file and an official Nvidia repository, which we are going to use it in this article. The Nvidia repository contains the Nvidia video driver for the Nvidia video cards like GeForce, GTX, RTX and so on. You may need CUDA Toolkit if you are a game developer or you want to build yourself some of the mining software like XMR-STAK.
In this article, we are going to use the NVIDIA official repository for CUDA and the video driver module. There are other ways to install CUDA, which are not the purpose of this article. Using an official repository is the best practice for installing software on your system.

STEP 1) Update and install the NVIDIA official repository.

yum update -y
yum install -y yum-utils
yum-config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo

Keep on reading!

How to compile xmr-stak (2.10) under CentOS 7 for CPU mining cryptocurrencies in September 2019

A time to refresh our old article on how to compile xmr-stak for CPU mining with the new version and this time a new GNU GCC version (version 8.3, the last article we used 7.x – How to compile xmr-stak (2.4.5) under CentOS 7 for CPU mining cryptocurrencies). Always use the latest available GNU GCC packages because the latest version of GNU GCC could add some optimizations to the binary compiled code and you may have a CPU miner with better performance!
Thanks to xmr-stak we can have one application capable of mining many different cryptocurrencies based on different algorithms. XMR-STAK is GPU and CPU miner and here we present only the CPU ability under CentOS 7 using our AMD Threadripper 1950X.
The software in this article:

  • CentOS 7 – CentOS Linux release 7.6.1810 (Core)
  • GNU GCC – gcc version 8.3.1 20190311 (Red Hat 8.3.1-3) (GCC)
  • XMR-STAK – 2.10.7

As said many times working with crypto-currency it is mandatory to do the things yourself – do not trust any binary made by someone on the Internet. It is easy to build your miner yourself with the code from the official repository!

So here are the steps to build the XMR-STAK for CPU mining:

STEP 1) Update your system and install the following dependencies

Always start with update command and then install the dependencies in order first install all the new repositories and then the dependency binaries.

sudo yum update -y
sudo yum install -y centos-release-scl epel-release
sudo yum install -y cmake3 devtoolset-8-gcc* hwloc-devel libmicrohttpd-devel openssl openssl-devel make git screen wget

We are going to use GNU GCC 8 to build the XMR-STAK. More on the subject of how to install GNU GCC 8 and what is “devtoolset” here – How to install GNU GCC 8 on CentOS 7.
Keep on reading!

How to install GNU GCC 8 on CentOS 7

It has been long after releasing the GNU GCC 8.x, but at last, there is a trusted repository, which has offered us packages for GNU GCC 8.x, which won’t break your system! Many of us prefer CentOS 7 because it offers free enterprise-class operating system and as mentioned in our article before – How to install new gcc and development tools under CentOS 7 there are a couple of approved external repositories for CentOS, which you can trust https://wiki.centos.org/AdditionalResources/Repositories. In one of them Software Collection – https://www.softwarecollections.org/en/scls/ several months ago the GNU GCC 8.x packages were added!
At present, the GNU GCC version is gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3).
Here are the steps how you can install GNU GCC 8 and how you can use it:

STEP 1) Update your system and install the repository in your system

The commands:

yum update -y
yum -y install centos-release-scl

Keep on reading!

CentOS 7 dracut-initqueue timeout and could not boot – warning /dev/disk/by-id/md-uuid- does not exist

Let’s say you update your software raid layout – create, delete or modify your software raid and reboot the system and your server does not start normally. After loading your remote video console (KVM) you see the boot process reports for a missing device and you are under console (dracut console). Your system is in “Emergency mode”.

The warning:

dracut-initqueue[504]: Warning: dracut-initqueue timeout - starting timeout scripts
dracut-initqueue[504]: Warning: dracut-initqueue timeout - starting timeout scripts
dracut-initqueue[504]: Warning: dracut-initqueue timeout - starting timeout scripts
....
....
dracut-initqueue[504]: Warning: could not boot.
dracut-initqueue[504]: Warning: /dev/disk/by-id/md-uuid-2fdc509e:8dd05ed3:c2350cb4:ea5a620d does not exist
      Starting Dracut Emergency Shell...
Warning: /dev/disk/by-id/md-uuid-2fdc509e:8dd05ed3:c2350cb4:ea5a620d does not exist

Generating "/run/initramfs/rdsosreport.txt"


Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
after mounting them and attach it to a bug report.


dracut:/#

This article is for problems, which occur after manipulating a storage RAID devices, not the system (root) or boot devices!!! If the missing device the RAIDmd-uuid-2fdc509e:8dd05ed3:c2350cb4:ea5a620d does not exist” is either the root or the boot device, the propose solution here would not help with just exiting the Emergency Shell! In those cases, when the missing device is the root or boot before exiting the Emergency Shell the problem must be resolved, so the devices and their file system should be available. There is another article on the subject, which may help the reader in such cases – CentOS 8 dracut-initqueue timeout and could not boot – warning /dev/disk/by-id/md-uuid- does not exist – inactive raids.

SCREENSHOT 1) The boot process reports mutiple warning messages of dracut-initqueue timeout, because a drive cannot be found.

main menu
Warning: dracut-initqueue timeout – starting timeout scripts

Keep on reading!

Failed to start Security Audit Service, Authorization Manager and Login Service

A power outrage caused one of our servers to shut down unexpectedly and after it had been powered up the server did not show up. The server was unreachable and apparently, the network did not bring up the interfaces.
Loading the IPMI KVM Console and rebooting the server there were three errors on the screen during the boot up of the CentOS 7:

[FAILED] Failed to start Security Audit Service.
See 'systemctl status auditd.service' for details.
....
....
[FAILED] Failed to start Authorization Manager.
See 'systemctl status polkit.service' for details.
....
....
[FAILED] Failed to start Login Service.
See 'systemctl status systemd-logind.service' for details.

And after the above last line, the system stopped loading.
The disks are clean, but there was no login service, so you cannot log in to the server through the keyboard and the monitor! There was no network as mentioned above, which meant no logging at all in the server. You might not know, but if auditd service is enabled you probably use Selinux!

STEP 1) Failed to start the three important services – Security Audit Service, Authorization Manager and Login Service.

So we ended up with unability to log in our server.

main menu

Not sure what exactly caused this problem (seems strange a perfectly working Selinux enabled CentOS 7 server to have miss-labeled files in the root only because of an unexpected shutdown), but to be able to fix the issue and bring back your server to life

you need a rescue CD/USB/DVD/PXE Server to boot from and mount the disks and relabel your root file system.

STEP 1) Boot from a rescue CD/USB/DVD/PXE Server.

In our case, we used the IPMI KVM Console and mounted a Gentoo ISO disk and then booted from it to have a bash shell in our system. Our root resides on software RAID 1, so cat the /proc/mdstat and mount your root file system somewhere (/mnt/gentoo is there by default…)

STEP 2) Booted in our rescue Gentoo CD and mount your root file system.

main menu
Rescue Gentoo CD

STEP 2) create a file “.autorelabel” in the mounting point of your root file system.

So in our case, we mounted our CentOS 7 root file system in /mnt/gentoo and you must create a file with patch “/mnt/gentoo/.autorelabel”. umount and reboot. And a few minutes later your server will be back from the dead. A quick and handful advice – edit your /etc/fstab to mount only the root file system by commenting out all other big storage mounts – of course, if it is possible. We have big storage with millions of files in /mnt/storage-01 and we put the “#” to comment out the line with it – we do not want to wait for relabeling this file system, because the problem apparently is in our root file system! If it is possible (it is highly recommended) to relabel only the root file system in such situations to be able to regain shell control over your server fast.

Bonus – booted in rescue but no logs

OK, we booted to the rescue and tried to see what was the error (with journalctl in chrooted /mnt/gentoo), which did not allow auditd, polkit and systemd-logind to fail to start, but it appeared by default the systemd logs are not persistent on the disk in CentOS 7, so when you reboot in rescue you do not have systemd logs from the last boot! As a piece of additional advice here you may consider enabling persistent systemd logs!

Unpack centos 7 initramfs file with and without dracut skipcpio

In CentOS 7 the initramfs consists of two concatenated gzipped cpio files. If you want to check what files and probably configuration files are included you can unpack it, but you should use

the dracut tool skipcpio

/usr/lib/dracut/skipcpio <initramfs-file> | zcat | cpio -id --no-absolute-filenames

The following is the output of a CentOS 7

[root@srv ~]# mkdir initramfs-unpacked
[root@srv ~]# cd initramfs-unpacked/
[root@srv initramfs-unpacked]# /usr/lib/dracut/skipcpio /boot/initramfs-3.10.0-957.10.1.el7.x86_64.img | zcat | cpio -id --no-absolute-filenames
164026 blocks
[root@srv initramfs-unpacked]# ls -al
общо 52
drwxr-xr-x. 12 root root 4096  1 Apr 11,48 .
dr-xr-x---.  5 root root 4096  1 Apr 11,48 ..
lrwxrwxrwx.  1 root root    7  1 Apr 11,48 bin -> usr/bin
drwxr-xr-x.  2 root root 4096  1 Apr 11,48 dev
drwxr-xr-x.  9 root root 4096  1 Apr 11,48 etc
lrwxrwxrwx.  1 root root   23  1 Apr 11,48 init -> usr/lib/systemd/systemd
lrwxrwxrwx.  1 root root    7  1 Apr 11,48 lib -> usr/lib
lrwxrwxrwx.  1 root root    9  1 Apr 11,48 lib64 -> usr/lib64
drwxr-xr-x.  2 root root 4096  1 Apr 11,48 proc
drwxr-xr-x.  2 root root 4096  1 Apr 11,48 root
drwxr-xr-x.  2 root root 4096  1 Apr 11,48 run
lrwxrwxrwx.  1 root root    8  1 Apr 11,48 sbin -> usr/sbin
-rwxr-xr-x.  1 root root 3117  1 Apr 11,48 shutdown
drwxr-xr-x.  2 root root 4096  1 Apr 11,48 sys
drwxr-xr-x.  2 root root 4096  1 Apr 11,48 sysroot
drwxr-xr-x.  2 root root 4096  1 Apr 11,48 tmp
drwxr-xr-x.  7 root root 4096  1 Apr 11,48 usr
drwxr-xr-x.  3 root root 4096  1 Apr 11,48 var
[root@srv initramfs-unpacked]# ls -al /boot/
общо 114812
dr-xr-xr-x.  6 root root     4096 30 Mar  2,36 .
dr-xr-xr-x. 19 root root     4096 30 Mar  2,37 ..
-rw-r--r--.  1 root root   151923 18 Mar 15,10 config-3.10.0-957.10.1.el7.x86_64
drwxr-xr-x.  3 root root     4096 28 Jan 20,52 efi
drwxr-xr-x.  2 root root     4096 30 Mar  2,29 grub
drwx------.  5 root root     4096 29 Mar 13,50 grub2
-rw-------.  1 root root 44256471 28 Jan 20,57 initramfs-0-rescue-05cb8c7b39fe0f70e3ce97e5beab809d.img
-rw-------.  1 root root 44821343 29 Mar 13,50 initramfs-3.10.0-957.10.1.el7.x86_64.img
-rw-------.  1 root root 10982937 30 Mar  2,36 initramfs-3.10.0-957.10.1.el7.x86_64kdump.img
drwx------.  2 root root    16384 29 Mar 13,46 lost+found
-rw-r--r--.  1 root root   314087 18 Mar 15,10 symvers-3.10.0-957.10.1.el7.x86_64.gz
-rw-------.  1 root root  3544363 18 Mar 15,10 System.map-3.10.0-957.10.1.el7.x86_64
-rwxr-xr-x.  1 root root  6639808 28 Jan 20,57 vmlinuz-0-rescue-05cb8c7b39fe0f70e3ce97e5beab809d
-rwxr-xr-x.  1 root root  6643904 18 Mar 15,10 vmlinuz-3.10.0-957.10.1.el7.x86_64
-rw-r--r--.  1 root root      171 18 Mar 15,10 .vmlinuz-3.10.0-957.10.1.el7.x86_64.hmac

You can see the init is handled by systemd!

Not using dracut skipcpio

early_cpio – dracut set this file at the beginning of the CentOS 7 initramfs. It contains the CPU microcode.
You can check it with “file” command and if it shows: “ASCII cpio archive (SVR4 with no CRC)” there is a microcode prepended to the initramfs file.

And here without the dracut skipcpio tool with an example:

  1. cpio the original initramfs and write down the number of blocks reported
  2. use dd to skip the first blocks from the above step
  3. Uncompress (and unpack) the file created by dd – this is the real initramfs file.

Here is how you can do it:

[root@srv ~]# file /boot/initramfs-3.10.0-957.10.1.el7.x86_64.img
/boot/initramfs-3.10.0-957.10.1.el7.x86_64.img: ASCII cpio archive (SVR4 with no CRC)
[root@srv ~]# mkdir initramfs-unpacked-3
[root@srv ~]# cd initramfs-unpacked-3
[root@srv initramfs-unpacked-3]# cat /boot/initramfs-3.10.0-957.10.1.el7.x86_64.img | cpio -idmv
.
early_cpio
kernel
kernel/x86
kernel/x86/microcode
kernel/x86/microcode/AuthenticAMD.bin
kernel/x86/microcode/GenuineIntel.bin
3412 blocks
[root@srv initramfs-unpacked-3]# dd if=/boot/initramfs-3.10.0-957.10.1.el7.x86_64.img of=initramfs-tmp.img bs=512 skip=3412
84129+1 records in
84129+1 records out
43074399 bytes (43 MB) copied, 0.191311 s, 225 MB/s
[root@srv initramfs-unpacked-3]# ls
early_cpio  initramfs-tmp.img  kernel
[root@srv initramfs-unpacked-3]# file initramfs-tmp.img 
initramfs-tmp.img: gzip compressed data, from Unix, last modified: Fri Mar 29 13:49:41 2019, max compression
[root@srv initramfs-unpacked-3]# zcat ./initramfs-tmp.img | cpio -idm
164026 blocks
[root@srv initramfs-unpacked-3]# ls -al
total 42128
drwxr-xr-x. 13 root root     4096 Apr  1 12:38 .
dr-xr-x---. 10 root root     4096 Apr  1 12:38 ..
lrwxrwxrwx.  1 root root        7 Apr  1 12:38 bin -> usr/bin
drwxr-xr-x.  2 root root     4096 Apr  1 12:38 dev
-rw-r--r--.  1 root root        2 Mar 29 13:49 early_cpio
drwxr-xr-x.  9 root root     4096 Apr  1 12:38 etc
lrwxrwxrwx.  1 root root       23 Apr  1 12:38 init -> usr/lib/systemd/systemd
-rw-r--r--.  1 root root 43074399 Apr  1 12:35 initramfs-tmp.img
drwxr-xr-x.  3 root root     4096 Mar 29 13:49 kernel
lrwxrwxrwx.  1 root root        7 Apr  1 12:38 lib -> usr/lib
lrwxrwxrwx.  1 root root        9 Apr  1 12:38 lib64 -> usr/lib64
drwxr-xr-x.  2 root root     4096 Mar 29 13:49 proc
drwxr-xr-x.  2 root root     4096 Mar 29 13:49 root
drwxr-xr-x.  2 root root     4096 Mar 29 13:49 run
lrwxrwxrwx.  1 root root        8 Apr  1 12:38 sbin -> usr/sbin
-rwxr-xr-x.  1 root root     3117 Nov  2 17:40 shutdown
drwxr-xr-x.  2 root root     4096 Mar 29 13:49 sys
drwxr-xr-x.  2 root root     4096 Mar 29 13:49 sysroot
drwxr-xr-x.  2 root root     4096 Mar 29 13:49 tmp
drwxr-xr-x.  7 root root     4096 Apr  1 12:38 usr
drwxr-xr-x.  3 root root     4096 Apr  1 12:38 var

Centos 7 Server hangs up on boot after deleting a software raid (mdadm device)

We have a CentOS 7 server with a simple two hard drives setup in RAID1 of total 4 devices for boot, root, swap and storage. The storage device (/dev/md5) was removed and recreated with RAID0 for better performance, because the server was promoted as only cache server. Then the server was restarted and it never went up.
On IPMI KVM it just started loading the kernel and hanged up after several seconds without any additional information:

The kernel loads the mdadm devices and do not continue and the device md5 is missing.

main menu
CentOS 7 kernel loading the mdadm RAID devices

To boot successfully you must remove the missing device

On the Grub 2 menu press “e” and you’ll get this screen. Here you can edit all lines if you need. You must remove the last rd.md.uuid in our case or the one you deleted. Remove it and press Ctrl+x to load the kernel.

main menu
Grub 2 edit

There are two options you can do:

  • OPTION 1) Remove rd.md.uuid option of your old mdadm device
  • OPTION 2) Replace the ID in rd.md.uuid= with the new ID of the mdadm device.

Each of these two options could be used to solve the booting problem. Edit /etc/default/grub and replace or remove rd.md.uuid and generate the grub.conf.
You can find old mdadm ID in /etc/mdadm.conf (if you have not replace it there).

[root@srv ~]# cat /etc/mdadm.conf 
ARRAY /dev/md2 level=raid1 num-devices=2 metadata=0.90 UUID=9c08f218:cd5c0f8f:d96bc0d1:57b77e99
ARRAY /dev/md3 level=raid1 num-devices=2 metadata=1.2 name=2035110:swap UUID=1f74a2e0:757bfb9f:9c860e50:325f37cb
ARRAY /dev/md4 level=raid1 num-devices=2 metadata=1.2 name=2035110:root UUID=29bf4aa8:b7dae21a:45f4c188:baea4c13
ARRAY /dev/md5 level=raid1 num-devices=2 metadata=1.2 name=2035110:storage1 UUID=e6eb2590:b767be36:c76bb869:45ff0c3c
[root@srv ~]# mdadm --detail --scan
ARRAY /dev/md2 metadata=0.90 UUID=9c08f218:cd5c0f8f:d96bc0d1:57b77e99
ARRAY /dev/md3 metadata=1.2 name=2035110:swap UUID=1f74a2e0:757bfb9f:9c860e50:325f37cb
ARRAY /dev/md4 metadata=1.2 name=2035110:root UUID=29bf4aa8:b7dae21a:45f4c188:baea4c13
ARRAY /dev/md/5 metadata=1.2 name=s2035110:5 UUID=901074eb:16ba7c5b:0af69934:e9444102
[root@srv ~]# mdadm --detail --scan > /etc/mdadm.conf 

Here is our old /etc/default/grub:

[root@srv ~]# cat /etc/default/grub 
GRUB_TIMEOUT=1
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL="serial console"
GRUB_SERIAL_COMMAND="serial --speed=115200"
GRUB_CMDLINE_LINUX="rd.md.uuid=9c08f218:cd5c0f8f:d96bc0d1:57b77e99 rd.md.uuid=1f74a2e0:757bfb9f:9c860e50:325f37cb rd.md.uuid=29bf4aa8:b7dae21a:45f4c188:baea4c13 rd.md.uuid=e6eb2590:b767be36:c76bb869:45ff0c3c console=tty0 crashkernel=auto console=ttyS0,115200 net.ifnames=1"
GRUB_DISABLE_RECOVERY="true"

Here we edit our /boot/grub2/grub.cfg, replace the old uuid and generate grub.cfg (legacy BIOS):

[root@srv ~]# cat /etc/default/grub 
GRUB_TIMEOUT=1
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL="serial console"
GRUB_SERIAL_COMMAND="serial --speed=115200"
GRUB_CMDLINE_LINUX="rd.md.uuid=9c08f218:cd5c0f8f:d96bc0d1:57b77e99 rd.md.uuid=1f74a2e0:757bfb9f:9c860e50:325f37cb rd.md.uuid=29bf4aa8:b7dae21a:45f4c188:baea4c13 rd.md.uuid=901074eb:16ba7c5b:0af69934:e9444102 console=tty0 crashkernel=auto console=ttyS0,115200 net.ifnames=1"
[root@srv ~]# grub2-mkconfig -o /boot/grub2/grub.cfg 
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-957.5.1.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-957.5.1.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-05cb8c7b39fe0f70e3ce97e5beab809d
Found initrd image: /boot/initramfs-0-rescue-05cb8c7b39fe0f70e3ce97e5beab809d.img
done
[root@srv ~]# reboot

Use this for UEFI BIOS boot:
First check if /boot and /boot/efi are mounted and if not you must mount them with:

mount /boot
mount /boot/efi

Generate the grub.cfg

grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg

Bonus

In fact when the original device was removed and added a new one we formatted it as usual. But it was not possible to mount it, you just execute mount

/dev/md5 /mnt/stor1

no error, but no mount could be found, the device was not mounted and when you execute

umount /mnt/stor1

The OS told the “/mnt/stor1” was not mounted. Several more tries were made unsuccessfully to mount the “/dev/md5”, then the restart was performed and the server never went up.
Suppose the systemd just did not allow to mount the device because of the boot parameters rd.md.uuid!

Update firmware of AOC-USAS2LP-H8iR (smc2108) – LSI 2108 MegaRAID Hardware Controller

This card

AOC-USAS2LP-H8iR

is really old (probably 7-9 years), but still it works, so you can check if you are with the latest and greatest firmware. Hope the latest fixes more things than it beaks. To flash the firmware you need Megaraid cli and the firmware file, the two files you check in the sub-directories of https://www.supermicro.com/wftp/driver/SAS/LSI/2108/Firmware/ They are still there despite this product is discontinued. In this URL these are the latest, tested and verified versions by Supermicro so it is advisable to download them from this link or at least use the same versions if they are not available (in the future, now they are still available).
As you know LSI (they bought 3ware RAID in 2009) was bought by Avago (2013), then Avago bought Broadcom (2016 and renamed itself to Broadcom, 2018), so not so easy to find stuff for such old hardware (which still works). So this old MegaRAID controller is better managed by MegaCli despite you can do it with “storcli”, which is a modification of the tw_cli utility of 3ware RAID.

Keep on reading!

Reset lost admin password of Grafana server (CentOS 7)

It could happen to lose your Grafana admin password and luckily there is a pretty easy way for the system administrator of the Grafana server to reset the admin password.
There is really good and simple page of how you can reset the admin password with grafana-cli in the documentation of Grafana website. Probably you can check it, too.

The problem is that most distributions like the one we use (CentOS 7) rearrange the home directory of the Grafana software and if you try to use grafana-cli you’ll end up with the error mentioned in the section below the solution we offer you here! And you really could mess it up and hurry to say this software is buggy!
First, “grafana-cli admin reset-admin-password ” needs to know where is your configuration file and Grafana home directory, because from the configuration it will extract what kind of back-end server you use like sqlite3 or MySQL (or other) and the path or login data if needed to the back-end, then it will reset the password with the given one. At the bottom of this article, there is another method of resetting the Grafana admin password without grafana-cli – manual reset in the database.
So first you should check with what command-line arguments was your instance of Grafana server started:
Keep on reading!

CentOS 7 – load a new kernel without rebooting the server

Here are the commands needed to load a new kernel without rebooting your server or desktop computer. Why you need this? Sometime rebooting a server could take 5 to 10 minutes and loading a new kernel is just up to a minute. In fact in most cases loading the new kernel and starting the system then is just under 20-30 seconds, so upgrading your server even with new kernel is super easy lately. We tested it on CentOS 7 server and it was successful. The system uses systemd and the process is really easy and safe for the systems.

When the processes is initiated the system shutdowns normally (shutting down all running service with systemd) and then load the system immediately with the new kernel and starts the services as usual!

So no need to worry about unflushed data or not proper shutdown of a service! It’s like a normal reboot but without a hardware reboot and is a lot faster!
Here is what is required to load a kernel without hardware rebooting your computer box:

  1. kexec-tools
  2. Load the new kernel, initram file and the command line arguments with “kexec”
  3. Start a systemd target – kexec.target

CentOS 7 using kexec to load a new kernel

The real commands only for CentOS 7:

yum install -y kexec-tools
kexec -l /boot/vmlinuz-3.10.0-862.11.6.el7.x86_64 --initrd=/boot/initramfs-3.10.0-862.11.6.el7.x86_64.img --command-line="root=/dev/mapper/centos_srv-root ro crashkernel=auto rd.lvm.lv=centos_srv/root rd.lvm.lv=centos_srv/swap rhgb quiet LANG=en_US.UTF-8"
systemctl start kexec.target

Here is a real world example with all the output:
As you can see it is important to load the initram file and the exact arguments to the kernel. You should take them from the grub 2 configuration – /boot/grub2/grub.cfg
Here is the real output:

[root@srv ~]# yum -y update
Loaded plugins: fastestmirror
Determining fastest mirrors
 * base: mirrors.neterra.net
 * extras: mirrors.neterra.net
 * updates: mirrors.neterra.net
base                                                                                                                                                 | 3.6 kB  00:00:00     
centos-sclo-rh                                                                                                                                       | 3.0 kB  00:00:00     
centos-sclo-sclo                                                                                                                                     | 2.9 kB  00:00:00     
extras                                                                                                                                               | 3.4 kB  00:00:00     
updates                                                                                                                                              | 3.4 kB  00:00:00     
(1/4): extras/7/x86_64/primary_db                                                                                                                    | 187 kB  00:00:00     
(2/4): updates/7/x86_64/primary_db                                                                                                                   | 5.2 MB  00:00:01     
(3/4): centos-sclo-sclo/x86_64/primary_db                                                                                                            | 292 kB  00:00:01     
(4/4): centos-sclo-rh/x86_64/primary_db                                                                                                              | 3.7 MB  00:00:02     
Resolving Dependencies
--> Running transaction check
---> Package kernel.x86_64 0:3.10.0-862.11.6.el7 will be installed
---> Package kernel-headers.x86_64 0:3.10.0-862.3.2.el7 will be updated
---> Package kernel-headers.x86_64 0:3.10.0-862.11.6.el7 will be an update
---> Package kernel-tools.x86_64 0:3.10.0-862.3.2.el7 will be updated
---> Package kernel-tools.x86_64 0:3.10.0-862.11.6.el7 will be an update
---> Package kernel-tools-libs.x86_64 0:3.10.0-862.3.2.el7 will be updated
---> Package kernel-tools-libs.x86_64 0:3.10.0-862.11.6.el7 will be an update
....
....
[root@srv ~]# yum install -y kexec-tools
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.neterra.net
 * extras: mirrors.neterra.net
 * updates: mirrors.neterra.net
Package kexec-tools-2.0.15-13.el7.x86_64 already installed and latest version
Nothing to do

So we performed an update and there was a new kernel kernel.x86_64 0:3.10.0-862.11.6.el7, which we would like to load without hardware reboot.
Here is our new kernel in “/boot”

[root@srv ~]# ls -altr /boot/
total 171812
-rw-------.  1 root root  3228420 22 Aug  2017 System.map-3.10.0-693.el7.x86_64
-rw-r--r--.  1 root root   140894 22 Aug  2017 config-3.10.0-693.el7.x86_64
-rw-r--r--.  1 root root      166 22 Aug  2017 .vmlinuz-3.10.0-693.el7.x86_64.hmac
-rwxr-xr-x.  1 root root  5877760 22 Aug  2017 vmlinuz-3.10.0-693.el7.x86_64
-rw-r--r--.  1 root root   293027 22 Aug  2017 symvers-3.10.0-693.el7.x86_64.gz
drwxr-xr-x.  3 root root       17 20 Feb  2018 efi
drwxr-xr-x.  2 root root       27 20 Feb  2018 grub
-rw-r--r--.  1 root root   611343 20 Feb  2018 initrd-plymouth.img
-rw-------.  1 root root 51315067 20 Feb  2018 initramfs-0-rescue-cc7889764e86441b8d1eb54e29e81a91.img
-rwxr-xr-x.  1 root root  5877760 20 Feb  2018 vmlinuz-0-rescue-cc7889764e86441b8d1eb54e29e81a91
-rw-------.  1 root root  3409912 21 May 23,50 System.map-3.10.0-862.3.2.el7.x86_64
-rw-r--r--.  1 root root   147823 21 May 23,50 config-3.10.0-862.3.2.el7.x86_64
-rw-r--r--.  1 root root      170 21 May 23,50 .vmlinuz-3.10.0-862.3.2.el7.x86_64.hmac
-rwxr-xr-x.  1 root root  6228832 21 May 23,50 vmlinuz-3.10.0-862.3.2.el7.x86_64
-rw-r--r--.  1 root root   304943 21 May 23,52 symvers-3.10.0-862.3.2.el7.x86_64.gz
dr-xr-xr-x. 17 root root      224 14 Jun  7,18 ..
-rw-------.  1 root root 12997841 14 Jun  7,19 initramfs-3.10.0-693.el7.x86_64kdump.img
-rw-------.  1 root root 20771492 14 Jun  7,20 initramfs-3.10.0-693.el7.x86_64.img
-rw-------.  1 root root 13007444 14 Jun  7,23 initramfs-3.10.0-862.3.2.el7.x86_64kdump.img
-rw-r--r--.  1 root root   147859 14 Aug 22,02 config-3.10.0-862.11.6.el7.x86_64
-rw-------.  1 root root  3414344 14 Aug 22,02 System.map-3.10.0-862.11.6.el7.x86_64
-rw-r--r--.  1 root root      171 14 Aug 22,02 .vmlinuz-3.10.0-862.11.6.el7.x86_64.hmac
-rwxr-xr-x.  1 root root  6242208 14 Aug 22,02 vmlinuz-3.10.0-862.11.6.el7.x86_64
-rw-r--r--.  1 root root   305158 14 Aug 22,05 symvers-3.10.0-862.11.6.el7.x86_64.gz
-rw-------.  1 root root 20774393  5 Sep 15,20 initramfs-3.10.0-862.11.6.el7.x86_64.img
drwx------.  5 root root       97  5 Sep 15,20 grub2
dr-xr-xr-x.  5 root root     4096  5 Sep 15,21 .
-rw-------.  1 root root 20782404  5 Sep 15,21 initramfs-3.10.0-862.3.2.el7.x86_64.img

Now we know the kernel and initram file names we just check the kernel arguments in the kernel, load them with kexec and start an systemd target to load the new kernel:

[root@srv ~]# grep vmlinuz-3.10.0-862.11.6.el7.x86_64 /boot/grub2/grub.cfg 
        linux16 /vmlinuz-3.10.0-862.11.6.el7.x86_64 root=/dev/mapper/centos_srv-root ro crashkernel=auto rd.lvm.lv=centos_srv/root rd.lvm.lv=centos_srv/swap rhgb quiet LANG=en_US.UTF-8
[root@srv ~]# kexec -l /boot/vmlinuz-3.10.0-862.11.6.el7.x86_64 --initrd=/boot/initramfs-3.10.0-862.11.6.el7.x86_64.img --command-line="root=/dev/mapper/centos_srv-root ro crashkernel=auto rd.lvm.lv=centos_srv/root rd.lvm.lv=centos_srv/swap rhgb quiet LANG=en_US.UTF-8"
[root@srv ~]# systemctl start kexec.target
Connection to srv closed by remote host.
Connection to srv closed.

As you can see systemd performs a normal shutdown of all services and targets.

main menu
Normal Shutdown

The ssh connection is immediately closed because the reboot is initiated.
After 10-15 seconds our host is alive and the new kernel is loaded successfully:

root@test ~ $ ssh root@srv
root@srv's password: 
Last login: Wed Sep  5 15:16:08 2018 from test
[root@srv ~]# uname -a
Linux srv.local 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@srv ~]# 

Because we do not wanted to mess up the two output from different linux distros in one article we decided to split it in two separate ones, so here is the one for Ubuntu 16/18 LTS – “Ubuntu 16/18 LTS – load a new kernel without rebooting the server