A strange behavior has been monitored in a storage server holding more than 20 millions of files, which is not a really big figure, because we have storage with more than 300 millions of files. The server has a RAID5 of 4 x 10T HGST Ultrastar He10 and storage partition formatted with ext4 filesystem with total of 27T size. Despite the inodes are only 7% and the free space is at 80%, the server began to experience high load averages with almost no IO utilization across the disks and the RAID device.
So the numbers are:
20% free space.
93% free inodes.
no IO disk utilization above 10-20% of any of the disks. In fact, the IO above 10% is very rear.
really high loads above the number of the server’s cores. The server has 8 virtual processors under Linux and the loads are constantly above 10-14.
no Disk Sleep blocked processes or Running to justify the high load.
the top command reports one or two kernel processes with 100% running – kworker/u32:1+flush-9:3 (so flushing the data takes so many time?)
If the problem were related to the disks or hardware, it would probably have been seen a high utilization across the disks, because the physical operations would need more time to execute. The high IO utilization across any disk, in deed, may result in a high system time (i.e. kernel time). Apparently, there is no high IO utilization across the disks and even no Disk Sleep blocked processes.
The FlameGraph tool may help to suggest the area where the problem related to.
First, install the git and perf tool (there are two interesting links about it with examples – link1 and link2):
[root@srv ~]# perf record -F 99 -a -g -- sleep 60
[ perf record: Woken up 24 times to write data ]
[ perf record: Captured and wrote 7.132 MB perf.data (35687 samples) ]
And generate a SVG image of all the function, which are executed during this time frame of 60 seconds. It will give a good picture of what has been executed and how much time it took in the kernel and user space.
The SVG are shared bellow. It turned out most of the time during this sample period of 60 seconds, the Linux kernel was in ext4_mb_good_group, which is related to the EXT4 file system.
What fixed the problem (for now) – By removing files it was noticed the system time went down and the load went down, too. Then, after removing approximately a million files, the server performance returned to normal as the average other storage server (this one has unique hardware setup). So by removing files, i.e. INODES from the ext4 file system, the load average and the server ext4 performance were completely different from what it had been before.
The interesting part the inodes never went above 7% (around 24 million of 454 million available) and the free space never went down 20%, but when a good deal of files were removed, the bad ext4 performance disappeared. It still could be related to the physical disks or some strange ext4 bug, but this article is intended to report this behavior and to propose a some kind of a solution (even though a workaround).
SCREENSHOT 1) The kernel process running at 100% – kworker/u32:2-flush-9:3.
SCREENSHOT 2) Almost half of the time was spent at ext4 related calls.
Multiple Gentoo packages may need kernel sources to compile. There are packages, which are external modules such as virtualbox-modules or video drivers or wifi drivers or more. All these packages expect the current loaded kernel sources are present and to use them when compiling the external kernel module. But sometimes the proper kernel sources are missing, those needed to compile the kernel module in such a way to load it in the currently loaded kernel.
This article is valid not only for Gentoo Linux distribution but any Linux and kernel sources. So, if the user needs to have properly configured kernel sources for the currently loaded kernel, this is one way to do it right.
Here is an example: Updated kernel, but no sources are kept and then the VirtualBox needs to update to a newer version, but with the missing kernel sources of the currently loaded kernel updating the VirtualBox will cause the VirtualBox to stop working!
root@srv ~ # uname -a
Linux srv 5.15.5-gentoo #2 SMP Tue Nov 30 16:08:49 EET 2021 x86_64 Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz GenuineIntel GNU/Linux
root@srv ~ # emerge -va app-emulation/virtualbox-modules
These are the packages that would be merged, in order:
[ebuild U ] app-emulation/virtualbox-modules-6.1.32:0/6.1::gentoo [6.1.26:0/6.1::gentoo] USE="dist-kernel -pax-kernel" 660 KiB
Total: 1 packages (1 upgrades), Size of downloads: 0 KiB
Would you like to merge these packages? [Yes/No] yes
>>> Verifying ebuild manifests
>>> Running pre-merge checks for app-emulation/virtualbox-6.1.32-r1
>>> Emerging (1 of 1) app-emulation/virtualbox-modules-6.1.32::gentoo
* Fetching files in the background.
* To view fetch progress, run in another terminal:
* tail -f /var/log/emerge-fetch.log
* vbox-kernel-module-src-6.1.32.tar.xz BLAKE2B SHA512 size ;-) ... [ ok ]
* Determining the location of the kernel source code
* Found kernel source directory:
* /usr/src/linux
* Found sources for kernel version:
* 5.14.2-gentoo-x86_64-genkernel-NEW2
* Checking for suitable kernel configuration options... [ ok ]
>>> Unpacking source...
>>> Unpacking vbox-kernel-module-src-6.1.32.tar.xz to /var/tmp/portage/app-emulation/virtualbox-modules-6.1.32/work
>>> Source unpacked in /var/tmp/portage/app-emulation/virtualbox-modules-6.1.32/work
>>> Preparing source in /var/tmp/portage/app-emulation/virtualbox-modules-6.1.32/work ...
>>> Source prepared.
>>> Configuring source in /var/tmp/portage/app-emulation/virtualbox-modules-6.1.32/work ...
>>> Source configured.
>>> Compiling source in /var/tmp/portage/app-emulation/virtualbox-modules-6.1.32/work ...
Here is the problem: the currently loaded kernel is version 5.15.5-gentoo and the emerge system finds only sources for 5.14.2-gentoo-x86_64-genkernel-NEW2, which will use to produce modules for the 5.14.2-gentoo-x86_64-genkernel-NEW2. It is obvious enough the modules compiled against kernel sources of 5.14.2-gentoo-x86_64-genkernel-NEW2 version won’t be possible to be loaded under the currently load kernel with version 5.15.5-gentoo.
Here is how to fix this:
Get the kernel sources for 5.15.5-gentoo in /usr/src/linux
Save the currently loaded kernel config in /usr/src/linux/.config
Load the configuration and prepare the kernel sources. No need to compile the kernel sources.
These commands will install the needed kernel version and a link to the kernel sources will be created.
Of course, change the kernel version to the proper version if needed.
STEP 2) Save the currently loaded kernel config in /usr/src/linux/.config
zcat /proc/config.gz > /usr/src/linux/.config
If the /proc/config.gz is missing, copy the configuration from the /boot for the currently loaded kernel:
STEP 3) Load the configuration and prepare the kernel sources.
No need to compile the whole kernel source tree. Just to commands to configure and prepare the kernel sources:
cd /usr/src/linux-5.15.5-gentoo
make oldconfig
make prepare
The commands take Here is the output of the last commands:
root@srv1 ~ # cd /usr/src/linux-5.15.5-gentoo
root@srv1 linux # make oldconfig
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
HOSTCC scripts/kconfig/confdata.o
HOSTCC scripts/kconfig/expr.o
HOSTCC scripts/kconfig/lexer.lex.o
HOSTCC scripts/kconfig/menu.o
HOSTCC scripts/kconfig/parser.tab.o
HOSTCC scripts/kconfig/preprocess.o
HOSTCC scripts/kconfig/symbol.o
HOSTCC scripts/kconfig/util.o
HOSTLD scripts/kconfig/conf
#
# configuration written to .config
#
root@srv1 linux # make prepare
SYNC include/config/auto.conf.cmd
HOSTCC arch/x86/tools/relocs_32.o
HOSTCC arch/x86/tools/relocs_64.o
HOSTCC arch/x86/tools/relocs_common.o
HOSTLD arch/x86/tools/relocs
HOSTCC scripts/selinux/genheaders/genheaders
HOSTCC scripts/selinux/mdp/mdp
HOSTCC scripts/bin2c
HOSTCC scripts/kallsyms
HOSTCC scripts/sorttable
HOSTCC scripts/asn1_compiler
HOSTCC scripts/extract-cert
UPD include/config/kernel.release
UPD include/generated/utsrelease.h
CC scripts/mod/empty.o
HOSTCC scripts/mod/mk_elfconfig
MKELF scripts/mod/elfconfig.h
HOSTCC scripts/mod/modpost.o
CC scripts/mod/devicetable-offsets.s
HOSTCC scripts/mod/file2alias.o
HOSTCC scripts/mod/sumversion.o
HOSTLD scripts/mod/modpost
CC kernel/bounds.s
UPD include/generated/bounds.h
CC arch/x86/kernel/asm-offsets.s
UPD include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
CALL scripts/atomic/check-atomics.sh
DESCEND objtool
HOSTCC /usr/src/linux-5.15.5-gentoo/tools/objtool/fixdep.o
HOSTLD /usr/src/linux-5.15.5-gentoo/tools/objtool/fixdep-in.o
LINK /usr/src/linux-5.15.5-gentoo/tools/objtool/fixdep
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/exec-cmd.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/help.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/pager.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/parse-options.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/run-command.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/sigchain.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/subcmd-config.o
LD /usr/src/linux-5.15.5-gentoo/tools/objtool/libsubcmd-in.o
AR /usr/src/linux-5.15.5-gentoo/tools/objtool/libsubcmd.a
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/arch/x86/special.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/arch/x86/decode.o
LD /usr/src/linux-5.15.5-gentoo/tools/objtool/arch/x86/objtool-in.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/weak.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/check.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/special.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/orc_gen.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/orc_dump.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/builtin-check.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/builtin-orc.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/elf.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/objtool.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/libstring.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/libctype.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/str_error_r.o
CC /usr/src/linux-5.15.5-gentoo/tools/objtool/librbtree.o
LD /usr/src/linux-5.15.5-gentoo/tools/objtool/objtool-in.o
LINK /usr/src/linux-5.15.5-gentoo/tools/objtool/objtool
And from now on whenever the kernel sources are needed to compile modules or libraries against, the proper kernel sources will be used of the currently loaded kernel.
The Gentoo emerge command from the begging of this article, but this time with the properly configured kernel sources. The VirtualBox modules are compiled against the loaded kernel, so loading them is not an issue anymore!
root@srv ~ # uname -a
Linux srv 5.15.5-gentoo #2 SMP Tue Nov 30 16:08:49 EET 2021 x86_64 Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz GenuineIntel GNU/Linux
root@srv ~ # emerge -va app-emulation/virtualbox-modules
These are the packages that would be merged, in order:
[ebuild U ] app-emulation/virtualbox-modules-6.1.32:0/6.1::gentoo [6.1.26:0/6.1::gentoo] USE="dist-kernel -pax-kernel" 660 KiB
Total: 1 packages (1 upgrades), Size of downloads: 0 KiB
Would you like to merge these packages? [Yes/No] yes
>>> Verifying ebuild manifests
>>> Running pre-merge checks for app-emulation/virtualbox-6.1.32-r1
>>> Emerging (1 of 1) app-emulation/virtualbox-modules-6.1.32::gentoo
* Fetching files in the background.
* To view fetch progress, run in another terminal:
* tail -f /var/log/emerge-fetch.log
* vbox-kernel-module-src-6.1.32.tar.xz BLAKE2B SHA512 size ;-) ... [ ok ]
* Determining the location of the kernel source code
* Found kernel source directory:
* /usr/src/linux
* Found sources for kernel version:
* 5.15.5-gentoo-gentoo
* Checking for suitable kernel configuration options... [ ok ]
>>> Unpacking source...
>>> Unpacking vbox-kernel-module-src-6.1.32.tar.xz to /var/tmp/portage/app-emulation/virtualbox-modules-6.1.32/work
>>> Source unpacked in /var/tmp/portage/app-emulation/virtualbox-modules-6.1.32/work
>>> Preparing source in /var/tmp/portage/app-emulation/virtualbox-modules-6.1.32/work ...
>>> Source prepared.
>>> Configuring source in /var/tmp/portage/app-emulation/virtualbox-modules-6.1.32/work ...
>>> Source configured.
>>> Compiling source in /var/tmp/portage/app-emulation/virtualbox-modules-6.1.32/work ...
....
....
root@srv ~ # modprobe vboxdrv
Feb 15 14:15:10 www kernel: vboxdrv: loading out-of-tree module taints kernel.
Feb 15 14:15:10 www kernel: vboxdrv: Found 4 processor cores
Feb 15 14:15:10 www kernel: vboxdrv: TSC mode is Invariant, tentative frequency 2394461773 Hz
Feb 15 14:15:10 www kernel: vboxdrv: Successfully loaded version 6.1.32 r149290 (interface 0x00320000)
Feb 15 14:15:10 www kernel: VBoxNetFlt: Successfully started.
You can see no kernel configuration entry “CONFIG_X86_X2APIC=y” is shown from the above command.
And if your BIOS enables the support of x2APIC you may end up with just one processor under Linux.
This was the case in our server. The x2APIC support is enabled in the BIOS and our kernel (version 4.18.12) does not have CONFIG_X86_X2APIC enabled.
To fix this issue you first might disable the feature in the BIOS and you are going to have all your processors shown and they could be used to compile fast your new kernel (of course, in the case you use custom kernel) after you enable the feature in the kernel CONFIG_X86_X2APIC, which is under Kernel Configuration —> [*] Support x2apic. The asterisk means it is enabled, so build your kernel. Check and enable this “Device Drivers –> IOMMU Hardware Support –> Support for Interrupt Remapping”, too.
Here you can see how to enable and disable processor x2APIC support in HP ProLiant DL160 Gen9 (2 processors Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz) – Enable or Disable the processor x2APIC support in HP ProLiant DL160 Gen9. Keep on reading!
There are multiple reasons to rebuild the official kernel of a Linux distro but this is not the purpose of this article
just cannot miss the chance to write that all the kernels are built therefore optimized for the very first 64bit Intel/AMD processor! But come on who wants the most important piece of software to be optimized not for his new and expensive processor but for one released 15 years ago???
. Here we are going to show you how to recompile the latest official Ubuntu kernel of Ubuntu 16.04 LTS – the one, which comes with the apt packages system, because this kernel comes with the latest and greatest patches of the Ubuntu team. You should not confuse this howto with the one, which compile a vanilla kernel or the mainline kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/. So if you want a new kernel or the latest from kernel.org this is the right tutorial for Ubuntu – Build your own kernel under Ubuntu using mainline (latest) kernel. The official latest kernel in the Ubuntu repository is not always the latest one from kernel.org, but you can be sure it is probably most secure one, because there are additional modifications, configurations and tests by the team. Here you can see what versions of the kernel are the officials in Ubuntu: https://wiki.ubuntu.com/Kernel/Support
Here are the commands needed to load a new kernel without rebooting your server or desktop computer. Why you need this? As said in our first article for CentOS 7 – sometime rebooting a server could take 5 to 10 minutes and loading a new kernel is just up to a minute. In fact in most cases loading the new kernel and starting the system then is just under 20-30 seconds, so upgrading your server even with new kernel is super easy lately. We tested it on Ubuntu 16 and Ubuntu 18 servers and it was successful. The system uses systemd and the process is really easy and safe for the systems.
When the processes is initiated the system shutdowns normally (shutting down all running service with systemd) and then load the system immediately with the new kernel and starts the services as usual!
So no need to worry about unflushed data or not proper shutdown of a service! It’s like a normal reboot but without a hardware reboot and is a lot faster!
Here is what is required to load a kernel without hardware rebooting your computer box:
kexec-tools
Load the new kernel, initram file and the command line arguments with “kexec”
Here is a real world example with all the output:
And again update your system to see if there is a new kernel and install “kexec-tools”. In our case indeed there is a new kernel – vmlinuz-4.15.0-33-generic
myuser@srv:~$ sudo apt -y upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following package was automatically installed and is no longer required:
libllvm5.0
Use 'sudo apt autoremove' to remove it.
The following NEW packages will be installed:
amd64-microcode autotools-dev dblatex debhelper dh-strip-nondeterminism docbook-dsssl docbook-utils docbook-xml docbook-xsl fonts-lato fonts-lmodern fonts-texgyre
intel-microcode iucode-tool jadetex javascript-common kernel-common kernel-package kernel-patch-scripts kernel-wedge kerneloops kerneloops-applet kernelshark kerneltop
libfile-homedir-perl libfile-stripnondeterminism-perl libfile-which-perl libjs-jquery libllvm6.0 libmail-sendmail-perl libosp5 libostyle1c2 libpotrace0 libptexenc1
libqpdf21 libruby2.3 libsgmls-perl libsp1c2 libsynctex1 libsys-hostname-long-perl libtexlua52 libtexluajit2 libwebpdemux1 libxml2-utils libzzip-0-13
linux-headers-4.15.0-33 linux-headers-4.15.0-33-generic linux-image-4.15.0-33-generic linux-modules-4.15.0-33-generic linux-modules-extra-4.15.0-33-generic lmodern lynx
lynx-common openjade po-debconf preview-latex-style prosper ps2eps python-apt rake ruby ruby-did-you-mean ruby-minitest ruby-net-telnet ruby-power-assert ruby-test-unit
ruby2.3 rubygems-integration sgml-data sgmlspl sp tex-common tex-gyre texlive texlive-base texlive-bibtex-extra texlive-binaries texlive-extra-utils texlive-font-utils
texlive-fonts-recommended texlive-fonts-recommended-doc texlive-generic-recommended texlive-latex-base texlive-latex-base-doc texlive-latex-extra
texlive-latex-extra-doc texlive-latex-recommended texlive-latex-recommended-doc texlive-luatex texlive-math-extra texlive-pictures texlive-pictures-doc texlive-pstricks
texlive-pstricks-doc tipa trace-cmd xmlto xsltproc
The following packages will be upgraded:
.....
.....
Running mktexlsr /var/lib/texmf ... done.
Building format(s) --all.
This may take some time... done.
Processing triggers for linux-image-4.15.0-33-generic (4.15.0-33.36~16.04.1) ...
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-4.15.0-33-generic
/etc/kernel/postinst.d/vboxadd:
VirtualBox Guest Additions: Building the VirtualBox Guest Additions kernel modules.
/etc/kernel/postinst.d/zz-update-grub:
Generating grub configuration file ...
Warning: Setting GRUB_TIMEOUT to a non-zero value when GRUB_HIDDEN_TIMEOUT is set is no longer supported.
Found linux image: /boot/vmlinuz-4.15.0-33-generic
Found initrd image: /boot/initrd.img-4.15.0-33-generic
Found linux image: /boot/vmlinuz-4.13.0-36-generic
Found initrd image: /boot/initrd.img-4.13.0-36-generic
Found memtest86+ image: /boot/memtest86+.elf
Found memtest86+ image: /boot/memtest86+.bin
done
myuser@srv:~$ sudo apt -y install kexec-tools
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer required:
libllvm5.0
Use 'sudo apt autoremove' to remove it.
The following NEW packages will be installed:
kexec-tools
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 77,4 kB of archives.
After this operation, 276 kB of additional disk space will be used.
Get:1 http://bg.archive.ubuntu.com/ubuntu xenial-updates/main amd64 kexec-tools amd64 1:2.0.16-1ubuntu1~16.04.1 [77,4 kB]
Fetched 77,4 kB in 0s (707 kB/s)
Preconfiguring packages ...
Selecting previously unselected package kexec-tools.
(Reading database ... 253895 files and directories currently installed.)
Preparing to unpack .../kexec-tools_1%3a2.0.16-1ubuntu1~16.04.1_amd64.deb ...
Unpacking kexec-tools (1:2.0.16-1ubuntu1~16.04.1) ...
Processing triggers for man-db (2.7.5-1) ...
Processing triggers for systemd (229-4ubuntu21.4) ...
Processing triggers for ureadahead (0.100.0-19) ...
Setting up kexec-tools (1:2.0.16-1ubuntu1~16.04.1) ...
Generating /etc/default/kexec...
Processing triggers for systemd (229-4ubuntu21.4) ...
Processing triggers for ureadahead (0.100.0-19) ...
┌──────────────────────────────────────────────────────────────────┤ Configuring kexec-tools ├───────────────────────────────────────────────────────────────────┐
│ │
│ If you choose this option, a system reboot will trigger a restart into a kernel loaded by kexec instead of going through the full system boot loader process. │
│ │
│ Should kexec-tools handle reboots (sysvinit only)? │
│ │
│ <Yes> <No> │
│ │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
On the above configuration question mark “” and press Enter.
So we performed an update and there was a new kernel vmlinuz-4.15.0-33-generic, which we would like to load without hardware reboot.
Here is our new kernel in “/boot”
Now we know the kernel and initram file names we just check the kernel arguments in the kernel, load them with kexec and start an systemd target to load the new kernel:
myuser@srv:~$ grep vmlinuz-4.15.0-33-generic /boot/grub/grub.cfg
linux /boot/vmlinuz-4.15.0-33-generic root=UUID=061b2936-34bf-4da3-b7d2-b8bde0899f03 ro quiet splash $vt_handoff
linux /boot/vmlinuz-4.15.0-33-generic root=UUID=061b2936-34bf-4da3-b7d2-b8bde0899f03 ro quiet splash $vt_handoff
linux /boot/vmlinuz-4.15.0-33-generic root=UUID=061b2936-34bf-4da3-b7d2-b8bde0899f03 ro quiet splash $vt_handoff init=/sbin/upstart
linux /boot/vmlinuz-4.15.0-33-generic root=UUID=061b2936-34bf-4da3-b7d2-b8bde0899f03 ro recovery nomodeset
myuser@srv:~$ sudo kexec -l /boot/vmlinuz-4.15.0-33-generic --initrd=/boot/initrd.img-4.15.0-33-generic --command-line="root=UUID=061b2936-34bf-4da3-b7d2-b8bde0899f03 ro quiet splash"
myuser@srv:~$ sudo systemctl start kexec.target
Connection to srv closed by remote host.
Connection to srv closed.
Use the first line of the grep output above (or you can cat the file and see what is in it if you have any doubts) to take the proper kernel boot arguments and do not include anything starting with “$”.
As you can see systemd performs a normal shutdown of all services and targets.
The ssh connection is immediately closed because the reboot is initiated.
After 10-15 seconds our host is alive and the new kernel is loaded successfully:
root@test ~ $ ssh root@srv
root@srv's password:
Last login: Wed Sep 5 17:15:08 2018 from test
[root@srv ~]# uname -a
Linux srv.local 4.15.0-33-generic #36~16.04.1-Ubuntu SMP Wed Aug 15 17:21:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@srv ~]#
Here are the commands needed to load a new kernel without rebooting your server or desktop computer. Why you need this? Sometime rebooting a server could take 5 to 10 minutes and loading a new kernel is just up to a minute. In fact in most cases loading the new kernel and starting the system then is just under 20-30 seconds, so upgrading your server even with new kernel is super easy lately. We tested it on CentOS 7 server and it was successful. The system uses systemd and the process is really easy and safe for the systems.
When the processes is initiated the system shutdowns normally (shutting down all running service with systemd) and then load the system immediately with the new kernel and starts the services as usual!
So no need to worry about unflushed data or not proper shutdown of a service! It’s like a normal reboot but without a hardware reboot and is a lot faster!
Here is what is required to load a kernel without hardware rebooting your computer box:
kexec-tools
Load the new kernel, initram file and the command line arguments with “kexec”
Here is a real world example with all the output:
As you can see it is important to load the initram file and the exact arguments to the kernel. You should take them from the grub 2 configuration – /boot/grub2/grub.cfg
Here is the real output:
[root@srv ~]# yum -y update
Loaded plugins: fastestmirror
Determining fastest mirrors
* base: mirrors.neterra.net
* extras: mirrors.neterra.net
* updates: mirrors.neterra.net
base | 3.6 kB 00:00:00
centos-sclo-rh | 3.0 kB 00:00:00
centos-sclo-sclo | 2.9 kB 00:00:00
extras | 3.4 kB 00:00:00
updates | 3.4 kB 00:00:00
(1/4): extras/7/x86_64/primary_db | 187 kB 00:00:00
(2/4): updates/7/x86_64/primary_db | 5.2 MB 00:00:01
(3/4): centos-sclo-sclo/x86_64/primary_db | 292 kB 00:00:01
(4/4): centos-sclo-rh/x86_64/primary_db | 3.7 MB 00:00:02
Resolving Dependencies
--> Running transaction check
---> Package kernel.x86_64 0:3.10.0-862.11.6.el7 will be installed
---> Package kernel-headers.x86_64 0:3.10.0-862.3.2.el7 will be updated
---> Package kernel-headers.x86_64 0:3.10.0-862.11.6.el7 will be an update
---> Package kernel-tools.x86_64 0:3.10.0-862.3.2.el7 will be updated
---> Package kernel-tools.x86_64 0:3.10.0-862.11.6.el7 will be an update
---> Package kernel-tools-libs.x86_64 0:3.10.0-862.3.2.el7 will be updated
---> Package kernel-tools-libs.x86_64 0:3.10.0-862.11.6.el7 will be an update
....
....
[root@srv ~]# yum install -y kexec-tools
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirrors.neterra.net
* extras: mirrors.neterra.net
* updates: mirrors.neterra.net
Package kexec-tools-2.0.15-13.el7.x86_64 already installed and latest version
Nothing to do
So we performed an update and there was a new kernel kernel.x86_64 0:3.10.0-862.11.6.el7, which we would like to load without hardware reboot.
Here is our new kernel in “/boot”
[root@srv ~]# ls -altr /boot/
total 171812
-rw-------. 1 root root 3228420 22 Aug 2017 System.map-3.10.0-693.el7.x86_64
-rw-r--r--. 1 root root 140894 22 Aug 2017 config-3.10.0-693.el7.x86_64
-rw-r--r--. 1 root root 166 22 Aug 2017 .vmlinuz-3.10.0-693.el7.x86_64.hmac
-rwxr-xr-x. 1 root root 5877760 22 Aug 2017 vmlinuz-3.10.0-693.el7.x86_64
-rw-r--r--. 1 root root 293027 22 Aug 2017 symvers-3.10.0-693.el7.x86_64.gz
drwxr-xr-x. 3 root root 17 20 Feb 2018 efi
drwxr-xr-x. 2 root root 27 20 Feb 2018 grub
-rw-r--r--. 1 root root 611343 20 Feb 2018 initrd-plymouth.img
-rw-------. 1 root root 51315067 20 Feb 2018 initramfs-0-rescue-cc7889764e86441b8d1eb54e29e81a91.img
-rwxr-xr-x. 1 root root 5877760 20 Feb 2018 vmlinuz-0-rescue-cc7889764e86441b8d1eb54e29e81a91
-rw-------. 1 root root 3409912 21 May 23,50 System.map-3.10.0-862.3.2.el7.x86_64
-rw-r--r--. 1 root root 147823 21 May 23,50 config-3.10.0-862.3.2.el7.x86_64
-rw-r--r--. 1 root root 170 21 May 23,50 .vmlinuz-3.10.0-862.3.2.el7.x86_64.hmac
-rwxr-xr-x. 1 root root 6228832 21 May 23,50 vmlinuz-3.10.0-862.3.2.el7.x86_64
-rw-r--r--. 1 root root 304943 21 May 23,52 symvers-3.10.0-862.3.2.el7.x86_64.gz
dr-xr-xr-x. 17 root root 224 14 Jun 7,18 ..
-rw-------. 1 root root 12997841 14 Jun 7,19 initramfs-3.10.0-693.el7.x86_64kdump.img
-rw-------. 1 root root 20771492 14 Jun 7,20 initramfs-3.10.0-693.el7.x86_64.img
-rw-------. 1 root root 13007444 14 Jun 7,23 initramfs-3.10.0-862.3.2.el7.x86_64kdump.img
-rw-r--r--. 1 root root 147859 14 Aug 22,02 config-3.10.0-862.11.6.el7.x86_64
-rw-------. 1 root root 3414344 14 Aug 22,02 System.map-3.10.0-862.11.6.el7.x86_64
-rw-r--r--. 1 root root 171 14 Aug 22,02 .vmlinuz-3.10.0-862.11.6.el7.x86_64.hmac
-rwxr-xr-x. 1 root root 6242208 14 Aug 22,02 vmlinuz-3.10.0-862.11.6.el7.x86_64
-rw-r--r--. 1 root root 305158 14 Aug 22,05 symvers-3.10.0-862.11.6.el7.x86_64.gz
-rw-------. 1 root root 20774393 5 Sep 15,20 initramfs-3.10.0-862.11.6.el7.x86_64.img
drwx------. 5 root root 97 5 Sep 15,20 grub2
dr-xr-xr-x. 5 root root 4096 5 Sep 15,21 .
-rw-------. 1 root root 20782404 5 Sep 15,21 initramfs-3.10.0-862.3.2.el7.x86_64.img
Now we know the kernel and initram file names we just check the kernel arguments in the kernel, load them with kexec and start an systemd target to load the new kernel:
As you can see systemd performs a normal shutdown of all services and targets.
The ssh connection is immediately closed because the reboot is initiated.
After 10-15 seconds our host is alive and the new kernel is loaded successfully:
root@test ~ $ ssh root@srv
root@srv's password:
Last login: Wed Sep 5 15:16:08 2018 from test
[root@srv ~]# uname -a
Linux srv.local 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@srv ~]#
and you wonder why your system did not have network and even no interface is shown with the commands
ip addr and ifconfig -a
, it is probably because you used unsupported SFP+ module.
In the “dmesg” you’ll see two line regarding the problem with the interface:
[ 3142.439304 ] ixgbe 0000:05:00.0: failed to load because an unsupported SFP+ or QSFP module type was detected.
[ 3142.439306 ] ixgbe 0000:05:00.0: Reload the driver after installing a supported module.
Only tested modules are supported by default and it’s probably normal there are a great deal of different SFP+ devices and the creators of the kernel driver would not be able to test all the different SFP+ modules in the world, so all tested SFP+ modules work by default and all other not tested need to enable an option in the kernel to instruct the driver to use them!
Enable the unsupported SF+ module of Intel X520. The kernel module name is “ixgbe”:
Built-in kernel module, when the ixgbe is build in the kernel, you must pass a configuration parameters to the kernel boot line
ixgbe.allow_unsupported_sfp=1
Under CentOS 7 and Ubuntu 16/17/18 you can do the following:
Edit
/etc/default/grub
and add at the end of the GRUB_CMDLINE_LINUX the above line. So it should look like:
DO NOT forget to execute a grub2 make configuration command, because the grub2 configuration file must be regenerated to include the option you’ve set.
Most linux distros use ixgbe as a kernel module, so they load the module with the initrd image after booting the kernel. We must instruct how the kernel module will be loaded from the initrd image, so the initrd image must be changed. Here is how to do it properly for CentOS 7 and Ubuntu/Debian:
For the both distros execute the command:
It will create a configuration file for the ixgbe kernel module with the configuration you see between the double quotes. And then reinstall the kernel!!! Because the initrd image will be regenerated. If you do not want to reinstall the current kernel, you must manually regenerate the initrd image.
Regenerate all the initrd images for the all installed kernels.
Under CentOS 7:
Here you can see the steps to install the latest (mainline stable) kernel under CentOS 7, whether we need the latest driver, because we bought a new laptop released for the first time last month with the latest hardware or there is a hot fix of some nasty bug it is of no matter.
Here are the steps about installing the latest kernel to our CentOS 7, you must have root access:
STEP 1 Import the public key of the repository, which offer us the packages of the CentOS 7 mainline stable kernel (and some other kernels, like the Red Hat Enterprise Linux (RHEL) 6 and 7 and more, you can check out the site). The site of the repository is
https://elrepo.org/
Import the public key
#change to root user (skip the first line if you are root)
sudo su
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
STEP 2 Check the latest version of the rpm install package of the repository, the command is under
To install ELRepo for RHEL-7, SL-7 or CentOS-7:
So at present the latest version is “7.0-3” and execute to wget to download the package for the repository elrepo:
And as you see, no it’ll not boot to the new kernel, so you must configure grub2 to boot your newly just installed kernel. First check all the installed kernels, set the right kernel and then it is mandatory to call
grub2-mkconfig
to update the grub2 configuration:
[root@srv ~]# awk -F\' /^menuentry/{print\$2} /boot/grub2/grub.cfg
CentOS Linux (4.15.4-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-693.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-0a26fe4b81d845209fb8958c8e29d600) 7 (Core)
The position 0 (YES, it starts from ZERO!) is “CentOS Linux (4.15.4-1.el7.elrepo.x86_64) 7 (Core)”, so you have two options to set it:
Sometimes we need to install the latest kernel version of our linux distro (btw it is called “mainline” if you need to google something about it)! Whether we need the latest driver, because we bought a new laptop released for the first time last month with the latest hardware or there is a hot fix of some nasty bug it is of no matter. Here are the steps (and some troubleshooting) about installing the latest kernel to our Ubuntu install 17.10 (with old versions like 16, 15 and 14 should be the same):
STEP 1 is to choose our desired kernel from
http://kernel.ubuntu.com/~kernel-ppa/mainline/
Open this address in your favorite browser and choose the latest. In our example the latest kernel, which is no release candidate, 4.14.15. Enter the directory
If you have Ryzen Threadripper (in my case 1950X) and Ubuntu 17.10 (because the others versions do not boot at all!) with the latest kernel 4.15.x and the default vmlinuz-4.13.0-21-generic you might experience frequently wifi, network cards and video cards problems like wifi disconnects and network cards lost connections and some kind of freezes for a second or two of the video! If you have Ryzen Threadripper 1950X and or ASUS ROG ZENITH EXTREME (or any other motherboard with the chipset AMD X399) and ubuntu 17.10 or probably any linux distro with any kernel you might experience really unstable system. Look at your dmesg and if you see something like
Apparently everything on the pci express could be affected! So till the kernel team fixes the issues, you can use the following workaround: add to the kernel boot parameters
pcie_aspm=off
Under Ubuntu you can add in the file
/etc/default/grub
the following line
GRUB_CMDLINE_LINUX="pcie_aspm=off"
And then execute
update-grub
the changes to take effect in the grub configuration. Then reboot the machine!
Under CentOS 7 the configuration file and the line to put are the same just the update of the grub configuration is different:
For UEFI-based systems execute:
grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
For legacy-BIOS systems execute:
grub2-mkconfig -o /boot/grub2/grub.cfg
*You should know what is your setup, probably UEFI-based, you could check in your boot directory, if there is /boot/efi/EFI/centos/grub.cfg you probably should use the option for UEFI-based, if you do not have any “efi” directory under “/boot” use the second option for legacy-BIOS.
Then reboot the machine!
Manage Cookie Consent
We use technologies like cookies to store and/or access device information. We do this to improve browsing experience and to show (non-) personalized ads. Consenting to these technologies will allow us to process data such as browsing behavior or unique IDs on this site. Not consenting or withdrawing consent, may adversely affect certain features and functions.
Functional
Always active
The technical storage or access is strictly necessary for the legitimate purpose of enabling the use of a specific service explicitly requested by the subscriber or user, or for the sole purpose of carrying out the transmission of a communication over an electronic communications network.
Preferences
The technical storage or access is necessary for the legitimate purpose of storing preferences that are not requested by the subscriber or user.
Statistics
The technical storage or access that is used exclusively for statistical purposes.The technical storage or access that is used exclusively for anonymous statistical purposes. Without a subpoena, voluntary compliance on the part of your Internet Service Provider, or additional records from a third party, information stored or retrieved for this purpose alone cannot usually be used to identify you.
Marketing
The technical storage or access is required to create user profiles to send advertising, or to track the user on a website or across several websites for similar marketing purposes.