Moving existing Elasticsearch and Kibana installation from CentOS 7 to CentOS Stream 9


Even though only two additional pieces of software are installed under CentOS 7, it is not a good idea to try an in-place upgrade from CentOS 7 to CentOS Stream 9. There is no clear and supported upgrade path from CentOS 7 to CentOS Stream 9, or even to the older CentOS 8 (or CentOS Stream 8). The best way is to make a clean install of CentOS Stream 9 and copy all the Elasticsearch and Kibana files, and this article shows how to do it without problems.
Here is the plan to move the existing installation of Elasticsearch and Kibana services from CentOS 7 to CentOS Stream 9:

  1. Make a clean install of CentOS Stream 9
  2. Update the current Elasticsearch and Kibana installations to their latest versions (within their branch, i.e. minor versions).
  3. Add Elasticsearch and Kibana repositories to the new system. Tune the system crypto policies.
  4. Install Elasticsearch and Kibana software packages, but do not start the services.
  5. Copy the important Elasticsearch and Kibana files such as the index directory and the configuration directories. Check the user and group IDs of the files (see the sketch right after this list).
  6. Start the Elasticsearch and Kibana services.
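A minimal sketch of step 5 under the assumptions of this article – the old and the new LXC containers are named loganalyzer-old and loganalyzer and the default RPM paths are used (adjust the paths for physical machines):

rsync -aHAX /var/lib/lxc/loganalyzer-old/rootfs/etc/elasticsearch/ /var/lib/lxc/loganalyzer/rootfs/etc/elasticsearch/
rsync -aHAX /var/lib/lxc/loganalyzer-old/rootfs/etc/kibana/ /var/lib/lxc/loganalyzer/rootfs/etc/kibana/
rsync -aHAX /var/lib/lxc/loganalyzer-old/rootfs/var/lib/elasticsearch/ /var/lib/lxc/loganalyzer/rootfs/var/lib/elasticsearch/
grep -E "(elasticsearch|kibana)" /var/lib/lxc/loganalyzer-old/rootfs/etc/passwd /var/lib/lxc/loganalyzer/rootfs/etc/passwd

The rsync -a option preserves ownership and permissions and -HAX keeps hard links, ACLs and extended attributes; the last command compares the user and group IDs in the two containers, so chown can be used on the copied files if they differ.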

In this example, the installation of the new server is just starting a new LXC container, which will host the Elasticsearch and Kibana services. There is no difference between using a container or a physical machine. With an LXC container, it is easier to copy the needed files such as the Elasticsearch index files, which may be tens of terabytes or more, and the various configuration files.

STEP 1) Make a clean install of CentOS Stream 9

Check out the following article for this purpose – Network installation of CentOS Stream 9 (20220606.0) – minimal server installation or, if an LXC container is preferred – Run LXC CentOS Stream 9 container with bridged network under CentOS Stream 9.

Creating an LXC container with CentOS Stream 9 is really simple and fast:

[root@srv ~]# lxc-create --template download -n kibana.u1x2.com -- --dist centos --release 9-Stream --arch amd64
The cached copy has expired, re-downloading...
Downloading the image index
Downloading the rootfs
Downloading the metadata
The image cache is now ready
Unpacking the rootfs

---
You just created a Centos 9-Stream x86_64 (20230511_19:27) container.

Then tune the network following the above article. When configuring the network, it is a good idea to preserve the original UUID and addresses (the MAC address, too) of the LXC container's network and of the inner container's interface.
So copy the UUID from /var/lib/lxc/loganalyzer-old/rootfs/etc/sysconfig/network-scripts/ifcfg-eth0 to the CentOS Stream 9 network configuration – /var/lib/lxc/loganalyzer/rootfs/etc/NetworkManager/system-connections/ethernet-eth0.nmconnection, which uses NetworkManager. And copy the LXC container's MAC address – the variable lxc.net.0.hwaddr – from /var/lib/lxc/loganalyzer-old/config to /var/lib/lxc/loganalyzer/config.
The last step is to start the newly installed system. No errors in the output signal a successful start-up of the LXC container with the name loganalyzer.

[root@srv ~]# lxc-start -n loganalyzer
[root@srv ~]# 

STEP 2) Upgrade the current Elasticsearch and Kibana installations to their latest versions (within their branch, i.e. minor versions).

For example, if the current Elasticsearch is version 7, it is good to upgrade it to the latest 7.x version before proceeding with the next steps.
The currently installed versions of the Elasticsearch and Kibana software are 7.17.4-1 (the 7.17 branch) and the latest version is 7.17.10-1.
Check in the old system (CentOS 7) with:

[root@loganalyzer-old ~]# yum list installed|egrep -e "(elasticsearch|kibana)"
elasticsearch.x86_64               7.17.4-1                               @elasticsearch
kibana.x86_64                      7.17.4-1                               @elasticsearch
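A minimal upgrade sketch for the old CentOS 7 system, assuming the official Elasticsearch 7.x repository is already configured there (the @elasticsearch repository tag above suggests it is):

[root@loganalyzer-old ~]# yum clean expire-cache
[root@loganalyzer-old ~]# yum update elasticsearch kibana
[root@loganalyzer-old ~]# systemctl restart elasticsearch kibana

After the restart, both packages should report version 7.17.10-1 in yum list installed before continuing with the next steps.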

Keep on reading!

Change the LXC container root folder under CentOS with SELinux

The default LXC container folder in CentOS (all versions – 7, 8, Stream 8 and Stream 9) is /var/lib/lxc, which may reside in the root partition. When changing lxc.rootfs (or the main directory /var/lib/lxc) to another place, the containers may still work without any additional SELinux permissions, but some tools like lxc-attach will definitely stop working with permission errors – lxc_attach_run_shell: 1333 Permission denied – failed to exec shell. This article shows how to use lxc-create and SELinux commands to properly change the LXC container's rootfs.
For detailed information on how to create an LXC container check out – Run LXC CentOS Stream 9 container with bridged network under CentOS Stream 9 or Run LXC Ubuntu 22.04 LTS container with bridged network under CentOS Stream 9.

Create an LXC container with a non-default path

  • Change the rootfs only. To change only the LXC container root filesystem location, use the “--dir=” lxc-create option:
    lxc-create --template download -n mycontainer2 --dir=/mnt/storage/servers/mycontainer2 -- --dist centos --release 9-Stream --arch amd64
    

    It will place the files under /mnt/storage/servers/mycontainer2, but the configuration will still be located in /var/lib/lxc/mycontainer2/.

    [root@srv ~]# ls -altr /var/lib/lxc/mycontainer2/
    total 16
    drwxr-xr-x. 3 root root 4096 Oct 14 13:42 ..
    drwxr-xr-x. 2 root root 4096 Oct 14 13:42 rootfs
    -rw-r-----. 1 root root  775 Oct 14 13:42 config
    drwxrwx---. 3 root root 4096 Oct 14 13:42 .
    [root@srv ~]# ls -altr /var/lib/lxc/mycontainer2/rootfs/
    total 8
    drwxr-xr-x. 2 root root 4096 Oct 14 13:42 .
    drwxrwx---. 3 root root 4096 Oct 14 13:42 ..
    [root@srv ~]# ls -altr /mnt/storage/servers/mycontainer2/
    total 76
    drwxrwxrwt.  2 root root 4096 Aug  9  2021 tmp
    drwxr-xr-x.  2 root root 4096 Aug  9  2021 srv
    lrwxrwxrwx.  1 root root    8 Aug  9  2021 sbin -> usr/sbin
    drwxr-xr-x.  2 root root 4096 Aug  9  2021 opt
    drwxr-xr-x.  2 root root 4096 Aug  9  2021 mnt
    drwxr-xr-x.  2 root root 4096 Aug  9  2021 media
    lrwxrwxrwx.  1 root root    9 Aug  9  2021 lib64 -> usr/lib64
    lrwxrwxrwx.  1 root root    7 Aug  9  2021 lib -> usr/lib
    drwxr-xr-x.  2 root root 4096 Aug  9  2021 home
    dr-xr-xr-x.  2 root root 4096 Aug  9  2021 boot
    lrwxrwxrwx.  1 root root    7 Aug  9  2021 bin -> usr/bin
    dr-xr-xr-x.  2 root root 4096 Aug  9  2021 afs
    dr-xr-xr-x.  2 root root 4096 Oct 14 07:11 sys
    dr-xr-xr-x.  2 root root 4096 Oct 14 07:11 proc
    drwxr-xr-x. 12 root root 4096 Oct 14 07:11 usr
    drwxr-xr-x.  8 root root 4096 Oct 14 07:11 run
    drwxr-xr-x. 18 root root 4096 Oct 14 07:11 var
    dr-xr-x---.  2 root root 4096 Oct 14 07:12 root
    drwxr-xr-x.  2 root root 4096 Oct 14 07:12 selinux
    drwxr-xr-x. 19 root root 4096 Oct 14 07:15 .
    drwxr-xr-x.  4 root root 4096 Oct 14 13:41 ..
    drwxr-xr-x.  3 root root 4096 Oct 14 13:42 dev
    drwxr-xr-x. 63 root root 4096 Oct 14 13:42 etc
    
  • Change the LXC container path – to move the folder containing both the configuration and the container’s root filesystem, use the “-P” option:
    lxc-create -P /mnt/storage/servers/ --template download -n mycontainer -- --dist centos --release 9-Stream --arch amd64
    

    All the LXC container configuration and the root filesystem will be placed under /mnt/storage/servers/[container_name], which in the example above is /mnt/storage/servers/mycontainer.

    [root@srv ~]# ls -al /mnt/storage/servers/mycontainer
    total 16
    drwxrwx---.  3 root root 4096 Oct 14 13:38 .
    drwxr-xr-x.  4 root root 4096 Oct 14 13:41 ..
    -rw-r-----.  1 root root  780 Oct 14 13:38 config
    drwxr-xr-x. 19 root root 4096 Oct 14 07:15 rootfs
    

It is better to use “-P” and change the whole LXC container location rather than only the filesystem path. In this case, a good practice is to make a symbolic link in /var/lib/lxc/[container-name] pointing to the new location:

ln -s /mnt/storage/servers/mycontainer /var/lib/lxc/mycontainer

So all LXC tools will continue to work without explicitly adding an option for the new path of this container.

Change the SELinux file context to be container_var_lib_t of the LXC root filesystem

Add the file context container_var_lib_t to the container’s root filesystem path and change the SELinux labels.
First, verify all the needed tools are installed:

dnf install -y policycoreutils-python-utils container-selinux

Then, add a new file context for the path /mnt/storage/servers/mycontainer and run restorecon to change the SELinux labels to container_var_lib_t:

semanage fcontext -a -t container_var_lib_t '/mnt/storage/servers/mycontainer(/.*)?'
restorecon -Rv /mnt/storage/servers/mycontainer

The file context may be shown with:

[root@srv ~]# ls -alZ /mnt/storage/servers/mycontainer
total 16
drwxrwx---.  3 root root unconfined_u:object_r:container_var_lib_t:s0 4096 Oct 14 13:38 .
drwxr-xr-x.  4 root root unconfined_u:object_r:mnt_t:s0               4096 Oct 14 13:41 ..
-rw-r-----.  1 root root unconfined_u:object_r:container_var_lib_t:s0  780 Oct 14 13:38 config
drwxr-xr-x. 19 root root unconfined_u:object_r:container_var_lib_t:s0 4096 Oct 14 07:15 rootfs

Failing to set the proper SELinux labels may result in errors such as lxc_attach_run_shell: 1333 Permission denied – failed to exec shell.
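If such permission errors still show up after the relabeling, a quick way to confirm SELinux is the cause (a sketch, assuming auditd is running on the host) is to look for recent AVC denials and to re-check the labels:

ausearch -m avc -ts recent
ls -alZ /mnt/storage/servers/mycontainer/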

Copying partition table from one disk to another with older sfdisk under CentOS 7

Older versions of sfdisk may still be used for msdos partition tables.
To copy the partition table from one disk to another using sfdisk, a temporary file should be used to store the partition table data.

Here is an example of how to copy the msdos partition table from disk sda to disk sdb with two simple commands:

  1. Dump the source (sda) partition table to a temporary file.
  2. Redirect the standard input of the sfdisk utility with the above temporary file.

Copying a partition table is really useful when recovering from a drive failure in a Linux software RAID. Sometimes it is difficult to recreate the exact layout of the source mirror disk manually, so copying it is just easier!

mdadm --add /dev/md1 /dev/sdb2
mdadm: /dev/sdb2 not large enough to join array

Errors such as the above are easily resolved with just two commands. Newer versions of the disk tools align the partitions differently, which may leave a partition too small to join the software RAID.

STEP 1) Dump the source partition table.

The source partition table is from /dev/sda:

[root@srv ~]# sfdisk -d /dev/sda > part_table_sda
sfdisk: Warning: extended partition does not start at a cylinder boundary.
DOS and Linux will interpret the contents differently.

There is a warning about a partition that is not aligned, which may cause problems when creating the layout from scratch, so copying the partition table is the best option in such cases.

Here is what the temporary file part_table_sda with the partition table information contains:

[root@srv ~]# cat part_table_sda 
# partition table of /dev/sda
unit: sectors

/dev/sda1 : start=     2048, size= 67045376, Id=fd, bootable
/dev/sda2 : start= 67110912, size=  1048576, Id=fd
/dev/sda3 : start= 68159488, size= 62883840, Id=fd
/dev/sda4 : start=131043328, size=845729792, Id= f
/dev/sda5 : start=131076096, size=845434880, Id=fd

Keep on reading!

Caching NFS files with cachefilesd

A great tool for caching a network filesystem like NFS mounts is cachefilesd! It is easy to use and a good deal of statistics can be retrieved from it. More on how it works here: https://www.kernel.org/doc/Documentation/filesystems/caching/fscache.txt

Here are the quick steps to cache NFS mounts (it works with NFS-Ganesha servers, too):

  1. Install the daemon tool cachefilesd
  2. Check the configuration file /etc/cachefilesd.conf. In most cases, there is no need to edit the file! Just check whether the disk limits are good.
  3. Start the cachefilesd daemon.
  4. Mount the network directories with the “fsc” option. Umount and mount them all if they have already been mounted. The fsc option is mandatory to enable file caching of a network mount.
  5. Check the stats to see if the file caching is working properly.

The example below is under CentOS 8, but it is almost the same in most Linux distributions.

STEP 1) Install the daemon tool cachefilesd

This is straightforward, just install it with the package manager:

[root@srv ~]# dnf install cachefilesd
Last metadata expiration check: 2:33:44 ago on Tue 08 Dec 2020 07:18:01 AM UTC.
Dependencies resolved.
=============================================================================================================================================================================================
 Package                                        Architecture                              Version                                            Repository                                 Size
=============================================================================================================================================================================================
Installing:
 cachefilesd                                    x86_64                                    0.10.10-4.el8                                      BaseOS                                     43 k

Transaction Summary
=============================================================================================================================================================================================
Install  1 Package

Total download size: 43 k
Installed size: 71 k
Is this ok [y/N]: y
Downloading Packages:
cachefilesd-0.10.10-4.el8.x86_64.rpm                                                                                                                         3.1 MB/s |  43 kB     00:00    
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                                                        2.8 MB/s |  43 kB     00:00     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                                                     1/1 
  Installing       : cachefilesd-0.10.10-4.el8.x86_64                                                                                                                                    1/1 
  Running scriptlet: cachefilesd-0.10.10-4.el8.x86_64                                                                                                                                    1/1 
  Verifying        : cachefilesd-0.10.10-4.el8.x86_64                                                                                                                                    1/1 

Installed:
  cachefilesd-0.10.10-4.el8.x86_64                                                                                                                                                           

Complete!

STEP 2) Check the configuration file and tune for your system.

In most cases, the defaults in /etc/cachefilesd.conf are good to start with:

dir /var/cache/fscache
tag mycache
brun 10%
bcull 7%
bstop 3%
frun 10%
fcull 7%
fstop 3%

# Assuming you're using SELinux with the default security policy included in
# this package
secctx system_u:system_r:cachefiles_kernel_t:s0

The directory where the cache will reside and the lines with percentages are disk space limits. “brun 10%” means the cache can run freely as long as the free disk space stays above 10%. “bcull 7%” starts culling the cache when the free space drops below 7%, and “bstop 3%” stops allocating new cache space below 3% – more in the man page (or https://linux.die.net/man/5/cachefilesd.conf).
So if the server normally runs with less than 10% free disk space, the configuration file should be edited.
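For example, a hypothetical server that normally keeps only around 5% free disk space would need lower thresholds, something like:

brun  4%
bcull 3%
bstop 1%
frun  4%
fcull 3%
fstop 1%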

STEP 3) Start the cachefilesd daemon.

And enable it on boot so it starts automatically.

[root@srv ~]# systemctl start cachefilesd
[root@srv ~]# systemctl enable cachefilesd
Created symlink /etc/systemd/system/multi-user.target.wants/cachefilesd.service → /usr/lib/systemd/system/cachefilesd.service.
[root@srv ~]# systemctl status cachefilesd
● cachefilesd.service - Local network file caching management daemon
   Loaded: loaded (/usr/lib/systemd/system/cachefilesd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-12-08 10:01:24 UTC; 11s ago
 Main PID: 29786 (cachefilesd)
    Tasks: 1 (limit: 408616)
   Memory: 2.5M
   CGroup: /system.slice/cachefilesd.service
           └─29786 /usr/sbin/cachefilesd -n -f /etc/cachefilesd.conf

Dec 08 10:01:24 srv systemd[1]: Starting Local network file caching management daemon...
Dec 08 10:01:24 srv systemd[1]: Started Local network file caching management daemon.
Dec 08 10:01:24 srv cachefilesd[29786]: About to bind cache
Dec 08 10:01:24 srv cachefilesd[29786]: Bound cache
Dec 08 10:01:24 srv cachefilesd[29786]: Daemon Started

The status command shows the daemon cachefilesd is running. But does it cache?

STEP 4) Mount the network filesystems with option fsc

To make cachefilesd cache a network mount, the option fsc must be included in the mount options. Remounting may not work correctly, so to be sure, a full umount/mount should be executed. Here is an example /etc/fstab line:

192.168.0.1:/mnt/storage /mnt/storage  nfs defaults,hard,intr,noexec,nosuid,_netdev,fsc,vers=4 0 0

And then mount with a simple command:

mount /mnt/storage

Check whether the mounts use the FS cache. FSC must be “yes”.

[root@srv ~]# cat /proc/fs/nfsfs/volumes
NV SERVER   PORT DEV          FSID                              FSC
v4 c0a80001  801 0:41         d4098a2af096148:ec7560388cbe5b83  yes

There is a proc file for cache statistics:

[root@srv ~]# cat /proc/fs/fscache/stats
FS-Cache statistics
Cookies: idx=49 dat=4385599 spc=0
Objects: alc=43666 nal=0 avl=43666 ded=36002
ChkAux : non=0 ok=12289 upd=0 obs=761
Pages  : mrk=24915179 unc=24492585
Acquire: n=4385648 nul=0 noc=0 ok=4385648 nbf=0 oom=0
Lookups: n=43666 neg=31372 pos=12294 crt=31372 tmo=0
Invals : n=1 run=1
Updates: n=0 nul=0 run=1
Relinqs: n=4377930 nul=0 wcr=0 rtr=0
AttrChg: n=0 ok=0 nbf=0 oom=0 run=0
Allocs : n=0 ok=0 wt=0 nbf=0 int=0
Allocs : ops=0 owt=0 abt=0
Retrvls: n=751549 ok=716860 wt=21436 nod=34689 nbf=0 int=0 oom=0
Retrvls: ops=751549 owt=9158 abt=0
Stores : n=550412 ok=550412 agn=0 nbf=0 oom=0
Stores : ops=33238 run=583650 pgs=550412 rxd=550412 olm=0
VmScan : nos=23963352 gon=0 bsy=0 can=0 wt=0
Ops    : pend=9160 run=784788 enq=26874960 can=0 rej=0
Ops    : ini=1301962 dfr=265 rel=1301962 gc=265
CacheOp: alo=0 luo=0 luc=0 gro=0
CacheOp: inv=0 upo=0 dro=0 pto=0 atc=0 syn=0
CacheOp: rap=0 ras=0 alp=0 als=0 wrp=0 ucp=0 dsp=0
CacheEv: nsp=761 stl=0 rtr=0 cul=0

And here is the cache directory filled with files. If there are no files, the FS cache is not used – probably the mount was not mounted with fsc! Umount and mount again.

[root@srv ~]# find /var/cache/fscache|head -n 20
/var/cache/fscache
/var/cache/fscache/graveyard
/var/cache/fscache/cache
/var/cache/fscache/cache/@4a
/var/cache/fscache/cache/@4a/I03nfs
/var/cache/fscache/cache/@4a/I03nfs/@84
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b/EA0g00sg0200n8000000ixBMHyy9gdcUm_O8ewl7XMW1n8cXgyl60
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b/EA0g00sg0200n8000000ixBMHyy9gdcUm_O8ewl7XZG2n84qfxl60
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b/EA0g00sg0200n8000000ixBMHyy9gdcUm_O8ewl7XQM2n8IQ4El60
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b/EA0g00sg0200n8000000ixBMHyy9gdcUm_O8ewl7XZO2n8Ms7kl60
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b/EA0g00sg0200n8000000ixBMHyy9gdcUm_O8ewl7XvQ2n88dKgl60
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b/EA0g00sg0200n8000000ixBMHyy9gdcUm_O8ewl7XV40o8cREC6b0
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b/EA0g00sg0200n8000000ixBMHyy9gdcUm_O8ewl7X3Y1n8MHO_l60
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b/EA0g00sg0200n8000000ixBMHyy9gdcUm_O8ewl7Xhkwn80f23zb0
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b/EA0g00sg0200n8000000ixBMHyy9gdcUm_O8ewl7XEQ2n8IuAll60
/var/cache/fscache/cache/@4a/I03nfs/@84/Jc0010800840000cG0400/@68/J1100000000000Q0goaWH946iIn7oUMELrd80000000040000g00Kb000wFe000jt000oG3000000040000g000000000/@8b/EA0g00sg0200n8000000ixBMHyy9gdcUm_O8ewl7XoO2n8A7Bll60
[root@srv ~]# du -d 1 -h /var/cache/fscache
4.0K    /var/cache/fscache/graveyard
3.8G    /var/cache/fscache/cache
3.8G    /var/cache/fscache

There are 3.8G in the cache.

libelf was not found in the pkg-config search path

When building from source under CentOS, the user may stumble upon compilation errors, most of them caused by missing -devel packages. Here is such an example, where the name of the missing library is not so easy to find:

[/tmp/netdata-libbpf-El77Ld/libbpf-0.0.9_netdata-1/src]# env CFLAGS=-fPIC CXXFLAGS= LDFLAGS= BUILD_STATIC_ONLY=y OBJDIR=build DESTDIR=.. make install 
Package libelf was not found in the pkg-config search path.
Perhaps you should add the directory containing `libelf.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libelf' found
mkdir -p build/staticobjs
cc -I. -I../include -I../include/uapi -DCOMPAT_NEED_REALLOCARRAY -fPIC -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64   -c bpf.c -o build/staticobjs/bpf.o
cc -I. -I../include -I../include/uapi -DCOMPAT_NEED_REALLOCARRAY -fPIC -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64   -c btf.c -o build/staticobjs/btf.o
btf.c:17:18: fatal error: gelf.h: No such file or directory
 #include <gelf.h>
                  ^
compilation terminated.
make: *** [build/staticobjs/btf.o] Error 1
 FAILED   

The missing development package is named elfutils-libelf-devel. Installing it with yum or dnf resolves the above error:

yum install -y elfutils-libelf-devel

Or for CentOS 8 and newer Fedora versions:

dnf install -y elfutils-libelf-devel
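In general, when pkg-config complains about a missing .pc file, the package that provides it can be looked up with the package manager – a quick sketch:

yum provides '*/libelf.pc'

Or with dnf on CentOS 8 and newer Fedora versions:

dnf provides '*/libelf.pc'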

Removing the default kernel in CentOS 8 – remove elrepo kernel

Removing the default kernel, i.e. the currently loaded kernel, in CentOS 8 may be challenging because the package is protected and cannot be removed by yum or dnf.
Here is the case: an elrepo kernel-ml is loaded and dnf prints that it cannot remove the package, because it is protected:

[root@srv ~]# dnf remove kernel-ml kernel-ml-core kernel-ml-modules
Error: 
 Problem: The operation would result in removing the following protected packages: kernel-ml-core
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
[root@srv ~]# uname -a
Linux srv.localhost 5.10.4-1.el8.elrepo.x86_64 #1 SMP Tue Dec 29 11:04:23 EST 2020 x86_64 x86_64 x86_64 GNU/Linux
[root@srv ~]# grubby --default-kernel
/boot/vmlinuz-5.10.4-1.el8.elrepo.x86_64

The system is booted with the very kernel we are trying to remove, so removing it is impossible.

The solution is to set a new default kernel and boot into it. Then dnf will be able to remove the first kernel.

For CentOS 7, just use yum instead of the dnf command.
Using grubby is really easy and straightforward:

STEP 1) List all installed and available to boot kernels

[root@srv ~]# grubby --info=ALL |grep ^kernel
kernel="/boot/vmlinuz-5.10.4-1.el8.elrepo.x86_64"
kernel="/boot/vmlinuz-4.18.0-259.el8.x86_64"
kernel="/boot/vmlinuz-4.18.0-257.el8.x86_64"
kernel="/boot/vmlinuz-0-rescue-45e12f0814fd4947b99cbdcb88950361"

STEP 2) Select the kernel to load the next time

[root@srv ~]# grubby --set-default "/boot/vmlinuz-4.18.0-259.el8.x86_64"
The default is /boot/loader/entries/45e12f0814fd4947b99cbdcb88950361-4.18.0-259.el8.x86_64.conf with index 1 and kernel /boot/vmlinuz-4.18.0-259.el8.x86_64
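Then reboot into the newly selected default kernel, and the removal from the beginning of the article should succeed – a short sketch:

[root@srv ~]# reboot
# after the reboot, verify the running kernel is no longer the elrepo one
[root@srv ~]# uname -r
[root@srv ~]# dnf remove kernel-ml kernel-ml-core kernel-ml-modules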

Keep on reading!

Using xtrabackup to make fast MySQL backups – backup and restore

Percona provides a really interesting tool for MySQL backups! It works on a live running MySQL server by copying the MySQL binary files from the data directory and because the tool knows how the engines work (InnoDB, MyISAM, and so on) it can make point-in-time consistent MySQL data files. Of course, using this tool on a database with InnoDB tables only is the best case, because no write lock will be used.

There are two main tasks when making a backup, plus one that is not mandatory:

  1. Execute xtrabackup with the --backup option to copy the MySQL data files and additional information for the tool.
  2. Execute xtrabackup with the --prepare option to make the MySQL (InnoDB) files ready for use by a MySQL server. The files are made consistent, i.e. the whole backup is consistent despite the different files having been copied at different times.
  3. Execute xtrabackup with the --prepare option again to further prepare the MySQL (InnoDB) files for use by a MySQL server – additional preparation such as the InnoDB log files and more. This step is not mandatory and may be skipped, because the MySQL server that uses the data files will create the InnoDB log files itself.

The directory with the copied MySQL files contains not only the MySQL files but additional information plus a backup of my.cnf (the current MySQL configuration). The backup files may be prepared (and restored) on a different server than the one the backup was made on.

xtrabackup can compress on-the-fly while copying the binary files with the option --compress, but then the steps to use (restore) the backup on a new server are different – there is an additional step to decompress before the prepare. So the above steps become (a minimal command sketch follows the list):

  1. xtrabackup --backup --compress – copy and compress on-the-fly.
  2. xtrabackup --decompress – decompress the backup files.
  3. xtrabackup --prepare – make point-in-time consistent MySQL data files, which are ready for a MySQL server.
  4. xtrabackup --prepare – additional preparation to make the start-up of the MySQL server with those files faster.
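A minimal command sketch of the compressed workflow, assuming the same data and target directories as in the example below:

xtrabackup --backup --compress --datadir=/var/lib/mysql/ --target-dir=/mnt/backups/sql-xtrabackup/
xtrabackup --decompress --target-dir=/mnt/backups/sql-xtrabackup/
xtrabackup --prepare --target-dir=/mnt/backups/sql-xtrabackup/

The --decompress step needs the qpress utility, which is installed together with percona-xtrabackup-24 in STEP 1.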

And here is a real-world example:

STEP 1) Install the xtrabackup utility

yum install -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm
yum install -y percona-xtrabackup-24 qpress

The percona-xtrabackup-24 package provides the Percona XtraBackup tool for backups of MySQL 5.1, 5.5, 5.6 and 5.7 servers, as well as Percona Server for MySQL with XtraDB. Install percona-xtrabackup-80 for use with MySQL 8 and later.
The whole output is included in the Bonus 3 section below.

STEP 2) Make backup

Make the backup. The datadir of the MySQL server is needed, along with a new directory for the backup files. The MySQL server is running and serving requests.

[root@srv ~]# xtrabackup --backup --slave-info --datadir=/var/lib/mysql/ --target-dir=/mnt/backups/sql-xtrabackup/
xtrabackup: recognized server arguments: --datadir=/var/lib/mysql --log_bin --server-id=1 --open_files_limit=5000 --innodb_buffer_pool_size=256M --innodb_log_buffer_size=32M --innodb_log_files_in_group=2 --innodb_flush_log_at_trx_commit=0 --innodb_file_per_table=1 --innodb_flush_method=O_DIRECT --datadir=/var/lib/mysql/ 
xtrabackup: recognized client arguments: --password=* --backup=1 --slave-info=1 --target-dir=/mnt/backups/sql-xtrabackup/ 
200919 14:35:11  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup' (using password: YES).
200919 14:35:11  version_check Connected to MySQL server
200919 14:35:11  version_check Executing a version check against the server...
200919 14:35:11  version_check Done.
200919 14:35:11 Connecting to MySQL server host: localhost, user: not set, password: set, port: not set, socket: not set
Using server version 5.7.31-log
xtrabackup version 2.4.20 based on MySQL server 5.7.26 Linux (x86_64) (revision id: c8b4056)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql/
xtrabackup: open files limit requested 5000, set to 5000
xtrabackup: using the following InnoDB configuration:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 50331648
xtrabackup: using O_DIRECT
InnoDB: Number of pools: 1
200919 14:35:11 >> log scanned up to (1135932912)
xtrabackup: Generating a list of tablespaces
InnoDB: Allocated tablespace ID 34 for mywordpress/wp_users, old maximum was 0
200919 14:35:12 [01] Copying ./ibdata1 to /mnt/backups/sql-xtrabackup/ibdata1
200919 14:35:12 >> log scanned up to (1135932912)
200919 14:35:13 >> log scanned up to (1135932912)
200919 14:35:14 [01]        ...done
200919 14:35:14 >> log scanned up to (1135932912)
200919 14:35:15 >> log scanned up to (1135932912)
200919 14:35:16 [01] Copying ./mywordpress/wp_users.ibd to /mnt/backups/sql-xtrabackup/mywordpress/wp_users.ibd
200919 14:35:16 [01]        ...done
200919 14:35:16 [01] Copying ./mywordpress/wp_gglcptch_whitelist.ibd to /mnt/backups/sql-xtrabackup/mywordpress/wp_gglcptch_whitelist.ibd
200919 14:35:16 [01]        ...done
200919 14:35:16 [01] Copying ./mywordpress/wp_usermeta.ibd to /mnt/backups/sql-xtrabackup/mywordpress/wp_usermeta.ibd
200919 14:35:16 [01]        ...done
200919 14:35:16 >> log scanned up to (1135932912)
200919 14:35:17 [01] Copying ./mywordpress/wp_postmeta.ibd to /mnt/backups/sql-xtrabackup/mywordpress/wp_postmeta.ibd
200919 14:35:17 [01]        ...done
200919 14:35:17 [01] Copying ./mywordpress/wp_posts.ibd to /mnt/backups/sql-xtrabackup/mywordpress/wp_posts.ibd
200919 14:35:17 >> log scanned up to (1135932912)
200919 14:35:18 >> log scanned up to (1135932912)
200919 14:35:19 [01]        ...done
200919 14:35:19 >> log scanned up to (1135932912)
.....
.....
200919 14:36:30 [01] Copying ./mysql/columns_priv.MYD to /mnt/backups/sql-xtrabackup/mysql/columns_priv.MYD
200919 14:36:30 [01]        ...done
200919 14:36:30 >> log scanned up to (1135933567)
200919 14:36:31 [01] Copying ./mysql/proxies_priv.frm to /mnt/backups/sql-xtrabackup/mysql/proxies_priv.frm
200919 14:36:31 [01]        ...done
200919 14:36:31 [01] Copying ./mysql/help_category.frm to /mnt/backups/sql-xtrabackup/mysql/help_category.frm
200919 14:36:31 [01]        ...done
200919 14:36:31 [01] Copying ./mysql/proc.frm to /mnt/backups/sql-xtrabackup/mysql/proc.frm
200919 14:36:31 [01]        ...done
200919 14:36:31 Finished backing up non-InnoDB tables and files
200919 14:36:31 [00] Writing /mnt/backups/sql-xtrabackup/xtrabackup_slave_info
200919 14:36:31 [00]        ...done
200919 14:36:31 [00] Writing /mnt/backups/sql-xtrabackup/xtrabackup_binlog_info
200919 14:36:31 [00]        ...done
200919 14:36:31 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
xtrabackup: The latest check point (for incremental): '1135933558'
xtrabackup: Stopping log copying thread.
.200919 14:36:31 >> log scanned up to (1135933567)

200919 14:36:32 Executing UNLOCK TABLES
200919 14:36:32 All tables unlocked
200919 14:36:32 [00] Copying ib_buffer_pool to /mnt/backups/sql-xtrabackup/ib_buffer_pool
200919 14:36:32 [00]        ...done
200919 14:36:32 Backup created in directory '/mnt/backups/sql-xtrabackup/'
MySQL binlog position: filename 'srv-bin.000001', position '1004'
200919 14:36:32 [00] Writing /mnt/backups/sql-xtrabackup/backup-my.cnf
200919 14:36:32 [00]        ...done
200919 14:36:32 [00] Writing /mnt/backups/sql-xtrabackup/xtrabackup_info
200919 14:36:32 [00]        ...done
xtrabackup: Transaction log of lsn (1135932903) to (1135933567) was copied.
200919 14:36:33 completed OK!

At the end, “completed OK!” must be printed.
The whole output of the command is included in the Bonus 1 section below. An example with the compress option enabled is in the Bonus 2 section below.
Here is what the backup directory contains:
Keep on reading!

Data too large, data for [] would be [] which is larger than the limit of

Rsyslog writing to Elasticsearch may hit an error for some of the records, which then fail to be saved in the backend:

{ ... { "error": { "root_cause": [ { "type": "circuit_breaking_exception", 
"reason": "[parent] Data too large, data for [<http_request>] would be [1008813778\/962mb], which is larger than the limit of [986061209\/940.3mb], 
real usage: [1008812248\/962mb], new bytes reserved: [1530\/1.4kb], usages [request=0\/0b, fielddata=317\/317b, in_flight_requests=1530\/1.4kb, accounting=178301893\/170mb]",
"bytes_wanted": 1008813778, "bytes_limit": 986061209, "durability": "PERMANENT" }], 
"type": "circuit_breaking_exception", "reason": "[parent] Data too large, data for [<http_request>] would be [1008813778\/962mb], which is larger than the limit of [986061209\/940.3mb], 
real usage: [1008812248\/962mb], new bytes reserved: [1530\/1.4kb], usages [request=0\/0b, fielddata=317\/317b, in_flight_requests=1530\/1.4kb, accounting=178301893\/170mb]",
"bytes_wanted": 1008813778, "bytes_limit": 986061209, "durability": "PERMANENT" }, "status": 429 } }

Unfortunately, such writes are not saved in Elasticsearch and the data is lost.

The problem here is that the Java VM has reached its maximum allowed memory, and the Java Virtual Machine should be allowed to use more memory.

Find the Java VM options file for Elasticsearch – jvm.options. In CentOS 7 the file is located in /etc/elasticsearch/jvm.options. Set more memory with the variables “-Xms[SIZE]g -Xmx[SIZE]g”, such as:

.....
-Xms4g
-Xmx4g
.....

This will allow 4G “maximum size of total heap space” to be used by the Java Virtual Machine. By default, it is 1G (-Xms1g -Xmx1g). It is a good idea to set it to half of the server’s memory. Save and restart the Elasticsearch service as usual:

systemctl restart elasticsearch

The variable should be visible in the command line of the running process with the ps command:

[root@loganalyzer ~]# ps axuf|grep elasticsearch
elastic+   592 10.8 34.4 168638848 5493156 ?   Ssl  00:56   4:23 /usr/share/elasticsearch/jdk/bin/java -Des.networkaddress.cache.ttl=60
-Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 
-Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false 
-Dlog4j2.disable.jmx=true -Djava.locale.providers=COMPAT 
-Xms4g -Xmx4g 
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly 
-Djava.io.tmpdir=/tmp/elasticsearch-16851535740012150929 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/elasticsearch 
-XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log elasticsearch
-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m 
-XX:MaxDirectMemorySize=2147483648 -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/etc/elasticsearch 
-Des.distribution.flavor=default -Des.distribution.type=rpm -Des.bundled_jdk=true 
-cp /usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch -p /var/run/elasticsearch/elasticsearch.pid --quiet
elastic+   690  0.0  0.0  70448  4516 ?        Sl   00:56   0:00  \_ /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller

The environment variable ES_JAVA_OPTS could be used, too.

ES_JAVA_OPTS="-Xms4g -Xmx4g" ./bin/elasticsearch 
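To verify the new heap size is really in effect, the node heap can be queried with the _cat/nodes API – a quick sketch, assuming Elasticsearch listens on localhost:9200 without authentication:

curl -s 'http://localhost:9200/_cat/nodes?v&h=name,heap.current,heap.percent,heap.max'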

Install OpenStack swift client only

To manage files in the OpenStack cloud you need the swift client. It is not such a tiny Python tool (it has a lot of dependencies) and it offers the following operations on files (files are objects in OpenStack):

  • delete – Delete files, directories and sub-directories. Be careful – with a simple command you can delete all your files at once. “Delete a container or objects within a container.”
  • download – Download files to your local storage. “Download objects from containers.”
  • list – List all files in the main directory (i.e. the container); files in sub-directories cannot be listed. The output is one file path per line, or extended output can be enabled to show the file size, time modified, the type of the file and the full file path. “Lists the containers for the account or the objects for a container.”
  • post – Creates containers and updates metadata, and can create security keys for temporary URLs – “Updates meta information for the account, container, or object; creates containers if not present.”
  • copy – Copy files to a new place within a container or to a different container. “Copies object, optionally adds meta”
  • stat – Display information for files, containers and your account. No information is available per directory. Most typically you will read the file length, the type and the modification date. “Displays information for the account, container, or object”
  • upload – Uploads files and whole directories into a container – “Uploads files or directories to the given container.” You can read our article here – Upload files and directories with swift in OpenStack
  • capabilities – List the configuration options of your account, such as file size limits, file count limits, request rate limits and so on, plus which additional middleware your instance supports, like bulk_delete and bulk_upload (the names are self-explanatory) – “List cluster capabilities.”
  • tempurl – Generate a temporary url for a file protected with time validity and a key – “Create a temporary URL.”
  • auth – Show your authentication token and storage url – “Display auth related environment variables.”

The text after the hyphens is from the swift help command. If you do not know what a container is – in simple words, these are the main sub-directories in your account, shown when you use the list command without any argument (not adding a name after “list”):

myuser@myserver:~$ swift --os-username myusr --os-tenant-name myusr --os-password mypass --os-auth-url https://auth01.example.com:5000/v2.0 list
mycontainer1
anyname

The best way is to install the swift client from the package system of your Linux distribution. The package system ensures the programs you install and upgrade stay consistent across dependency upgrades.

When you install the “python-swiftclient” package, it depends on multiple Python packages, which might be upgraded later, too, but the package system of a Linux distribution ensures the programs of python-swiftclient will keep working with them. On the contrary, if you install it manually, even with “pip” as offered on the OpenStack site, it is unsure what will happen, and your client program from “python-swiftclient” could even stop working because of an upgraded dependency library. Of course, the packages from the official package system could be a little older (but probably more stable) than the manual installs from the official OpenStack site. Still, if you would like to install it with “pip”, here you can check how to do it: https://www.swiftstack.com/docs/integration/python-swiftclient.html
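A minimal install sketch – the package name may differ slightly between distributions (python-swiftclient or python3-swiftclient), so check the repositories of your distribution:

# Ubuntu / Debian
apt install python3-swiftclient
# CentOS / RHEL (with the OpenStack repositories enabled)
dnf install python3-swiftclient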
Keep on reading!

CentOS 7 dracut-initqueue timeout and could not boot – warning /dev/disk/by-id/md-uuid- does not exist

Let’s say you update your software RAID layout – create, delete or modify a software RAID – and reboot the system, and the server does not start normally. After loading the remote video console (KVM), you see that the boot process reports a missing device and you are dropped into a console (the dracut console). The system is in “Emergency mode”.

The warning:

dracut-initqueue[504]: Warning: dracut-initqueue timeout - starting timeout scripts
dracut-initqueue[504]: Warning: dracut-initqueue timeout - starting timeout scripts
dracut-initqueue[504]: Warning: dracut-initqueue timeout - starting timeout scripts
....
....
dracut-initqueue[504]: Warning: could not boot.
dracut-initqueue[504]: Warning: /dev/disk/by-id/md-uuid-2fdc509e:8dd05ed3:c2350cb4:ea5a620d does not exist
      Starting Dracut Emergency Shell...
Warning: /dev/disk/by-id/md-uuid-2fdc509e:8dd05ed3:c2350cb4:ea5a620d does not exist

Generating "/run/initramfs/rdsosreport.txt"


Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
after mounting them and attach it to a bug report.


dracut:/#

This article is for problems which occur after manipulating storage RAID devices, not the system (root) or boot devices!!! If the missing device from “Warning: /dev/disk/by-id/md-uuid-2fdc509e:8dd05ed3:c2350cb4:ea5a620d does not exist” is either the root or the boot device, the proposed solution here – just exiting the Emergency Shell – will not help! In those cases, when the missing device is the root or boot one, the problem must be resolved before exiting the Emergency Shell, so that the devices and their file systems are available. There is another article on the subject, which may help the reader in such cases – CentOS 8 dracut-initqueue timeout and could not boot – warning /dev/disk/by-id/md-uuid- does not exist – inactive raids.
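Before simply exiting the Emergency Shell, it is worth checking what the kernel currently sees – a quick sketch of commands usually available in the dracut shell (assuming the mdraid module is included in the initramfs):

dracut:/# cat /proc/mdstat
dracut:/# mdadm --detail --scan
dracut:/# journalctl | tail -n 50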

SCREENSHOT 1) The boot process reports multiple warning messages of dracut-initqueue timeout, because a drive cannot be found.


Keep on reading!