Delete GlusterFS volume when a peer is down – failed: Some of the peers are down

Deleting a GlusterFS volume may fail with an error stating that some of the peers are down, i.e. disconnected. The error appears even when all the peers hosting the bricks of the volume are available, and the volume cannot be deleted.
That’s because GlusterFS by design stores the volume configuration on all peers, whether or not they host a brick/arbiter of the volume. Every peer that is part of the GlusterFS setup must be online and connected in the peer status before a volume can be deleted.
If the user still wants to delete the volume:

  1. Force remove the brick hosted on the disconnected peer, if there is one.
  2. Detach the disconnected peer from the cluster
  3. Delete the volume
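
The three steps can be sketched as a dry run that only prints the gluster commands instead of executing them. This is a minimal sketch, not a definitive procedure: the helper name plan_volume_delete and its arguments are hypothetical, the replica count passed to remove-brick depends on the volume layout, and detaching a disconnected peer may additionally need the force keyword. Check the printed commands against the GlusterFS documentation before running anything.

```shell
# Dry run: print the commands for the steps above; nothing is executed.
# plan_volume_delete VOLUME PEER [REPLICA_COUNT BRICK_PATH]   (hypothetical helper)
# REPLICA_COUNT is the replica count the volume should have after the removal.
plan_volume_delete() {
    vol=$1; peer=$2; replica=$3; brick=$4
    # Step 1 applies only when the disconnected peer hosts a brick of the volume.
    if [ -n "$brick" ]; then
        echo "gluster volume remove-brick $vol replica $replica $peer:$brick force"
    fi
    # Step 2: drop the disconnected peer from the trusted pool.
    echo "gluster peer detach $peer"
    # Step 3: a volume has to be stopped before it can be deleted.
    echo "gluster volume stop $vol"
    echo "gluster volume delete $vol"
}

# Usage (no brick of VOL2 lives on the disconnected peer ng3):
#   plan_volume_delete VOL2 ng3
```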

Here are real examples with and without a brick on the unavailable peer.
The initial volumes and peers configuration:

[root@srv1 ~]# gluster volume info
 
Volume Name: VOL1
Type: Replicate
Volume ID: 02ff2995-7307-4f3d-aa24-862edda7ce81
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ng1:/mnt/storage1/glusterfs/brick1
Brick2: ng3:/mnt/storage1/glusterfs/brick1
Brick3: ng1:/mnt/storage1/glusterfs/arbiter1 (arbiter)
Options Reconfigured:
features.scrub: Active
features.bitrot: on
cluster.self-heal-daemon: enable
storage.linux-io_uring: off
client.event-threads: 4
performance.cache-max-file-size: 50MB
performance.parallel-readdir: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.cache-size: 2048MB
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
 
Volume Name: VOL2
Type: Replicate
Volume ID: fc2e82e4-2576-4bb1-b9bf-c6b2aff10ef0
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ng1:/mnt/storage1/glusterfs/brick2
Brick2: ng2:/mnt/storage1/glusterfs/brick2
Brick3: ng1:/mnt/storage1/glusterfs/arbiter2 (arbiter)
Options Reconfigured:
features.scrub: Active
features.bitrot: on
cluster.self-heal-daemon: enable
storage.linux-io_uring: off
performance.parallel-readdir: on
network.compression: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
features.cache-invalidation: on

[root@srv ~]# gluster peer status
Number of Peers: 2

Hostname: ng1
Uuid: 7953514b-b52c-4a5c-be03-763c3e24eb4e
State: Peer in Cluster (Connected)

Hostname: ng3
Uuid: 3d273834-eca6-4997-871f-1a282ca90fb0
State: Peer in Cluster (Disconnected)

Delete a GlusterFS volume – all bricks and bricks’ peers are available, but another peer is not.

First, the error that appears while the disconnected peer is still in the peer status list.

[root@srv ~]# gluster volume stop VOL2
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: VOL2: success
[root@srv ~]# gluster volume delete VOL2
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: VOL2: failed: Some of the peers are down
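
Before retrying the delete, it can help to list exactly which peers are reported as disconnected. The following is a small sketch that filters the gluster peer status output in the format shown above; the function name disconnected_peers is hypothetical, and the parsing assumes the Hostname/State layout stays as in this output.

```shell
# Print the hostname of every peer whose state is "(Disconnected)".
# Reads `gluster peer status` output from standard input.
disconnected_peers() {
    awk '/^Hostname:/ { host = $2 }
         /^State:/ && /\(Disconnected\)/ { print host }'
}

# Usage:
#   gluster peer status | disconnected_peers
```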


Removal of kwayland-server and “kwayland-server” is soft blocking kde-plasma/kwin-5.25.2

A big change for KDE Plasma happened two months ago – “Merge kwayland-server into kwin”.
So starting with KDE Plasma 5.25, there is no kwayland-server any more (no kwayland-server with version 5.25 exists and there is no such package in Gentoo), and the old package may block a Gentoo update with the following error:

mydesktop root # emerge -va --verbose-conflicts --verbose --backtrack=300 $(qlist -IC|grep -i kde)
......
......
[ebuild     U  ] dev-util/kdevelop-php-22.04.2:5::gentoo [21.12.3:5::gentoo] USE="handbook -debug -test" 1,057 KiB
[ebuild     U  ] kde-apps/umbrello-22.04.2:5::gentoo [21.12.3:5::gentoo] USE="handbook php -debug -test" 5,544 KiB
[ebuild     U  ] kde-apps/kross-interpreters-22.04.2:5::gentoo [21.12.3:5::gentoo] USE="-debug" 149 KiB
[blocks B      ] kde-plasma/kwayland-server ("kde-plasma/kwayland-server" is soft blocking kde-plasma/kwin-5.25.2)

Total: 340 packages (329 upgrades, 5 new, 6 reinstalls), Size of downloads: 1,001,699 KiB
Conflict: 1 block (1 unsatisfied)

 * Error: The above package list contains packages which cannot be
 * installed at the same time on the same system.

  (kde-plasma/kwayland-server-5.24.5-r1:5/5::gentoo, ebuild scheduled for merge) pulled in by
    kde-plasma/kwayland-server
    kde-plasma/kwayland-server:5::gentoo required by @selected 
    kde-plasma/kwayland-server required by @selected 

  (kde-plasma/kwin-5.25.2:5/5::gentoo, ebuild scheduled for merge) pulled in by
    >=kde-plasma/kwin-5.25.2:5 required by (kde-plasma/plasma-desktop-5.25.2:5/5::gentoo, ebuild scheduled for merge) USE="handbook ibus kaccounts scim semantic-desktop -debug -emoji -telemetry -test" ABI_X86="(64)"
    >=kde-plasma/kwin-5.25.2:5[lock] required by (kde-plasma/plasma-meta-5.25.2:5/5::gentoo, ebuild scheduled for merge) USE="accessibility bluetooth browser-integration crash-handler crypt desktop-portal display-manager elogind gtk handbook kwallet legacy-systray networkmanager pulseaudio sddm smart wallpapers -colord -discover (-firewall) -grub -plymouth -sdk -systemd -thunderbolt" ABI_X86="(64)"
    >=kde-plasma/kwin-5.25.2:5 required by (kde-plasma/libkworkspace-5.25.2:5/5::gentoo, ebuild scheduled for merge) USE="-debug -test" ABI_X86="(64)"
    >=kde-plasma/kwin-5.25.2:5 required by (kde-plasma/plasma-workspace-5.25.2:5/5::gentoo, ebuild scheduled for merge) USE="calendar fontconfig geolocation handbook policykit semantic-desktop -appstream -debug -gps -screencast -telemetry -test" ABI_X86="(64)"

emerge could not continue with the upgrade to KDE Plasma 5.25.2.


kwayland-server is pulled in by @selected, but the latest version of the package is from the 5.24 release. That should immediately signal that something is wrong with it, because the emerge command shows the latest KDE Plasma version to be 5.25 (exactly 5.25.2).

Solution – deselect/remove kde-plasma/kwayland-server

The solution is simple: deselect the package from the world set to be sure it won’t be pulled in again in the future. Remove the package manually if the error still persists, but deselecting alone should work. Of course, it should not be selected on the emerge command line, either. In general, this package won’t be available any more.
Always keep an eye on the pulled versions and the versions you are trying to install. Most of the time the problem is obvious and comes from a single “wrong/bad” package, which may generate a great deal of erroneous and frightening dependency output.

mydesktop root # emerge --deselect kwayland-server
>>> Removing kde-plasma/kwayland-server from "world" favorites file...
>>> Removing kde-plasma/kwayland-server:5::gentoo from "world" favorites file...
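
To confirm whether a package is recorded in the world file, before or after deselecting it, a check along the following lines may help. It is a minimal sketch: the function name in_world is hypothetical, and the default path /var/lib/portage/world is an assumption about a standard Portage installation. It matches both the plain atom and entries with a slot or repository suffix, like the two entries removed above.

```shell
# in_world ATOM [WORLD_FILE]   (hypothetical helper)
# Succeeds when ATOM is listed in the world file, either as the plain
# category/name or with a ":slot" / ":slot::repo" suffix.
in_world() {
    atom=$1
    world=${2:-/var/lib/portage/world}
    awk -v a="$atom" '
        $0 == a || index($0, a ":") == 1 { found = 1 }
        END { exit !found }
    ' "$world"
}

# Usage:
#   in_world kde-plasma/kwayland-server && echo "still selected"
```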

Now the emerge command works and reports no problems with the dependencies and blocks:

mydesktop root # emerge -va --verbose-conflicts --verbose --backtrack=300 $(qlist -IC|grep -i kde|grep -v kwayland-server)
......
......
[ebuild  N     ] kde-plasma/kwin-5.25.2:5::gentoo  USE="accessibility (caps) handbook lock multimedia -debug -gles2-only -plasma -screencast -test" 6,468 KiB
[uninstall     ] kde-plasma/kwayland-server-5.24.3:5::gentoo  USE="-debug -doc -test" 
[blocks b      ] kde-plasma/kwayland-server ("kde-plasma/kwayland-server" is soft blocking kde-plasma/kwin-5.25.2)
[ebuild     U  ] kde-plasma/libkworkspace-5.25.2:5::gentoo [5.24.3:5::gentoo] USE="-debug -test" 0 KiB
......
......
[ebuild     U  ] kde-apps/akregator-22.04.2:5::gentoo [21.12.3:5::gentoo] USE="handbook -debug -speech -telemetry -test" 2,209 KiB

Total: 339 packages (328 upgrades, 5 new, 6 reinstalls, 1 uninstall), Size of downloads: 1,001,483 KiB
Conflict: 1 block (all satisfied)

More on Gentoo blocking – Gentoo update tips when updating packages with blocks and masked files

Building python 3.10.4 and possibly undefined macro: AC_MSG_ERROR

Emerging the new python 3.10 in Gentoo may lead to the following error, even with all the dependencies installed. This error might also occur in any other Linux distro! During the configure stage, the autoconf tool outputs an error:

root@srv ~ # cat /var/tmp/portage/dev-lang/python-3.10.5/temp/autoconf.out 
***** autoconf *****
***** PWD: /var/tmp/portage/dev-lang/python-3.10.5/work/Python-3.10.5
***** autoconf --force

configure.ac:59: warning: The macro `AC_CONFIG_HEADER' is obsolete.
configure.ac:59: You should run autoupdate.
./lib/autoconf/status.m4:719: AC_CONFIG_HEADER is expanded from...
configure.ac:59: the top level
configure.ac:911: warning: AC_LINK_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS
./lib/autoconf/specific.m4:364: AC_USE_SYSTEM_EXTENSIONS is expanded from...
configure.ac:911: the top level
configure.ac:2214: warning: The macro `AC_HEADER_STDC' is obsolete.
configure.ac:2214: You should run autoupdate.
./lib/autoconf/headers.m4:704: AC_HEADER_STDC is expanded from...
configure.ac:2214: the top level
configure.ac:4250: warning: The macro `AC_HEADER_TIME' is obsolete.
configure.ac:4250: You should run autoupdate.
./lib/autoconf/headers.m4:743: AC_HEADER_TIME is expanded from...
configure.ac:4250: the top level
configure.ac:18: error: possibly undefined macro: AC_MSG_ERROR
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation

It appears a dependency is missing! Build the package sys-devel/autoconf-archive first, and then the build of python-3.10.5 will finish successfully.

root@srv ~ # emerge -va autoconf-archive

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild  N     ] sys-devel/autoconf-archive-2022.02.11::gentoo  660 KiB

Total: 1 package (1 new), Size of downloads: 660 KiB

Would you like to merge these packages? [Yes/No] yes

Emerging dev-lang/python-3.10.5::gentoo outputs the error and the build process stops. The error output of the emerge command is not very informative. The actual error is in /var/tmp/portage/dev-lang/python-3.10.5/temp/autoconf.out.
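
Because the real cause is buried among the obsolete-macro warnings, a small filter can pull only the undefined macros out of autoconf.out. This is a sketch; the function name undefined_macros is hypothetical, and the pattern assumes the error line has the same wording as in the output above.

```shell
# Print the name of every macro reported as "possibly undefined"
# in the given autoconf output file.
undefined_macros() {
    sed -n 's/.*error: possibly undefined macro: \([A-Za-z0-9_]*\).*/\1/p' "$1"
}

# Usage:
#   undefined_macros /var/tmp/portage/dev-lang/python-3.10.5/temp/autoconf.out
```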

Recovery of MySQL 8 Cluster instance after server crash and corrupted data in log event

There is a MySQL 8 InnoDB Cluster of three servers, and one of the servers crashed because of bad RAM. The same setup is described here – Install and deploy MySQL 8 InnoDB Cluster with 3 nodes under CentOS 8 and MySQL Router for HA. The failed server was restarted without a clean shutdown. After booting up, the MySQL Cluster node tried to recover automatically, but the recovery process failed and the node left the group of three servers:

2022-05-31T04:00:00.322469Z 24 [ERROR] [MY-011620] [Repl] Plugin group_replication reported: 'Fatal error during the incremental recovery process of Group Replication. The server will leave the group.'
2022-05-31T04:00:00.322489Z 24 [Warning] [MY-011645] [Repl] Plugin group_replication reported: 'Skipping leave operation: concurrent attempt to leave the group is on-going.'
2022-05-31T04:00:00.322500Z 24 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2022-05-31T04:00:03.448475Z 0 [System] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'

The recovery process proposed here follows these steps:

  1. Connect with mysqlsh (MySQL Shell) to a MySQL instance that is currently part of the cluster group. The member that left the group is no longer part of it, though the MySQL Cluster status still shows it in the cluster topology, with an error.
  2. Remove the bad instance from the MySQL Cluster with removeInstance.
  3. Add the instance with addInstance and the recovery process will kick in. The type of the recovery process is chosen by the setup if not specified. In this case, the setup chose Incremental state recovery over (full) clone mode.
  4. Initiate the cluster rescan operation to recover the group replication and the MySQL Cluster.
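
Steps 2 to 4 can be sketched as a helper that only prints the MySQL Shell (JavaScript mode) commands, so they can be reviewed and then entered in the mysqlsh session opened in step 1. This is a sketch under assumptions: the function name recovery_script is hypothetical, the account and host in the usage line are examples, and the force option on removeInstance is an assumption for an instance that is no longer ONLINE.

```shell
# recovery_script INSTANCE   (hypothetical helper)
# Prints the AdminAPI calls for steps 2-4; nothing is sent to a server.
recovery_script() {
    instance=$1
    cat <<EOF
var cluster = dba.getCluster();
cluster.removeInstance('$instance', {force: true});
cluster.addInstance('$instance');
cluster.rescan();
EOF
}

# Usage (instance address is an example):
#   recovery_script clusteradmin@db-cluster-3:3306
```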


Summary of the recovery process

  • The recovery process was successful.
  • The distributed recovery with Incremental state recovery took 24 hours to finish for a 200 MB database, which is really strange; the speed was really bad. The instance uses ordinary disks, not SSDs, and a 1 Gbps network.
  • No need to change or manage the MySQL Router in any of the steps or the recovery stages. It handled the situation from the very beginning by removing the bad instance and then adding it again only after the recovery process had finished successfully.
  • MySQL Shell should be connected to a healthy instance that is currently part of the Cluster.

In the console output logs all commands and important lines are highlighted.

STEP 1) Remove the bad instance from the cluster.

The status of the cluster with the bad instance.

[root@db-cluster-3 ~]# mysqlsh
MySQL Shell 8.0.28

Copyright (c) 2016, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
 MySQL  JS > \connect clusteradmin@db-cluster-1
Creating a session to 'clusteradmin@db-cluster-1'
Fetching schema names for autocompletion... Press ^C to stop.
Closing old connection...
Your MySQL connection id is 39806649 (X protocol)
Server version: 8.0.28 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
 MySQL  db-cluster-1:33060+ ssl  JS > var cluster = dba.getCluster()
 MySQL  db-cluster-1:33060+ ssl  JS > cluster.status()
{
    "clusterName": "mycluster1", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "db-cluster-1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active.", 
        "topology": {
            "db-cluster-1:3306": {
                "address": "db-cluster-1:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-2:3306": {
                "address": "db-cluster-2:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-3:3306": {
                "address": "db-cluster-3:3306", 
                "instanceErrors": [
                    "ERROR: group_replication has stopped with an error."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "ERROR", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "8.0.28"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "db-cluster-1:3306"
}


lxc and interface lo does not exist in virtualized server

Virtualizing a real server with an LXC container is pretty easy: do an rsync and run it. Sometimes there are glitches when starting the LXC container for the first time, such as the following: no networking is available at the start, but after attaching to the started container it appears to have network interfaces with no IPs. Even though it is possible to set the IPs manually, the init scripts do not work.

[root@srv ~]# lxc-start -F -n n7763.node-int.info
lxc-start: live300.mytv.bg: start.c: proc_pidfd_open: 1607 Function not implemented - Failed to send signal through pidfd
INIT: version 2.88 booting

   OpenRC 0.12.4 is starting up Gentoo Linux (x86_64) [LXC]

 * /proc is already mounted
 * Mounting /run ... * /run/openrc: creating directory
 * /run/lock: creating directory
 * /run/lock: correcting owner
 * Caching service dependencies ... [ ok ]
 * setting up tmpfiles.d entries for /dev ... [ ok ]
 * Creating user login records ... [ ok ]
 * Wiping /tmp directory ... [ ok ]
 * Bringing up network interface lo ...RTNETLINK answers: File exists
 [ ok ]
 * Updating /etc/mtab ... [ ok ]
 * Bringing up interface lo
 *   ERROR: interface lo does not exist
 *   Ensure that you have loaded the correct kernel module for your hardware
 * ERROR: net.lo failed to start
 * setting up tmpfiles.d entries ... [ ok ]
INIT: Entering runlevel: 3
 * Loading iptables state and starting firewall ... [ ok ]
 * Bringing up interface lo
 *   ERROR: interface lo does not exist
 *   Ensure that you have loaded the correct kernel module for your hardware
 * ERROR: net.lo failed to start
 * Bringing up interface eth0
 *   ERROR: interface eth0 does not exist
 *   Ensure that you have loaded the correct kernel module for your hardware
 * ERROR: net.eth0 failed to start

It turned out that the old /dev was still in place, which interfered with the virtualization and the init scripts.
The solution is simple:

  1. remove the existing /dev
  2. create a new empty one

And the LXC container of the real server will start with a network as usual.

So when virtualizing a real server into an LXC container after doing an rsync of the storage, it is mandatory to create empty /dev, /proc, and /sys directories!
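
Replacing the copied directories with empty ones could look like the sketch below. The function name reset_pseudo_dirs is hypothetical and the rootfs path in the usage line is only an assumed example. The container must be stopped and nothing may be mounted under these directories, because the function deletes their contents.

```shell
# reset_pseudo_dirs ROOTFS   (hypothetical helper)
# Replace dev, proc and sys inside the container root with empty directories.
reset_pseudo_dirs() {
    rootfs=${1%/}
    # Refuse an empty path (which also covers "/") and non-directories.
    [ -n "$rootfs" ] && [ -d "$rootfs" ] || {
        echo "reset_pseudo_dirs: invalid rootfs: '$1'" >&2
        return 1
    }
    for d in dev proc sys; do
        rm -rf "${rootfs:?}/$d"
        mkdir -m 755 "$rootfs/$d"
    done
}

# Usage (path is an example):
#   reset_pseudo_dirs /var/lib/lxc/mycontainer/rootfs
```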

More on the LXC containers – Run LXC CentOS 8 container with bridged network under CentOS 8.

git status and bus error on SSD – fix READ errors by recovering part of the file

Combining an SSD with Linux encryption may not be the best idea, especially without the TRIM (allow-discards) option (or if fstrim has never been executed). Nevertheless, this error may occur not only on an SSD device, but anywhere there is a corrupted file system or device.
In our case, the SSD has some read errors. Apparently, some files or some parts of files could not be read by the git command:

[myuser@dekstop kernel]# git status -v
Bus error 84/115708)

In the case of SSD bad reads, the only working solution is to find the problem file(s) and either overwrite them or remove and recreate them. A more sophisticated solution is to dump the file to another location with dd and its skip-errors option enabled, and then overwrite the old file with the new one. Only the corrupted area of the file is lost, which in most cases is just one or two sectors, i.e. one or two 512-byte blocks of data.

STEP 1) Find the bad files with the find command.

Use the find command to read every file with cat, so a bad sector produces an input/output error on read. Writes do not generate errors; instead, the sector is automatically remapped to a healthy one (the bad sector is marked and never used again).

[myuser@dekstop kernel]#  find -type f -exec cat {} > /dev/null \;
cat: ./servers/logo_description.txt: Input/output error

If multiple files are found, repeat the procedure for each file.

STEP 2) Copy the healthy portion of the file.

The easiest way to remove the error is just to delete the file (or overwrite it), but if the healthy portion of the file is needed, the dd utility may be used to recover it.
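
A minimal sketch of such a dd copy follows. It assumes GNU coreutils (for stat -c and truncate); the function name recover_readable and the destination path in the usage line are hypothetical. With conv=noerror,sync, dd continues past read errors and fills each unreadable 512-byte block with zero bytes, so the rest of the data stays at its original offset.

```shell
# recover_readable SRC DST   (hypothetical helper)
# Copy SRC to DST in 512-byte blocks, skipping over read errors.
recover_readable() {
    src=$1; dst=$2
    # noerror: keep going after a read error.
    # sync: pad every short or failed block with zero bytes.
    dd if="$src" of="$dst" bs=512 conv=noerror,sync
    # sync also pads the last block, so cut the copy back to the original size.
    truncate -s "$(stat -c %s "$src")" "$dst"
}

# Usage (destination path is an example):
#   recover_readable ./servers/logo_description.txt /tmp/logo_description.txt
```

The recovered copy can then be written over the damaged file.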

Debug options for LXC and lxc-start when lxc container could not start

Setting up and running an LXC container is really easy, but sometimes it is unclear why the LXC container could not start. Most of the time, there is only a generic error, which says nothing about the real reason:

root@srv ~ # lxc-start -n test-lxc
lxc-start: test-lxc: lxccontainer.c: wait_on_daemonized_start: 867 Received container state "ABORTING" instead of "RUNNING"
lxc-start: test-lxc: tools/lxc_start.c: main: 306 The container failed to start
lxc-start: test-lxc: tools/lxc_start.c: main: 309 To get more details, run the container in foreground mode
lxc-start: test-lxc: tools/lxc_start.c: main: 311 Additional information can be obtained by setting the --logfile and --logpriority options

The output gives no specific reason why the LXC container test-lxc cannot be started and why the lxc-start command failed. There is just a suggestion to use the logging options, and the administrator of the box may do so by including the following lxc-start options:

-l DEBUG --logfile=test-lxc.log --logpriority=9

Here is a real-world example of an old kernel trying to run LXC 4.0

Gentoo – bash: su: command not found – missing su flag

Upgrading multiple packages may lead to interesting results, especially if the emerge queue has not finished yet or fails with an error! There are two main packages that can install the basic su command on the system:

  1. sys-apps/shadow
  2. sys-apps/util-linux

At some point, the default inclusion of the su USE flag changed from sys-apps/shadow to sys-apps/util-linux, which may lead to the following interesting error:

user@srv ~ $ su
bash: /bin/su: command not found

Just check which of the above packages has the su flag enabled and re-emerge it. At present, sys-apps/util-linux enables it by default, so it should work without any explicit activation in a Portage USE configuration file (for example, /etc/portage/package.use/mybase or /etc/portage/make.conf).

At the moment, here is the default:

user@srv ~ # emerge -vp shadow util-linux

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild   R    ] sys-apps/shadow-4.11.1:0/4::gentoo  USE="acl (audit) nls pam (selinux) (split-usr) xattr -bcrypt -cracklib -skey -su" 0 KiB
[ebuild   R    ] sys-apps/util-linux-2.37.2-r3::gentoo  USE="(audit) (caps) cramfs hardlink logger ncurses nls pam python readline (selinux) (split-usr) su suid udev (unicode) -build -cryptsetup -fdformat -kill -magic (-rtas) -slang -static-libs -systemd -test -tty-helpers" ABI_X86="32 (64) (-x32)" PYTHON_TARGETS="python3_8 -python3_9 -python3_10" 0 KiB

Total: 2 packages (2 reinstalls), Size of downloads: 0 KiB

su flag is missing in sys-apps/shadow and is included in sys-apps/util-linux
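
The same check can be done without reading the long USE lists by eye. The sketch below filters emerge -vp output and prints only the packages whose USE list contains the enabled su flag; the function name su_providers is hypothetical, and the parsing assumes the [ebuild ...] line layout shown above.

```shell
# Print every package from `emerge -vp` output whose USE list has "su" enabled.
# Disabled flags ("-su") and other flags such as "suid" are not matched.
su_providers() {
    awk '/^\[ebuild/ {
        pkg = ""
        # The package atom is the field right after the closing "]".
        for (i = 1; i <= NF; i++) if ($i ~ /\]$/) { pkg = $(i + 1); break }
        if (match($0, /USE="[^"]*"/)) {
            use = substr($0, RSTART + 5, RLENGTH - 6)
            n = split(use, flags, " ")
            for (j = 1; j <= n; j++) if (flags[j] == "su") print pkg
        }
    }'
}

# Usage:
#   emerge -vp shadow util-linux | su_providers
```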

Kernel building failure – unable to initialize decompress status for section .debug_info

Upgrading a Gentoo system may lead to some glitches, especially if the world set is not emerged very often!
In fact, leaving older packages installed alongside the new ones using slots may also lead to glitches of the same type! Here is one example, where leaving old packages installed may prevent the user from building some packages or the Linux kernel itself.

.....
.....
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o lib/slub_kunit.ko lib/slub_kunit.o lib/slub_kunit.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o lib/ts_bm.ko lib/ts_bm.o lib/ts_bm.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o lib/ts_fsm.ko lib/ts_fsm.o lib/ts_fsm.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o lib/ts_kmp.ko lib/ts_kmp.o lib/ts_kmp.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o mm/kfence/kfence_test.ko mm/kfence/kfence_test.o mm/kfence/kfence_test.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o net/6lowpan/6lowpan.ko net/6lowpan/6lowpan.o net/6lowpan/6lowpan.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o net/6lowpan/nhc_dest.ko net/6lowpan/nhc_dest.o net/6lowpan/nhc_dest.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o net/6lowpan/nhc_fragment.ko net/6lowpan/nhc_fragment.o net/6lowpan/nhc_fragment.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o net/6lowpan/nhc_ghc_ext_dest.ko net/6lowpan/nhc_ghc_ext_dest.o net/6lowpan/nhc_ghc_ext_dest.mod.o;  true
x86_64-pc-linux-gnu-ld.bfd: mm/kfence/kfence_test.o: unable to initialize decompress status for section .debug_info
x86_64-pc-linux-gnu-ld.bfd: mm/kfence/kfence_test.o: unable to initialize decompress status for section .debug_info
x86_64-pc-linux-gnu-ld.bfd: mm/kfence/kfence_test.o: unable to initialize decompress status for section .debug_info
x86_64-pc-linux-gnu-ld.bfd: mm/kfence/kfence_test.o: unable to initialize decompress status for section .debug_info
mm/kfence/kfence_test.o: file not recognized: file format not recognized
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o net/6lowpan/nhc_ghc_ext_frag.ko net/6lowpan/nhc_ghc_ext_frag.o net/6lowpan/nhc_ghc_ext_frag.mod.o;  true
make[3]: *** [/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/work/linux-5.15/scripts/Makefile.modfinal:59: mm/kfence/kfence_test.ko] Error 1
make[3]: *** Waiting for unfinished jobs....
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o net/6lowpan/nhc_ghc_ext_hop.ko net/6lowpan/nhc_ghc_ext_hop.o net/6lowpan/nhc_ghc_ext_hop.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o net/6lowpan/nhc_ghc_ext_route.ko net/6lowpan/nhc_ghc_ext_route.o net/6lowpan/nhc_ghc_ext_route.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o net/6lowpan/nhc_ghc_icmpv6.ko net/6lowpan/nhc_ghc_icmpv6.o net/6lowpan/nhc_ghc_icmpv6.mod.o;  true
  x86_64-pc-linux-gnu-ld.bfd -r -m elf_x86_64 --build-id=sha1  -T scripts/module.lds -o net/6lowpan/nhc_ghc_udp.ko net/6lowpan/nhc_ghc_udp.o net/6lowpan/nhc_ghc_udp.mod.o;  true
make[2]: *** [/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/work/linux-5.15/scripts/Makefile.modpost:140: __modpost] Error 2
make[1]: *** [/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/work/linux-5.15/Makefile:1783: modules] Error 2
make[1]: Leaving directory '/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/work/build'
make: *** [Makefile:219: __sub-make] Error 2
 * ERROR: sys-kernel/gentoo-kernel-5.15.5::gentoo failed (compile phase):
 *   emake failed
 * 
 * If you need support, post the output of `emerge --info '=sys-kernel/gentoo-kernel-5.15.5::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=sys-kernel/gentoo-kernel-5.15.5::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/temp/environment'.
 * Working directory: '/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/work/linux-5.15'
 * S: '/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/work/linux-5.15'

>>> Failed to emerge sys-kernel/gentoo-kernel-5.15.5, Log file:

>>>  '/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/temp/build.log'

 * Messages for package sys-kernel/gentoo-kernel-5.15.5:

 * ERROR: sys-kernel/gentoo-kernel-5.15.5::gentoo failed (compile phase):
 *   emake failed
 * 
 * If you need support, post the output of `emerge --info '=sys-kernel/gentoo-kernel-5.15.5::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=sys-kernel/gentoo-kernel-5.15.5::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/temp/environment'.
 * Working directory: '/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/work/linux-5.15'
 * S: '/var/tmp/portage/sys-kernel/gentoo-kernel-5.15.5/work/linux-5.15'

The problem appears to be caused by multiple old versions of sys-devel/binutils left installed:

 sys-devel/binutils
    selected: 2.31.1-r3 2.32-r1 2.34 2.35.2  
   protected: none 
     omitted: 2.37_p1-r1

Removing the old versions of sys-devel/binutils 2.31.1-r3 2.32-r1 2.34 2.35.2 and leaving only the latest one solves the problem with the above error “unable to initialize decompress status for section .debug_info”.

emerge -vaC "<sys-devel/binutils-2.37_p1-r1"

Of course, an older version of sys-devel/binutils or a buggy one could lead to such an error! Update the sys-devel/binutils or change the version if the above error is hit.

Elasticsearch failed to set password apm_system error in initial setup

A relatively typical error when installing a single-node Elasticsearch setup occurs when the passwords are set:

[root@loganalyzer elasticsearch]# ./bin/elasticsearch-setup-passwords -v auto

Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user.
The passwords will be randomly generated and printed to the console.
Please confirm that you would like to continue [y/N]y



Connection failure to: http://192.168.0.4:9200/_security/user/apm_system/_password?pretty failed: Read timed out


ERROR: Failed to set password for user [apm_system].

Such an error may prevent the initial setting of several important passwords and compromise the Elasticsearch security model. The error appears even when

discovery.type: single-node

is included in /etc/elasticsearch/elasticsearch.yml. The option missing from the configuration file /etc/elasticsearch/elasticsearch.yml is:

discovery.seed_hosts: ["node-1"]

By default, this option is commented out. It should be set at initial installation, though it is not required for starting the Elasticsearch node (with no security model enabled)!
The option is an array with the hostnames of all servers in the cluster setup. In single-node mode, discovery.seed_hosts should contain only the hostname of the single node, “node-1” in this case. Use the actual hostname of the current server, not the example name “node-1”!
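
For a single node, adding the option with the real hostname could be sketched as follows. The function name ensure_seed_hosts is hypothetical; it appends the line only when no uncommented discovery.seed_hosts entry exists, and it assumes the file ends with a newline. Elasticsearch has to be restarted afterwards to pick up the change.

```shell
# ensure_seed_hosts CONFIG_FILE HOSTNAME   (hypothetical helper)
# Append discovery.seed_hosts for a single node unless it is already set.
ensure_seed_hosts() {
    conf=$1; host=$2
    # An existing uncommented entry is left untouched.
    if grep -E -q '^[[:space:]]*discovery\.seed_hosts:' "$conf"; then
        return 0
    fi
    printf 'discovery.seed_hosts: ["%s"]\n' "$host" >> "$conf"
}

# Usage:
#   ensure_seed_hosts /etc/elasticsearch/elasticsearch.yml "$(hostname)"
```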

Setting the right hostname for discovery.seed_hosts in /etc/elasticsearch/elasticsearch.yml lets the user set all passwords with the Elasticsearch tool elasticsearch-setup-passwords.

The error may occur in a cluster setup with multiple servers, too, if the hosts are not filled in this option – discovery.seed_hosts.
Here is what to expect when executing elasticsearch-setup-passwords (even with some RED indexes):

[root@loganalyzer ~]# cd /usr/share/elasticsearch/
[root@loganalyzer elasticsearch]# ./bin/elasticsearch-setup-passwords -v auto

Your cluster health is currently RED.
This means that some cluster data is unavailable and your cluster is not fully functional.

It is recommended that you resolve the issues with your cluster before running elasticsearch-setup-passwords.
It is very likely that the password changes will fail when run against an unhealthy cluster.

Do you want to continue with the password setup process [y/N]y

Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user.
The passwords will be randomly generated and printed to the console.
Please confirm that you would like to continue [y/N]y


Changed password for user apm_system
PASSWORD apm_system = judakai2Wai9Saiph8ah

Changed password for user kibana_system
PASSWORD kibana_system = eisiadit3CieG4Requie

Changed password for user kibana
PASSWORD kibana = bi3NohquohLoonaizei1

Changed password for user logstash_system
PASSWORD logstash_system = AhC2kue5eeR4eK1LeeZa

Changed password for user beats_system
PASSWORD beats_system = reeyu8ooj8Eebee5ni2c

Changed password for user remote_monitoring_user
PASSWORD remote_monitoring_user = aeshahx9Ohkoph3rai6a

Changed password for user elastic
PASSWORD elastic = beiPhei4xu5iXailocei

No errors, and the passwords are set successfully.