Kibana server is not ready yet – and Waiting for that migration to complete in the logs

Now, living in the cloud and big data there is a time when the admin may need to save all their logs in a central place! Elasticsearch and Kibana look good for the job! And after months of hassle-free work of the Elasticsearch and Kibana, Elasticsearch just stopped working and after a restart and upgrade (Elasticsearch and Kibana) Kibana showed an error message:

Kibana server is not ready yet

And if you have tried the STOP/START of Kibana and Elasticsearch and Kibana would still show the above message here is what you should do:

  1. Check if the two services are running! Kibana and Elasticsearch, if some of them is missing start it.
  2. Search for the logs and especially Elasticsearch logs. The first place to check is systemd logs with journalctl program (systemctl status also will point out the problem showing last lines of the logs).
  3. Look for the last lines and if they include

    Another Kibana instance appears to be migrating the index

    This article is probably the right place to solve the issue and start your setup successful.

STEP 1) Running services and analyzing the logs.

If kibana and elastisearch use systemd system to operate it is easy to access the logs with systemctl and journalctl
Check whether the kibana and elasticsearch are running with:

[root@loganalyzer ~]# ps ax|grep elasticsearch|grep -v grep
  258 ?        Ssl  836:31 /usr/share/elasticsearch/jdk/bin/java -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.locale.providers=COMPAT -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.io.tmpdir=/tmp/elasticsearch-13303119363353782625 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/elasticsearch -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m -XX:MaxDirectMemorySize=536870912 -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/etc/elasticsearch -Des.distribution.flavor=default -Des.distribution.type=rpm -Des.bundled_jdk=true -cp /usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch -p /var/run/elasticsearch/elasticsearch.pid --quiet
  360 ?        Sl     0:00 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller
[root@loganalyzer ~]# ps ax|grep kibana|grep -v grep
 1284 ?        Ssl    4:32 /usr/share/kibana/bin/../node/bin/node /usr/share/kibana/bin/../src/cli -c /etc/kibana/kibana.yml

If one of the two services are missing you must start it! And second, each service should have only one instance (i.e. process)!
And check the kibana logs with journalctl

[root@loganalyzer ~]# journalctl -u kibana.service
.....
.....
Apr 24 23:09:31 loganalyzer kibana[1219]: {"type":"log","@timestamp":"2020-04-24T23:09:31Z","tags":["info","plugins","bfetch"],"pid":1219,"message":"Setting up plugin"}
Apr 24 23:09:31 loganalyzer kibana[1219]: {"type":"log","@timestamp":"2020-04-24T23:09:31Z","tags":["info","savedobjects-service"],"pid":1219,"message":"Waiting until all Elasticsearch nodes
 are compatible with Kibana before starting saved objects migrations..."}
Apr 24 23:09:31 loganalyzer kibana[1219]: {"type":"log","@timestamp":"2020-04-24T23:09:31Z","tags":["info","savedobjects-service"],"pid":1219,"message":"Starting saved objects migrations"}
Apr 24 23:09:31 loganalyzer kibana[1219]: {"type":"log","@timestamp":"2020-04-24T23:09:31Z","tags":["info","savedobjects-service"],"pid":1219,"message":"Creating index .kibana_task_manager_2
."}
Apr 24 23:09:31 loganalyzer kibana[1219]: {"type":"log","@timestamp":"2020-04-24T23:09:31Z","tags":["warning","savedobjects-service"],"pid":1219,"message":"Unable to connect to Elasticsearch
. Error: [resource_already_exists_exception] index [.kibana_task_manager_2/O070AunfSyG6hwd6_pqqRA] already exists, with { index_uuid=\"O070AunfSyG6hwd6_pqqRA\" & index=\".kibana_task_manager
_2\" }"}
Apr 24 23:09:31 loganalyzer kibana[1219]: {"type":"log","@timestamp":"2020-04-24T23:09:31Z","tags":["warning","savedobjects-service"],"pid":1219,"message":"Another Kibana instance appears to
 be migrating the index. Waiting for that migration to complete. If no other Kibana instance is attempting migrations, you can get past this message by deleting index .kibana_task_manager_2 
and restarting Kibana."}

The systemctl status may be used, too. The error and the index are shown in the last lines of the status output – look below.

The problem here is there was a migration of index .kibana_task_manager_2, but it was abandoned because of unknown reason and now we should delete it to be able to use our kibana service. The index name might be with another name but it is the same problem.

STEP 2) Delete kibana index

Delete the kibana index in elasticsearch backend using curl and HTTP/HTTPS request such as:

[root@loganalyzer kibana]# curl -XDELETE http://192.168.0.2:9200/.kibana_task_manager_2
{"acknowledged":true}

Keep on reading!

systemd service freezes in activating (start-post) status – mysqld or other services

We’ve experienced this with the MySQL server under CentOS 7, but you can have this state with other services!
After updating our MySQL we tried to start it up, but the service got this strange state after “systemctl start” returned:

[root@mysql2 ~]# systemctl start mysqld
Job for mysqld.service failed because a timeout was exceeded. See "systemctl status mysqld.service" and "journalctl -xe" for details.

The timeout is big it’s something like 5 to 10 minutes and so it is typical (do not do it!) to type “ctrl+c” and you end up without this message and a strange state of the mysql:

[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: activating (start-post) since Fri 2018-11-09 09:00:55 UTC; 6min ago
  Process: 8333 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
  Process: 8321 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 8333 (code=exited, status=0/SUCCESS);         : 8334 (mysql-systemd-s)
   CGroup: /user.slice/user-0.slice/session-2395.scope/system.slice/mysqld.service
           └─control
             ├─ 8334 /bin/bash /usr/bin/mysql-systemd-start post
             └─10152 sleep 1

Nov 09 09:00:55 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 09:00:56 mysql2.mytv.bg mysqld_safe[8333]: 181109 09:00:56 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 09:00:56 mysql2.mytv.bg mysqld_safe[8333]: 181109 09:00:56 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

Meanwhile with “pstree”:

[root@mysql2 ~]# pstree
systemd─┬─agetty
        ├─crond
        ├─dbus-daemon
        ├─mysql-systemd-s───sleep
        ├─rsyslogd───2*[{rsyslogd}]
        ├─sshd─┬─sshd───bash───systemctl─┬─systemctl
        │      │                         └─systemd-tty-ask
        │      └─sshd───bash───pstree
        ├─systemd-journal
        └─systemd-logind

So as you can see no mysqld process! Apparently systemctl had tried to start MySQL server process and it failed.
So the first thing to do was to check the MySQL logs. In our case it was a obsolete option in my.cnf:

2018-11-09 09:10:57 11384 [ERROR] /usr/sbin/mysqld: unknown variable 'default-character-set=utf8'
2018-11-09 09:10:57 11384 [ERROR] Aborting

The interesting part is that

the service got “Active: activating (start-post)” and when you fix the problem you cannot “systemctl start mysqld” it just start to wait for the current timeout.

In fact this state means “I’m trying to start the service…” and it is in an endless loop to start the service and if you the service has a big start timeout like 5-10 minutes you must wait for the next iteration of the loop to start the service successfully (if you fixed the problem!). And if you want not to wait you must execute first stop to the service and then start – you’ll not wait for any timeout and you can check immediately if the service was started successfully:

[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: activating (start-post) since Fri 2018-11-09 09:20:56 UTC; 2min 50s ago
  Process: 13208 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
  Process: 13196 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 13208 (code=exited, status=0/SUCCESS);         : 13209 (mysql-systemd-s)
   CGroup: /user.slice/user-0.slice/session-2395.scope/system.slice/mysqld.service
           └─control
             ├─13209 /bin/bash /usr/bin/mysql-systemd-start post
             └─14357 sleep 1

Nov 09 09:20:56 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 09:20:56 mysql2.mytv.bg mysqld_safe[13208]: 181109 09:20:56 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 09:20:56 mysql2.mytv.bg mysqld_safe[13208]: 181109 09:20:56 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
[root@mysql2 ~]# systemctl start mysqld
Job for mysqld.service failed because a timeout was exceeded. See "systemctl status mysqld.service" and "journalctl -xe" for details.
[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2018-11-09 09:30:59 UTC; 2s ago
  Process: 15656 ExecStartPost=/usr/bin/mysql-systemd-start post (code=exited, status=0/SUCCESS)
  Process: 15643 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 15655 (mysqld_safe)
   CGroup: /user.slice/user-0.slice/session-2395.scope/system.slice/mysqld.service
           ├─15655 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
           └─16243 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=/var/log/mysqld.log --open-files-limit=10000...

Nov 09 09:30:56 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 09:30:57 mysql2.mytv.bg mysqld_safe[15655]: 181109 09:30:57 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 09:30:57 mysql2.mytv.bg mysqld_safe[15655]: 181109 09:30:57 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Nov 09 09:30:59 mysql2.mytv.bg systemd[1]: Started MySQL Community Server.

As you can see we even received error again that the service cannot be started and immediately after that the service status is in normal “active (running)” state! And we waited for around 10 minutes! You can see the times in the logs above.
So to summarize it up:

If you have a service in “activating (start-post)” the service cannot be started because of an error, check and fix the problem and then issue “stop and start”:

[root@mysql2 ~]# systemctl start mysqld
Job for mysqld.service failed because a timeout was exceeded. See "systemctl status mysqld.service" and "journalctl -xe" for details.
[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: activating (start-post) since Fri 2018-11-09 10:05:20 UTC; 2min 17s ago
  Process: 23601 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
  Process: 23589 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 23601 (code=exited, status=0/SUCCESS);         : 23602 (mysql-systemd-s)
   CGroup: /user.slice/user-0.slice/session-2395.scope/system.slice/mysqld.service
           └─control
             ├─23602 /bin/bash /usr/bin/mysql-systemd-start post
             └─24646 sleep 1

Nov 09 10:05:20 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 10:05:21 mysql2.mytv.bg mysqld_safe[23601]: 181109 10:05:21 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 10:05:21 mysql2.mytv.bg mysqld_safe[23601]: 181109 10:05:21 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
[root@mysql2 ~]# systemctl stop mysqld
[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Fri 2018-11-09 10:07:52 UTC; 4s ago
  Process: 23602 ExecStartPost=/usr/bin/mysql-systemd-start post (code=killed, signal=TERM)
  Process: 23601 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
  Process: 23589 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 23601 (code=exited, status=0/SUCCESS)

Nov 09 10:05:20 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 10:05:21 mysql2.mytv.bg mysqld_safe[23601]: 181109 10:05:21 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 10:05:21 mysql2.mytv.bg mysqld_safe[23601]: 181109 10:05:21 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Nov 09 10:07:52 mysql2.mytv.bg systemd[1]: Stopped MySQL Community Server.
[root@mysql2 ~]# systemctl start mysqld
[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2018-11-09 10:08:06 UTC; 3s ago
  Process: 24711 ExecStartPost=/usr/bin/mysql-systemd-start post (code=exited, status=0/SUCCESS)
  Process: 24698 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 24710 (mysqld_safe)
   CGroup: /user.slice/user-0.slice/session-2395.scope/system.slice/mysqld.service
           ├─24710 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
           └─25298 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=/var/log/mysqld.log --open-files-limit=10000...

Nov 09 10:08:04 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 10:08:04 mysql2.mytv.bg mysqld_safe[24710]: 181109 10:08:04 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 10:08:04 mysql2.mytv.bg mysqld_safe[24710]: 181109 10:08:04 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Nov 09 10:08:06 mysql2.mytv.bg systemd[1]: Started MySQL Community Server.