how to skip mysqlbinlog reading error – ERROR: Found out of order GTID

main menu
MariaDB select gtid_strict_mode

When trying to read the MariaDB binlog files, the reading may be interrupted with an error message, and the reading will stop:

ERROR: Found out of order GTID. Got 0-3-855835750755 after 0-1-855835750756

Some positions are shown but after 24 lines of 1236849 file, the mariadb-binlog / mysqlbinlog program interrupts with an error.

It turns out the problem is that the mariadb-binlog / mysqlbinlog reads the binlog files with GTID strict mode, because it is enabled by default! But the servers may have turned it off, so the binlog files are in the wrong format for the strict mode reading. The mode to read the binlog files should be the same as the MySQL / MariaDB server mode that created the binlog files.

root@srv binlog # mariadb-binlog  mysql-bin.52349 
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#230820 10:53:45 server id 1  end_log_pos 256 CRC32 0xffa88ac1        Start: binlog v 4, server v 11.0.2-MariaDB-1:11.0.2+maria~ubu2204-log created 230820 10:53:45
BINLOG '
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAEEwwwwwwwwwwwwww
'/*!*/;
# at 256
#230820 10:53:45 server id 1  end_log_pos 379 CRC32 0xd434c2c6        Gtid list [0-5-76636859354,
# 0-4-847593757253,
# 0-2-857258567526,
# 0-3-855835750755,
# 0-1-855835750756,
# 1-4-8637504]
# at 379
#230820 10:53:45 server id 1  end_log_pos 423 CRC32 0xc23ef7af        Binlog checkpoint mysql-bin.52348
ERROR: Found out of order GTID. Got 0-3-855835750755 after 0-1-855835750756

Show the slave status to determine, which mode is in use:

root@srv binlog # mysql
mysql: Deprecated program name. It will be removed in a future release, use '/usr/bin/mariadb' instead
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 371408455
Server version: 11.0.2-MariaDB-1:11.0.2+maria~ubu2204-log mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> select @@gtid_strict_mode;
+--------------------+
| @@gtid_strict_mode |
+--------------------+
|                  0 |
+--------------------+
1 row in set (0.000 sec)

MariaDB [(none)]>

When the binlog files are generated with gtid_strict_mode = 0 the “–skip-gtid-strict-mode” option should be used with mariadb-binlog / mysqlbinlog to read the binlog files without errors. More on gtid_strict_mode for MariaDB – https://mariadb.com/docs/skysql-dbaas/ref/mdb/system-variables/gtid_strict_mode/.
Related errors GTID slave position – MariaDB/MySQL replication error – Error during XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos. More on MariaDB on this site with tag mariadb tag.

How To Install Linux, Apache, MySQL (MariaDB), PHP-FPM (LAMP) Stack on CentOS Stream 9

main menu
PHP Version 8.0.20

This article describes how to install a Web server with application back-end PHP and database back-end MySQL using MariaDB. In continuing the same topic, but with different software from the previous article – How To Install Linux, Nginx, MySQL (MariaDB), PHP-FPM (LEMP) Stack on CentOS Stream 9, where the Web server is Nginx with application back-end PHP-FPM, which is a sort of CGI (FastCGI). In this article, the Web server is Apache and the application is again PHP-FPM, because since the CentOS 8 the Apache mod_php is deprecated.
All the software installed throughout this article is from the CentOS Stream 9 official repositories including the EPEL repository. The machine is installed with a minimal installation of CentOS Stream 9 and there is a how-to here – Network installation of CentOS Stream 9 (20220606.0) – minimal server installation.
Here are the steps to perform:

  1. Install, configure and start the database MariaDB.
  2. Install, configure and start the PHP-FPM and PHP cli.
  3. Install, configure and start the Web server Apache 2.x.
  4. Configure the system – firewall and SELinux.
  5. Test the installation with a phpMyAdmin installation.
  6. Bonus – Apache HTTPS with SSL certificate – self-signed and letsencrypt.

STEP 1) Install, configure and start the database MariaDB.

First, install the MariaDB server by:

dnf install -y mariadb-server

To configure the MariaDB server, the main file is /etc/my.cnf, which just includes all files under the folder /etc/my.cnf.d/

[root@srv ~]# cat /etc/my.cnf
#
# This group is read both both by the client and the server
# use it for options that affect everything
#
[client-server]

#
# include all files from the config directory
#
!includedir /etc/my.cnf.d

[root@srv ~]# ls -altr /etc/my.cnf.d/
total 32
-rw-r--r--.  1 root root  295 Mar 25  2022 client.cnf
-rw-r--r--.  1 root root  120 May 18 07:55 spider.cnf
-rw-r--r--.  1 root root  232 May 18 07:55 mysql-clients.cnf
-rw-r--r--.  1 root root  763 May 18 07:55 enable_encryption.preset
-rw-r--r--.  1 root root 1458 Jun 13 13:24 mariadb-server.cnf
-rw-r--r--.  1 root root   42 Jun 13 13:29 auth_gssapi.cnf
drwxr-xr-x.  2 root root 4096 Oct  6 06:34 .
drwxr-xr-x. 81 root root 4096 Oct  6 06:34 ..

The most important file for the MariaDB server is /etc/my.cnf.d/mariadb-server.cnf, where all the server options are included. Under section “[mysqld]” add options to tune the MariaDB server. Supported options could be found here: https://mariadb.com/kb/en/mysqld-options/
Add the following options under “[mysqld]” in /etc/my.cnf.d/mariadb-server.cnf
Keep on reading!

How To Install Linux, Nginx, MySQL (MariaDB), PHP-FPM (LEMP) Stack on CentOS Stream 9

main menu
dnf mariadb

This article presents how to install a Web server with application back-end PHP and database back-end MySQL using MariaDB. All the software installed throughout this article is from the CentOS Stream 9 official repositories including the EPEL repository. The machine is installed with a minimal installation of CentOS Stream 9 and there is a how-to here – Network installation of CentOS Stream 9 (20220606.0) – minimal server installation.
Here are the steps to perform:

  1. Install, configure and start the database MariaDB.
  2. Install, configure and start the PHP-FPM and PHP cli.
  3. Install, configure and start the Web server Nginx.
  4. Configure the system – firewall and SELinux.
  5. Test the installation with a phpMyAdmin installation.
  6. Bonus – Nginx HTTPS with SSL certificate – self-signed and letsencrypt.

STEP 1) Install, configure and start the database MariaDB.

First, install the MariaDB server by:

dnf install -y mariadb-server

To configure the MariaDB server, the main file is /etc/my.cnf, which just includes all files under the folder /etc/my.cnf.d/

[root@srv ~]# cat /etc/my.cnf
#
# This group is read both both by the client and the server
# use it for options that affect everything
#
[client-server]

#
# include all files from the config directory
#
!includedir /etc/my.cnf.d

[root@srv ~]# ls -altr /etc/my.cnf.d/
total 32
-rw-r--r--.  1 root root  295 Mar 25  2022 client.cnf
-rw-r--r--.  1 root root  120 May 18 07:55 spider.cnf
-rw-r--r--.  1 root root  232 May 18 07:55 mysql-clients.cnf
-rw-r--r--.  1 root root  763 May 18 07:55 enable_encryption.preset
-rw-r--r--.  1 root root 1458 Jun 13 13:24 mariadb-server.cnf
-rw-r--r--.  1 root root   42 Jun 13 13:29 auth_gssapi.cnf
drwxr-xr-x.  2 root root 4096 Oct  6 06:34 .
drwxr-xr-x. 81 root root 4096 Oct  6 06:34 ..

The most important file for the MariaDB server is /etc/my.cnf.d/mariadb-server.cnf, where all the server options are included. Under section “[mysqld]” add options to tune the MariaDB server. Supported options could be found here: https://mariadb.com/kb/en/mysqld-options/
Add the following options under “[mysqld]” in /etc/my.cnf.d/mariadb-server.cnf
Keep on reading!

MariaDB/MySQL replication error – Error during XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos

When in aggressive parallel mode MariaDB/MySQL replication could fail with:

Last_Errno: 1942
Last_Error: Error during XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos: 1062: Duplicate entry '0-46158188501' for key 'PRIMARY'

This table is used for tracking the replication process and you might probably just do:

STOP/START SLAVE i.e. restart the replication and it would continue without errors.

MariaDB [(none)]> STOP SLAVE;
Query OK, 0 rows affected (0.08 sec)

MariaDB [(none)]> START SLAVE;
Query OK, 0 rows affected (0.00 sec)

Optimistic or aggressive mode runs conflicting transactions in parallel and it sometimes happens to roll back. In our case probably something happened and the rollback failed and STOP/START saved the replication.

* Additional thoughts

If you try STOP/START and you get the same error, probably it worth trying truncating the table “mysql.gtid_slave_pos” if you do not use GTID Replication feature (the “show slave status” says “Using_Gtid: No”). And even if you use “Using_Gtid: No” you could probably always stop the replication, “change master” to use the old style and start again? Probably switching off the aggressive mode might help, too!
Keep on reading!

Replication hangs with Relay_Master_Log_File mysql-bin.999999 -wrong variables in mariadb (MySQL) show slave status

Several days after another .999999 hang out – mariadb (MySQL) stopped flushing relay-bin log after mysqld-relay-bin.999999, the monitoring of one of the slaves got critical with replication delayed. Then several hours it kept delaying without any apparent reason.
The slave status was weird, the

  • “Relay_Master_Log_File: mysql-bin.999999” and
  • Exec_Master_Log_Pos: 104858214” were not changing,

which is essential for the slave server! In fact without these two values we do not know the real position, which is executed in the slave! And we cannot recover the slave if anything happened.
Keep on reading!

mariadb mysql slave cannot stop and then crashes on start up with segmentation fault or abort

Let’s show you what means

“That’s specifics of InnoDB data storage.”

Here is the what this specific case will teach you:

  1. basic debug with strace
  2. MySQL could hangs in a infinite loop during a shutdown (could be seen with strace)
  3. InnoDB files could get corrupted without a hardware issue, but with a kill -KILL or probably it was corrupted before the shutdown and the kill?
  4. list the opened files (and the IDs = file descriptors) of a mysql process
  5. start a mysql slave without a starting the replication on start with “skip-slave-start”
  6. Remove a corrupted innodb file, which causes database (mysql process) crash leaving your with no database at all – replace a innodb file, start the server, then drop the innodb table with sql command and recreate it to continue using healthy table
  7. 2 rows (4 INTEGER columns) could eat up a lot of space – 12G and probably infinite! === “That’s specifics of InnoDB data storage.”
    This table file is 12 Gigabytes in size!!!

    MariaDB [mysql]> select * from gtid_slave_pos;
    +-----------+----------+-----------+-------------+
    | domain_id | sub_id   | server_id | seq_no      |
    +-----------+----------+-----------+-------------+
    |         0 | 16983943 |       101 | 45790450502 |
    |         0 | 16983944 |       101 | 45790450503 |
    +-----------+----------+-----------+-------------+
    2 rows in set (0.00 sec)
    

Here is presented a specific case with replication, but in your case you may not use replication, your problem table could be another (and your mariadb/mysql server crashes on start up or selecting from a specific table or on shutdown?), so find the problem table and remove it, here we show you how to do it! BACKUP the mysql datadir BEFORE any intervention!

Here we have a situation: a mariadb (mysql) server running as slave to a really busy master server, so you could expect 10 000 update/insert/delete queries. Everything was working till the time we wanted to shutdown the mysql process, which occurred to be impossible.

STEP 1) We tried everything from “systemctl stop mysql” to kill -TERM multiple times

5 hour the mysql process was running with 2000 opened file descriptors to multiple table files. The strace showed this:

[pid 14824] <... io_getevents resumed> []{0, 500000000}) = 0
[pid 14815] <... io_getevents resumed> []{0, 500000000}) = 0
[pid 14824] io_getevents(139723876253696, 1, 256,  <unfinished ...>
[pid 14815] io_getevents(139723876356096, 1, 256,  <unfinished ...>
[pid 14825] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid 14825] futex(0x55d16885deb0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14825] futex(0x55d16885dedc, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 3576261, {1525268239, 312230000}, ffffffff <unfinished ...>
[pid 14852] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid 14852] futex(0x55d16885dd30, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14852] futex(0x55d16885dd5c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2775717, {1525268240, 308225000}, ffffffff <unfinished ...>
[pid 14825] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid 14825] futex(0x55d16885deb0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14825] futex(0x55d16885dedc, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 3576263, {1525268240, 304889000}, ffffffff <unfinished ...>
[pid 14862] <... nanosleep resumed> NULL) = 0
[pid 14862] nanosleep({1, 0},  <unfinished ...>
[pid 14821] <... io_getevents resumed> []{0, 500000000}) = 0
[pid 14821] io_getevents(139723876315136, 1, 256,  <unfinished ...>
[pid 14820] <... io_getevents resumed> []{0, 500000000}) = 0

And this has been repeating many times for hours without any disk activity on flushing any IO…so no use to wait for something, which apparently won’t finish at all.

STEP 2) So a

kill -9

was used to stop the mysql process. What could go wrong??? InnoDB is awesome and cannot corrupt if the hardware is OK, right? Yeahhh right…
When the start command was executed, the mysql process started, the innodb engine recovery completed successfully and after 5 minutes without listening socket and heavy IO reading there is the segmentation fault crash and YOU have no database….
So here is one of the crashes taken from the log:

2018-05-02 19:51:54 139990018914496 [ERROR] Plugin 'innodb' already installed
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Uses event mutexes
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Compressed tables use zlib 1.2.8
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Using Linux native AIO
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Number of pools: 1
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Using SSE2 crc32 instructions
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Initializing buffer pool, total size = 64G, instances = 8, chunk size = 128M
2018-05-02 19:51:56 139990018914496 [Note] InnoDB: Completed initialization of buffer pool
2018-05-02 19:51:56 139913709942528 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2018-05-02 19:51:56 139990018914496 [Note] InnoDB: Highest supported file format is Barracuda.
2018-05-02 19:51:56 139990018914496 [Note] InnoDB: Starting crash recovery from checkpoint LSN=311205983427362
2018-05-02 19:51:57 139990018914496 [Note] InnoDB: Last binlog file '/var/log/mysql-binlog/mysql-bin.227009', position 38590879
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: 128 out of 128 rollback segments are active.
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: Creating shared tablespace for temporary tables
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: Waiting for purge to start
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: 5.7.21 started; log sequence number 311205983427371
2018-05-02 19:52:12 139915446597376 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2018-05-02 19:52:12 139990018914496 [Note] Plugin 'FEEDBACK' is disabled.
2018-05-02 19:52:12 139990018914496 [Note] Recovering after a crash using tc.log
2018-05-02 19:52:12 139990018914496 [Note] Starting crash recovery...
2018-05-02 19:52:12 139990018914496 [Note] Crash recovery finished.
2018-05-02 19:52:12 139990018914496 [Note] Server socket created on IP: '0.0.0.0'.
2018-05-02 19:52:12 139990018914496 [Warning] 'user' entry 'root@srvdns2' ignored in --skip-name-resolve mode.
2018-05-02 19:52:12 139990018914496 [Warning] 'proxies_priv' entry '@% root@srvdns1' ignored in --skip-name-resolve mode.
2018-05-02 19:53:52 139915446597376 [Note] InnoDB: Buffer pool(s) load completed at 180502 19:53:52
180502 19:57:12 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

Server version: 10.2.13-MariaDB-10.2.13+maria~xenial
key_buffer_size=2147483648
read_buffer_size=262144
max_used_connections=0
max_threads=10002
thread_count=6
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6871600 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f40700009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f515c4fecf8 thread_stack 0x49000
*** buffer overflow detected ***: /usr/sbin/mysqld terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f51f54eb7e5]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f51f558d15c]
/lib/x86_64-linux-gnu/libc.so.6(+0x117160)[0x7f51f558b160]
/lib/x86_64-linux-gnu/libc.so.6(+0x1190a7)[0x7f51f558d0a7]
/usr/sbin/mysqld(my_addr_resolve+0xde)[0x55a9e9f1832e]
/usr/sbin/mysqld(my_print_stacktrace+0x1e2)[0x55a9e9eff2e2]
/usr/sbin/mysqld(handle_fatal_signal+0x345)[0x55a9e9999b95]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f51f5eda390]
/lib/x86_64-linux-gnu/libc.so.6(+0x9f849)[0x7f51f5513849]
/usr/sbin/mysqld(insert_dynamic+0x2a)[0x55a9e9ed3a4a]
/usr/sbin/mysqld(_Z25rpl_load_gtid_slave_stateP3THD+0x424)[0x55a9e98c43a4]
/usr/sbin/mysqld(handle_slave_background+0xe7)[0x55a9e97834e7]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f51f5ed06ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f51f557b41d]
======= Memory map: ========
55a9e933e000-55a9ea423000 r-xp 00000000 08:03 148565                     /usr/sbin/mysqld
55a9ea622000-55a9ea6f4000 r--p 010e4000 08:03 148565                     /usr/sbin/mysqld
55a9ea6f4000-55a9ea7aa000 rw-p 011b6000 08:03 148565                     /usr/sbin/mysqld
55a9ea7aa000-55a9eb03d000 rw-p 00000000 00:00 0 
55a9eba46000-55aa4e67b000 rw-p 00000000 00:00 0                          [heap]

STEP 3) So the default way of repairing the InnoDB is to use

innodb_force_recovery

in your my.cnf configuration file:

[mysqld]
innodb_force_recovery=1

But again and again crashes even enabling all the options of innodb_force_recovery=1,2,3,4,5 and last 6. But when using “innodb_force_recovery=4” and 5 and 6 we have some strange additional error:

2018-05-02 21:43:34 139667439187712 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`gtid_slave_pos` in the cache. Attempting to load the tablespace with space id 63584
2018-05-02 21:43:34 139667439187712 [Warning] InnoDB: Allocated tablespace ID 63584 for mysql/gtid_slave_pos, old maximum was 0

The innodb_force_recovery did not help we still cannot start our MySQL database, but there were two things:

  1. No errors were reported by the InnoDB Engine – “InnoDB: Buffer pool(s) load completed” and Crash recovery always finished without errors
  2. The MySQL starts successfully, but not listening socket and then after 3~5 minutes of extensive IO reading from the disk by the mysql process it crashes

It was like something big was loading just on the start and in fact needed on the very beginning, so what start immediately after successful load of the mysql process?

REPLICATION

Remember this is a slave!

STEP 4) So trying to prevent the start of the replication on start of the mysql process with:

[mysqld]
skip-slave-start

STEP 5) strace and file descritor to find the offender table file.

The skip-slave-start option DID the trick – NO Segmentation fault

the mysql processes immediately after successful engines load began to listen on sockets. So the big offender was probably something connected with the replication! A big table used in the replication? How to find it, ok remove the “skip-slave-start”, then start the mysql process and wait for the time the IO read kicks in, then strace the mysql process (find the ID of the process with ps – the first number of the following command is the ID of the mysql process):

srv@local ~ # ps axuf|grep mysql|grep -v grep
mysql    6969  239  8.3 78561320 10983848 ?   Ssl  22:53 108:59 /usr/sbin/mysqld
srv@local ~ # strace -f -p 6969
strace -f -p 6969
strace: Process 6969 attached with 27 threads
[pid  6996] rt_sigtimedwait([HUP QUIT ALRM TERM TSTP], NULL, NULL, 8 <unfinished ...>
[pid  6994] futex(0x55941cb5d1ec, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  6995] futex(0x7f93c6de1d24, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  6993] restart_syscall(<... resuming interrupted nanosleep ...> <unfinished ...>
[pid  6992] futex(0x55941cb5d0ec, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  6991] futex(0x55941cb5cd6c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  6990] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6997] mremap(0x7f9298b76000, 2210828288, 2210836480, MREMAP_MAYMOVE <unfinished ...>
[pid  6989] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6988] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6986] futex(0x55947cfe941c, FUTEX_WAIT_PRIVATE, 33, NULL <unfinished ...>
[pid  6985] futex(0x55947cfe941c, FUTEX_WAIT_PRIVATE, 31, NULL <unfinished ...>
[pid  6984] futex(0x55947cfe941c, FUTEX_WAIT_PRIVATE, 32, NULL <unfinished ...>
[pid  6997] <... mremap resumed> )      = 0x7f9298b76000
[pid  6984] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid  6983] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6984] futex(0x55947cfe941c, FUTEX_WAIT_PRIVATE, 33, NULL <unfinished ...>
[pid  6982] io_getevents(140347217702912, 1, 256,  <unfinished ...>
[pid  6981] io_getevents(140347217723392, 1, 256,  <unfinished ...>
[pid  6985] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid  6980] io_getevents(140347217743872, 1, 256,  <unfinished ...>
[pid  6979] io_getevents(140347217764352, 1, 256,  <unfinished ...>
[pid  6978] io_getevents(140347217825792, 1, 256,  <unfinished ...>
[pid  6977] io_getevents(140347217846272, 1, 256,  <unfinished ...>
[pid  6985] futex(0x55947cfe941c, FUTEX_WAIT_PRIVATE, 33, NULL <unfinished ...>
[pid  6976] io_getevents(140347217866752, 1, 256,  <unfinished ...>
[pid  6975] io_getevents(140347219357696, 1, 256,  <unfinished ...>
[pid  6974] io_getevents(140347217784832, 1, 256,  <unfinished ...>
[pid  6973] io_getevents(140347217805312, 1, 256,  <unfinished ...>
[pid  6971] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6970] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6969] futex(0x55941a4be944, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  6997] pread64(38, "\201L\321\365\0\5\315\3\0\5\315\0\0\5\315\6\0\1\30\307o\303\365]E\277\0\0\0\0\0\0"..., 16384, 6228590592) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210836480, 2210844672, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\210\271\277\343\0\5\315\6\0\5\315\3\0\5\315\7\0\1\30\307o\343\362~E\277\0\0\0\0\0\0"..., 16384, 6228639744) = 16384
[pid  6997] pread64(38, "\305\17\303k\0\5\315\7\0\5\315\6\0\5\315\10\0\1\30\307o\377\224gE\277\0\0\0\0\0\0"..., 16384, 6228656128) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210844672, 2210852864, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\217\351V~\0\5\315\10\0\5\315\7\0\5\315\n\0\1\30\307p\34\31\363E\277\0\0\0\0\0\0"..., 16384, 6228672512) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210852864, 2210861056, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\367\321\330\232\0\5\315\n\0\5\315\10\0\5\315\v\0\1\30\307p>\316\360E\277\0\0\0\0\0\0"..., 16384, 6228705280) = 16384
[pid  6997] pread64(38, "\374v\202\177\0\5\315\v\0\5\315\n\0\5\315\r\0\1\30\307p\\alE\277\0\0\0\0\0\0"..., 16384, 6228721664) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210861056, 2210869248, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "s\261\0A\0\5\315\r\0\5\315\v\0\5\315\16\0\1\30\307p\212\1AE\277\0\0\0\0\0\0"..., 16384, 6228754432) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210869248, 2210877440, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\214\t\7\244\0\5\315\16\0\5\315\r\0\5\315\21\0\1\30\307p\242H\37E\277\0\0\0\0\0\0"..., 16384, 6228770816) = 16384
[pid  6997] pread64(38, "<\n\272\"\0\5\315\21\0\5\315\16\0\5\315\24\0\1\30\307p\311i\313E\277\0\0\0\0\0\0"..., 16384, 6228819968) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210877440, 2210885632, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] mremap(0x7f9298b76000, 2210885632, 2210893824, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "n|I\n\0\5\315\24\0\5\315\21\0\5\315\25\0\1\30\307p\360EqE\277\0\0\0\0\0\0"..., 16384, 6228869120) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210893824, 2210902016, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\270\206\326\207\0\5\315\25\0\5\315\24\0\5\315\27\0\1\30\307q\1a\251E\277\0\0\0\0\0\0"..., 16384, 6228885504) = 16384
[pid  6997] pread64(38, "{l\226S\0\5\315\27\0\5\315\25\0\5\315\31\0\1\30\307q&\205!E\277\0\0\0\0\0\0"..., 16384, 6228918272) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210902016, 2210910208, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\201\265]&\0\5\315\31\0\5\315\27\0\5\315\32\0\1\30\307qK\365\212E\277\0\0\0\0\0\0"..., 16384, 6228951040) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210910208, 2210918400, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\352?\236\327\0\5\315\32\0\5\315\31\0\5\315\33\0\1\30\307qob:E\277\0\0\0\0\0\0"..., 16384, 6228967424) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210918400, 2210926592, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] mremap(0x7f9298b76000, 2210926592, 2210934784, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "3c\33\301\0\5\315\33\0\5\315\32\0\5\315\36\0\1\30\307q\2236%E\277\0\0\0\0\0\0"..., 16384, 6228983808) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210934784, 2210942976, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "Zi\343\177\0\5\315\36\0\5\315\33\0\5\315\37\0\1\30\307q\325\27LE\277\0\0\0\0\0\0"..., 16384, 6229032960) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210942976, 2210951168, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "v5h%\0\5\315\37\0\5\315\36\0\5\315!\0\1\30\307r\f\325\364E\277\0\0\0\0\0\0"..., 16384, 6229049344) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210951168, 2210959360, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "0\6Yn\0\5\315!\0\5\315\37\0\5\315\"\0\1\30\307sCX@E\277\0\0\0\0\0\0"..., 16384, 6229082112) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210959360, 2210967552, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\203\243K\203\0\5\315\"\0\5\315!\0\5\315$\0\1\30\307t;\234\302E\277\0\0\0\0\0\0"..., 16384, 6229098496) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210967552, 2210975744, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "9\264\332j\0\5\315$\0\5\315\"\0\5\315%\0\1\30\307\177\370#\371E\277\0\0\0\0\0\0"..., 16384, 6229131264) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210975744, 2210983936, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, ":\200+\337\0\5\315%\0\5\315$\0\5\315&\0\1\30\307\200\20U-E\277\0\0\0\0\0\0"..., 16384, 6229147648) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210983936, 2210992128, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\35-th\0\5\315&\0\5\315%\0\5\315(\0\1\30\307\200+\333CE\277\0\0\0\0\0\0"..., 16384, 6229164032) = 16384
[pid  6997] pread64(38, "\246Y\305\351\0\5\315(\0\5\315&\0\5\315)\0\1\30\307\200[\324\305E\277\0\0\0\0\0\0"..., 16384, 6229196800) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210992128, 2211000320, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\340(\2\350\0\5\315)\0\5\315(\0\5\315+\0\1\30\307\200\214L\365E\277\0\0\0\0\0\0"..., 16384, 6229213184) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211000320, 2211008512, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "A\374\37\23\0\5\315+\0\5\315)\0\5\315,\0\1\30\307\200\252\30\373E\277\0\0\0\0\0\0"..., 16384, 6229245952) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211008512, 2211016704, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "iW\274\365\0\5\315,\0\5\315+\0\5\315.\0\1\30\307\200\332\266\256E\277\0\0\0\0\0\0"..., 16384, 6229262336) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211016704, 2211024896, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\343c\340\347\0\5\315.\0\5\315,\0\5\315/\0\1\30\307\200\356\272YE\277\0\0\0\0\0\0"..., 16384, 6229295104) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211024896, 2211033088, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "}\35\32i\0\5\315/\0\5\315.\0\5\3150\0\1\30\307\201\5\226\256E\277\0\0\0\0\0\0"..., 16384, 6229311488) = 16384
[pid  6997] pread64(38, "\322\237\206\377\0\5\3150\0\5\315/\0\5\3151\0\1\30\307\201\257\241\272E\277\0\0\0\0\0\0"..., 16384, 6229327872) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211033088, 2211041280, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "/\250\21!\0\5\3151\0\5\3150\0\5\3152\0\1\30\307\203\3\237\2E\277\0\0\0\0\0\0"..., 16384, 6229344256) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211041280, 2211049472, MREMAP_MAYMOVE) = 0x7f9298b76000

You see multiple lines (thousands) with:

[pid  6997] pread64(38, "/\250\21!\0\5\3151\0\5\3150\0\5\3152\0\1\30\307\203\3\237\2E\277\0\0\0\0\0\0"..., 16384, 6229344256) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211041280, 2211049472, MREMAP_MAYMOVE) = 0x7f9298b76000

The pread64 – reads from file descriptor with ID=38, then you can list all file descriptors for the mysql process ID with:

srv@local mysql # ls -altr /proc/6969/fd
total 0
dr-xr-xr-x 9 mysql mysql  0 May  2 21:32 ..
l-wx------ 1 root  root  64 May  2 21:35 2 -> /var/log/mysql/error.log
dr-x------ 2 root  root   0 May  2 21:35 .
lrwx------ 1 root  root  64 May  2 21:35 9 -> /tmp/ibgyw4AI (deleted)
lrwx------ 1 root  root  64 May  2 21:35 8 -> /tmp/ibuaprWP (deleted)
lrwx------ 1 root  root  64 May  2 21:35 7 -> /tmp/ibfXwsBW (deleted)
lrwx------ 1 root  root  64 May  2 21:35 6 -> /var/lib/mysql/ibdata1
lrwx------ 1 root  root  64 May  2 21:35 5 -> /var/lib/mysql/aria_log.00000001
lr-x------ 1 root  root  64 May  2 21:35 4 -> /var/lib/mysql
lrwx------ 1 root  root  64 May  2 21:35 38 -> /var/lib/mysql/mysql/gtid_slave_pos.ibd
lrwx------ 1 root  root  64 May  2 21:35 37 -> /var/lib/mysql/mysql/event.MYD
lrwx------ 1 root  root  64 May  2 21:35 36 -> /var/lib/mysql/mysql/event.MYI
lrwx------ 1 root  root  64 May  2 21:35 35 -> /var/lib/mysql/mysql/procs_priv.MYD
lrwx------ 1 root  root  64 May  2 21:35 34 -> /var/lib/mysql/mysql/procs_priv.MYI
lrwx------ 1 root  root  64 May  2 21:35 33 -> /var/lib/mysql/mysql/columns_priv.MYD
lrwx------ 1 root  root  64 May  2 21:35 32 -> /var/lib/mysql/mysql/columns_priv.MYI
lrwx------ 1 root  root  64 May  2 21:35 31 -> /var/lib/mysql/mysql/tables_priv.MYD
lrwx------ 1 root  root  64 May  2 21:35 30 -> /var/lib/mysql/mysql/tables_priv.MYI
lrwx------ 1 root  root  64 May  2 21:35 3 -> /var/lib/mysql/aria_log_control
lrwx------ 1 root  root  64 May  2 21:35 29 -> /var/lib/mysql/mysql/roles_mapping.MYD
lrwx------ 1 root  root  64 May  2 21:35 28 -> /var/lib/mysql/mysql/roles_mapping.MYI
lrwx------ 1 root  root  64 May  2 21:35 27 -> /var/lib/mysql/mysql/proxies_priv.MYD
lrwx------ 1 root  root  64 May  2 21:35 26 -> /var/lib/mysql/mysql/proxies_priv.MYI
lrwx------ 1 root  root  64 May  2 21:35 25 -> /var/lib/mysql/mysql/host.MYD
lrwx------ 1 root  root  64 May  2 21:35 24 -> /var/lib/mysql/mysql/host.MYI
lrwx------ 1 root  root  64 May  2 21:35 23 -> /var/lib/mysql/mysql/db.MYD
lrwx------ 1 root  root  64 May  2 21:35 22 -> /var/lib/mysql/mysql/db.MYI
lrwx------ 1 root  root  64 May  2 21:35 21 -> /var/lib/mysql/mysql/user.MYD
lrwx------ 1 root  root  64 May  2 21:35 20 -> /var/lib/mysql/mysql/user.MYI
lrwx------ 1 root  root  64 May  2 21:35 19 -> socket:[49519]
lrwx------ 1 root  root  64 May  2 21:35 18 -> socket:[49518]
lrwx------ 1 root  root  64 May  2 21:35 17 -> /var/lib/mysql/mysql/servers.MYD
lrwx------ 1 root  root  64 May  2 21:35 16 -> /var/lib/mysql/mysql/servers.MYI
lrwx------ 1 root  root  64 May  2 21:35 15 -> /var/lib/mysql/tc.log
lrwx------ 1 root  root  64 May  2 21:35 13 -> /tmp/ib281FZH (deleted)
lrwx------ 1 root  root  64 May  2 21:35 12 -> /var/lib/mysql/ibtmp1
lrwx------ 1 root  root  64 May  2 21:35 11 -> /var/lib/mysql/ib_logfile1
lrwx------ 1 root  root  64 May  2 21:35 10 -> /var/lib/mysql/ib_logfile0
l-wx------ 1 root  root  64 May  2 21:35 1 -> /var/log/mysql/error.log
lrwx------ 1 root  root  64 May  2 21:35 0 -> /dev/pts/3

And again file descriptor with ID=38 is:

/var/lib/mysql/mysql/gtid_slave_pos.ibd

And when you check the size – 12G…..12G for this table? Why? What is used for?

STEP 6) And from the manual:

The mysql.gtid_slave_pos table is used in replication by slave servers to keep track of their current position (the global transaction ID of the last transaction applied). Using the table allows the slave to maintain a consistent value for the gtid_slave_pos system variable across server restarts. See Global Transaction ID.

You should never attempt to modify the table directly. If you do need to change the global gtid_slave_pos value, use SET GLOBAL gtid_slave_pos = ... instead.

The table is updated with the new position as part of each transaction committed during replication. This makes it preferable that the table is using the same storage engine as the tables otherwise being modified in the transaction, since otherwise a multi-engine transaction is needed that can reduce performance.

Hmm “The table is updated with the new position as part of each transaction committed during replication” and 12G table – could it track minillions of transactions? Probably not, in fact this table has only 2-3-4 rows at a given time!!! But the size is 12G, well as we said this is:
“That’s specifics of InnoDB data storage. Did you try to run OPTIMIZE TABLE mysql.gtid_slave_pos? It should allow to reclaim the disk space.” (taken from: https://jira.mariadb.org/browse/MDEV-12318).

STEP 7) Remove a corrupted innodb file, which causes database (mysql process) crash leaving your with no database at all and then recreate the table

Restart your mysql server with “skip-slave-start” again to be able to start it (look above).
So a table probably with couple of rows takes 12G and MariaDB is using it for the replication on start up, but the replication position is kept on another place “master.info”, can we delete this offender file “gtid_slave_pos.ibd”? Yes we can, move the file out of its place (mysql datadir) and then create a second table with:

MariaDB [mysql]> use mysql
MariaDB [mysql]> CREATE TABLE `gtid_slave_pos1` (
    ->   `domain_id` int(10) unsigned NOT NULL,
    ->   `sub_id` bigint(20) unsigned NOT NULL,
    ->   `server_id` int(10) unsigned NOT NULL,
    ->   `seq_no` bigint(20) unsigned NOT NULL,
    ->   PRIMARY KEY (`domain_id`,`sub_id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Replication slave GTID position';
Query OK, 0 rows affected (0.01 sec)

Copy gtid_slave_pos1.ibd to gtid_slave_pos.ibd and restart the mysql process, it will report an error for a table mysql.gtid_slave_pos but you will be able to drop the table and then create it with the same name: “gtid_slave_pos” (you could drop the temporary one “gtid_slave_pos1”)

MariaDB [mysql]> use mysql
MariaDB [mysql]> DROP TABLE `gtid_slave_pos`;
Query OK, 0 rows affected (0.01 sec)
MariaDB [mysql]> CREATE TABLE `gtid_slave_pos` (
    ->   `domain_id` int(10) unsigned NOT NULL,
    ->   `sub_id` bigint(20) unsigned NOT NULL,
    ->   `server_id` int(10) unsigned NOT NULL,
    ->   `seq_no` bigint(20) unsigned NOT NULL,
    ->   PRIMARY KEY (`domain_id`,`sub_id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Replication slave GTID position';
Query OK, 0 rows affected (0.01 sec)
MariaDB [mysql]> DROP TABLE `gtid_slave_pos1`;
Query OK, 0 rows affected (0.01 sec)

The create table statement is taken from another server probably it is a good idea to do it yourself, login on a healthy mysql server and issue: “show create table mysql.gtid_slave_pos;”.
So now you have a healthy mysql.gtid_slave_posto be used for your replication. Restart your mysql server removing “skip-slave-start” from your configuration file and here it is the replication is OK and running:

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.10.10.10
                  Master_User: replusr
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.071301
          Read_Master_Log_Pos: 64980976
               Relay_Log_File: mysqld-relay-bin.230012
                Relay_Log_Pos: 80704376
        Relay_Master_Log_File: mysql-bin.090129
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: test
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 80704077
              Relay_Log_Space: 113209377078
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 30944
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 101
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: conservative
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Waiting for room in worker thread event queue
1 row in set (0.00 sec)