The impact of enabling MySQL sync_binlog – really high disk IO

If you enable this feature in your MySQL you could

increase your disk IO time and write by 8-10x times.

Generally, this feature could save your replication scheme if a power failure occurs or OS crash and it could guarantee that no transaction is lost from the binary log. When enabled the binary log is synchronized on disk before transactions are committed. You can check the manual here: https://dev.mysql.com/doc/refman/5.7/en/replication-options-binary-log.html#sysvar_sync_binlog and it also says there could be a great impact on disk writes but how many?
So here are two setups:

SETUP 1) 2 x 3T hard drives TOSHIBA DT01ACA300 in software RAID1

The impact of setting the sync_binlog=1 is 8-10 times the IO time and IO writes. Here is images of several hours of sync_binlog=1 and then we disabled it online:

SCREENSHOT 1) Enable the binary log synchronization with sync_binlog=1.

As you can see the increase in the disk IO time and disk write IOPS are significant – somewhere between 5 and 6 times more! The load is not increased more than 1.5x as normal, but it should be noted the server is off-peak and it has plenty of RAM 32G. Still mush load when some other IO appears.

main menu
Set MySQL sync_binlog=1 in a software raid of two hard drives

SCREENSHOT 2) Disabling the binary log synchronization with sync_binlog=0.

The decrease of the disk IO time and disk write IOPS are significant – somewhere between 5 and 6 times more! Everything back to normal.

main menu
Set MySQL sync_binlog=0 in a software raid of two hard drives.

SCREENSHOT 3) Enable the binary log synchronization with sync_binlog=1.

As you can see the increase in the disk IO time and disk write IOPS are significant – somewhere between 8 and 10 times more! The load is not increased, but it should be noted the server is off-peak and it has plenty of RAM 192G. Still mush load when some other IO appears.

SET GLOBAL sync_binlog=0;

SETUP 2) 2 x 960G SSD SAMSUNG SM863 in software RAID1

The impact of setting the sync_binlog=1 is also 8-10 times the IO time and IO writes. Here is images of several hours of sync_binlog=1 and then we disabled it online:

main menu
Set MySQL sync_binlog=1 in a software raid of two enterprise SSDs

SCREENSHOT 4) Disabling the binary log synchronization with sync_binlog=0.

The decrease of the disk IO time and disk write IOPS are significant – somewhere between 8 and 10 times more and even more! Everything back to normal.

main menu
Set MySQL sync_binlog=0 in a software raid of two enterprise SSDs

SCREENSHOT 5) Enable the binary log synchronization with sync_binlog=1.

Just the period of the graphs are bigger. As you can see the increase in the disk IO time and disk write IOPS are significant – somewhere between 8 and 10 times more! The load is not increased, but it should be noted the server is off-peak and it has plenty of RAM 192G. Still mush load when some other IO appears.

main menu
Set MySQL sync_binlog=1 in a software raid of two enterprise SSDs (big period)

SCREENSHOT 6) Disabling the binary log synchronization with sync_binlog=0.

Just the period of the graphs are bigger. The decrease of the disk IO time and disk write IOPS are significant – somewhere between 8 and 10 times more and even more! Everything back to normal.

main menu
Set MySQL sync_binlog=0 in a software raid of two enterprise SSDs (big period)

BONUS – MySQL changed the default value from 0 (disabled) to 1 (enabled) from 5.7 (in fact MySQL >= 5.7.7).

SO BE CAREFUL now when upgrading from older versions like MySQL 5.1, 5.5, 5.6 – you would probably need to disable it in the MySQL configuration file my.cnf.

Upgrade MySQL 5.6 to 5.7 what problems to expect with old my.cnf configuration file

Finally we do not have any more MySQL 5.6 servers. We upgraded our last part of the system with MySQL 5.6 to 5.7. In our opinion this upgrade is one of the major referred to MySQL configuration file my.cnf – multiple deprecated directives are removed in this new 5.7 version so when upgrading you should removed them before restarting or starting the new version if you want to have running MySQL server instance.
Keep in mind our my.cnf are old, they are created with MySQL 5.0 and they are edited in every upgrade to a new version (5.0 to 5.1, 5.1 to 5.5 and 5.5 to 5.6) and when we needed a specific optimization for our work load. And this is only for our configuration, there surely are more deprecated/removed variables in the new version. Here is a good starting point – https://dev.mysql.com/doc/refman/5.7/en/upgrading-from-previous-series.html This article is not how to upgrade your old MySQL 5.6 to the new MySQL 5.7 it shows what problems you might have after you upgrade MySQL 5.6 to the new MySQL 5.7.
There are two parts of this article:

  1. Removed variables, which were perfectly OK in the old version 5.6
  2. Changed default value of variables, which impact greatly the IO or the the SQL execution

The error messages are included, too.

PART 1) Removed variables.

Some MySQL variables first get deprecated and then removed in later versions (some are just renamed) and if they are contained in the my.cnf configuration file your server will not start up at all. The MySQL log shows that the server starts and then throws an error about “unknown variable” and starts a shutdown procedure. So you end up without database server and it is important to remove them from the configuration or find the new name of a renamed one.

2019-02-26T09:50:12.612950Z 0 [ERROR] unknown variable 'key_buffer=512M'
2019-02-26T09:50:36.361870Z 0 [ERROR] unknown variable 'thread_concurrency=6'
2019-02-26T09:51:17.658546Z 0 [ERROR] unknown variable 'thread_cache=10'
2019-02-26T09:51:32.473210Z 0 [ERROR] unknown variable 'innodb_additional_mem_pool_size=256M'

All four

key_buffer, thread_concurrency, thread_cache, innodb_additional_mem_pool_size

MySQL variables were removed and your server won’t start up if they are contained in the configuration. The “key_buffer” has been renamed to “key_buffer_size so replace it with key_buffer_size in your my.cnf. It’s important to replace it, because commenting it out would activate the default value and in this case 8M key_buffer_size, which is pretty low (in fact almost all default values of the MySQL variables are really low and it is a problem and a topic of discussions in many forums).
The “thread_cache” also renamed long ago to “thread_cache_size“, so replace it with thread_cache_size.
thread_concurrency and innodb_additional_mem_pool_size were removed long ago they first stopped doing anything and with this version they removed the variables. As you can see old configuration files could carry on many old names along the years.

The important thing here is you must renamed the ones, which got renamed and remove the ones, which got removed, because your server is not going start up with them in the configuration.

PART 2) Changed default value

Some default values of MySQL variables got changed and if you have not included it in the my.cnf configuration you might be really surprised how big impact they have on the IO or even on the behavior of the SQL statements.

2.1) Our first MySQL variables is

sync_binlog

– the default value was “0” (deactivated synchronization) and now it is “1” (bin log synchronization). This could greatly impact the performance of your MySQL database server with like 8-10 times more writes and IO disk wait time (really!!!) – you can see it here: (coming soon). So if you haven’t used this variable before you should put it in your my.cnf configuration for sure (in [mysqld] section):

sync_binlog=0

do not need to restart the server, just put in the my.cnf configuration file and open a mysql root console and execute:

SET GLOBAL sync_binlog=0;

it can be live changed.

2.2) And the second example is

sql_mode

– the default value was “NO_ENGINE_SUBSTITUTION” and now it is “ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION”, which is a pretty substantial difference. You can lose INSERTs and UPDATEs easily because a much strict mode is activated by default.
For example with an INSERT if you do not set value to a field, which column does not have default value (yes, it is wrong, but it was OK before), your insert won’t be executed and you’ll get an error (or just a FALSE after execution of your query like with PHP PDO). Here is the MySQL explanation:

A value is missing when a new row to be inserted does not contain a value for a non-NULL column that has no explicit DEFAULT clause in its definition.

And more in https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sql-mode-strict
So if you haven’t used this variable before you should put it in your my.cnf configuration for sure (in [mysqld] section):

sql_mode=NO_ENGINE_SUBSTITUTION

do not need to restart the server, just put in the my.cnf configuration file and open a mysql root console and execute:

SET GLOBAL sql_mode='NO_ENGINE_SUBSTITUTION'

mysql – Error ‘Your password does not satisfy the current policy requirements’ or zero length mysql password

We got this error when granting permissions for one of our new slave server (it could be for an ordinary MySQL server, too):

Error 'Your password does not satisfy the current policy requirements' on query. Default database: ''. Query: 'GRANT REPLICATION SLAVE ON *.* TO 'reusr'@'127.0.01''

It appeared that MySQL has activated by default a password checking plugin and our password in the GRANT (or SET PASSWORD) option didn’t meet the requirements.
So here is what you can do:

OPTION 1) Lower the password policy level

Check the policy level and lower it if it is MEDIUM or HIGH (they are there options LOW=0, MEDIUM=1 the default and HIGH=2). The policy level controls how to check and what is involved in the complexity algorithm for the passwords. More details here – https://dev.mysql.com/doc/refman/5.7/en/validate-password-options-variables.html#sysvar_validate_password_policy. Here is what you have:

[myuser@mysql1 ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 13
....
....
mysql> SHOW VARIABLES LIKE 'validate%';
+--------------------------------------+--------+
| Variable_name                        | Value  |
+--------------------------------------+--------+
| validate_password_check_user_name    | OFF    |
| validate_password_dictionary_file    |        |
| validate_password_length             | 8      |
| validate_password_mixed_case_count   | 1      |
| validate_password_number_count       | 1      |
| validate_password_policy             | MEDIUM |
| validate_password_special_char_count | 1      |
+--------------------------------------+--------+
7 rows in set (0.00 sec)

So set the validate_password_policy=0 and try again your query:

mysql> set global validate_password_policy=0;
Query OK, 0 rows affected (0.00 sec)

If you still get the error your password is lower than the validate_password_length (=8 by default) so you need to change it at last to 8 characters. But what if you what zero password (or with 1,2,3 characters)? Setting validate_password_length to 0 won’t work, because there is a hard limit to 4, so you cannot set it to 0 event the set query is not reporting error when using 0 with validate_password_length.

You should uninstall the plugin.

OPTION 2) Uninstall the MySQL Validation Plugin

You can uninstall the validation plugin on-the-fly in a working server without restarting or reloading and then you can set whatever password you like.
Here is how to do it:

[myuser@mysql1 ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 13
....
....
mysql> UNINSTALL PLUGIN validate_password;
Query OK, 0 rows affected (0.03 sec)

mysql> SHOW VARIABLES LIKE 'validate%';
Empty set (0.01 sec)

As you can see no “validate_password” variables are available anymore! Now set your password.
But there is a catch, if you have started the server with “–validate-password=FORCE_PLUS_PERMANENT” (you can check it with “ps axuf|grep mysqld” in the command line) you won’t be able to uninstall the plugin live even with the root MySQL user. So at the end if you do not have root permissions to restart the MySQL service without this option it might be better to change your password or skip the query if it is received by the slave in the MySQL replication bin log.
You can install the plugin again with:

[myuser@mysql1 ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 13
....
....
mysql> INSTALL PLUGIN validate_password SONAME 'validate_password.so';
Query OK, 0 rows affected (0.00 sec)

And it will be available over restarts, too, because it is registered in “mysql.plugin” table.

mysql – Error ‘Column count of mysql.user is wrong. Expected 45, found 43. The table is probably corrupted’ on query.

If you

upgraded your MySQL server (from 5.6 to 5.7 or above)

or

imported a MySQL dump SQL file from older version

than your current server you may encounter when granting permissions to a user:

Error 'Column count of mysql.user is wrong. Expected 45, found 43. The table is probably corrupted' on query. Default database: ''. Query: 'GRANT REPLICATION SLAVE ON *.* TO 'replusr'@'144.76.156.182''

Do not panic probably it is not corrupted just continue reading.

There is the simple fix, just

execute mysql_upgrade

It will automatically detect what to upgrade and it will upgrade it:

[myuser@mysql1 ~]# screen -R upgrade
[myuser@mysql1 ~]# mysql_upgrade 
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Checking system database.
mysql.columns_priv                                 OK
mysql.db                                           OK
mysql.engine_cost                                  OK
mysql.event                                        OK
mysql.func                                         OK
mysql.general_log                                  OK
mysql.gtid_executed                                OK
mysql.help_category                                OK
mysql.help_keyword                                 OK
mysql.help_relation                                OK
mysql.help_topic                                   OK
mysql.host                                         OK
mysql.innodb_index_stats                           OK
mysql.innodb_table_stats                           OK
mysql.ndb_binlog_index                             OK
mysql.plugin                                       OK
mysql.proc                                         OK
mysql.procs_priv                                   OK
mysql.proxies_priv                                 OK
mysql.server_cost                                  OK
mysql.servers                                      OK
mysql.slave_master_info                            OK
mysql.slave_relay_log_info                         OK
mysql.slave_worker_info                            OK
mysql.slow_log                                     OK
mysql.tables_priv                                  OK
mysql.time_zone                                    OK
mysql.time_zone_leap_second                        OK
mysql.time_zone_name                               OK
mysql.time_zone_transition                         OK
mysql.time_zone_transition_type                    OK
mysql.user                                         OK
The sys schema is already up to date (version 1.5.1).
Found 0 sys functions, but expected 22. Re-installing the sys schema.
Upgrading the sys schema.
Checking databases.
phpmyadmin.pma__bookmark                           OK
phpmyadmin.pma__central_columns                    OK
phpmyadmin.pma__column_info                        OK
phpmyadmin.pma__designer_settings                  OK
phpmyadmin.pma__export_templates                   OK
phpmyadmin.pma__favorite                           OK
phpmyadmin.pma__history                            OK
phpmyadmin.pma__navigationhiding                   OK
phpmyadmin.pma__pdf_pages                          OK
phpmyadmin.pma__recent                             OK
phpmyadmin.pma__relation                           OK
phpmyadmin.pma__savedsearches                      OK
phpmyadmin.pma__table_coords                       OK
phpmyadmin.pma__table_info                         OK
phpmyadmin.pma__table_uiprefs                      OK
phpmyadmin.pma__tracking                           OK
phpmyadmin.pma__userconfig                         OK
phpmyadmin.pma__usergroups                         OK
phpmyadmin.pma__users                              OK
sys.sys_config                                     OK
db1.access                                         OK
db1.users                                          OK
db1.objects                                        OK
db1.isp                                            OK
db1.desc                                           OK
Upgrade process completed successfully.
Checking if update is needed.

It works when the server is up and running and it is a good idea to execute the command in a screen.
It does not need to be logged as root, but mysql_upgrade does need to have the root MySQL password. In the example above it did not asked for password, because we have it in ~/.my.cnf file.

Just to note you might upgraded a long before this error to appear!

If you do not use a certain functionality you could live up happily with the old mysql.user scheme (and all old mysql.* tables). In our case we upgraded one of our slaves and several days after when a grant command on the master was issued the replication just stopped with this error! Of course, if someone were used the command in our slave the error would have appeared there sooner.
We also had case where old MySQL SQL dump file (5.6) was imported in a newer MySQL server 5.7 and there had been no issues for weeks till the GRANT command.

perror

Th error code is 1805.

[myuser@mysql1 ~]# perror 1805
MySQL error code 1805 (ER_COL_COUNT_DOESNT_MATCH_CORRUPTED_V2): Column count of %s.%s is wrong. Expected %d, found %d. The table is probably corrupted

Replication hangs with Relay_Master_Log_File mysql-bin.999999 -wrong variables in mariadb (MySQL) show slave status

Several days after another .999999 hang out – mariadb (MySQL) stopped flushing relay-bin log after mysqld-relay-bin.999999, the monitoring of one of the slaves got critical with replication delayed. Then several hours it kept delaying without any apparent reason.
The slave status was weird, the

  • “Relay_Master_Log_File: mysql-bin.999999” and
  • Exec_Master_Log_Pos: 104858214” were not changing,

which is essential for the slave server! In fact without these two values we do not know the real position, which is executed in the slave! And we cannot recover the slave if anything happened.
Keep on reading!

mariadb (MySQL) stopped flushing relay-bin log after mysqld-relay-bin.999999

We have a pretty big MariaDB master-slave setup, which is really under heavy load and suddenly one of our slaves stopped purging relay logs! And soon the free space got less and it was clear were were going to run out of space for the data partition.

Here is the problem we have only 49G left free space for our mysql datadir “/var/lib/mysql-datadir”. You can see the replication is OK and running (it’s not behind – “Seconds_Behind_Master: 0”), the

Relay_Log_File says mysqld-relay-bin.999999

but it is not true! You can see there are tones of more relay logs after “mysqld-relay-bin.999999” – in fact the current MySQL relay binlog file is “mysqld-relay-bin.1071413” (look below we included a listing of the datadir).

And “FLUSH RELAY LOGS” does not help, at all!

It just not freed any byte on the partition and no files were removed after “mysqld-relay-bin.999999” as before.
Keep on reading!

systemd service freezes in activating (start-post) status – mysqld or other services

We’ve experienced this with the MySQL server under CentOS 7, but you can have this state with other services!
After updating our MySQL we tried to start it up, but the service got this strange state after “systemctl start” returned:

[root@mysql2 ~]# systemctl start mysqld
Job for mysqld.service failed because a timeout was exceeded. See "systemctl status mysqld.service" and "journalctl -xe" for details.

The timeout is big it’s something like 5 to 10 minutes and so it is typical (do not do it!) to type “ctrl+c” and you end up without this message and a strange state of the mysql:

[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: activating (start-post) since Fri 2018-11-09 09:00:55 UTC; 6min ago
  Process: 8333 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
  Process: 8321 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 8333 (code=exited, status=0/SUCCESS);         : 8334 (mysql-systemd-s)
   CGroup: /user.slice/user-0.slice/session-2395.scope/system.slice/mysqld.service
           └─control
             ├─ 8334 /bin/bash /usr/bin/mysql-systemd-start post
             └─10152 sleep 1

Nov 09 09:00:55 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 09:00:56 mysql2.mytv.bg mysqld_safe[8333]: 181109 09:00:56 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 09:00:56 mysql2.mytv.bg mysqld_safe[8333]: 181109 09:00:56 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

Meanwhile with “pstree”:

[root@mysql2 ~]# pstree
systemd─┬─agetty
        ├─crond
        ├─dbus-daemon
        ├─mysql-systemd-s───sleep
        ├─rsyslogd───2*[{rsyslogd}]
        ├─sshd─┬─sshd───bash───systemctl─┬─systemctl
        │      │                         └─systemd-tty-ask
        │      └─sshd───bash───pstree
        ├─systemd-journal
        └─systemd-logind

So as you can see no mysqld process! Apparently systemctl had tried to start MySQL server process and it failed.
So the first thing to do was to check the MySQL logs. In our case it was a obsolete option in my.cnf:

2018-11-09 09:10:57 11384 [ERROR] /usr/sbin/mysqld: unknown variable 'default-character-set=utf8'
2018-11-09 09:10:57 11384 [ERROR] Aborting

The interesting part is that

the service got “Active: activating (start-post)” and when you fix the problem you cannot “systemctl start mysqld” it just start to wait for the current timeout.

In fact this state means “I’m trying to start the service…” and it is in an endless loop to start the service and if you the service has a big start timeout like 5-10 minutes you must wait for the next iteration of the loop to start the service successfully (if you fixed the problem!). And if you want not to wait you must execute first stop to the service and then start – you’ll not wait for any timeout and you can check immediately if the service was started successfully:

[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: activating (start-post) since Fri 2018-11-09 09:20:56 UTC; 2min 50s ago
  Process: 13208 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
  Process: 13196 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 13208 (code=exited, status=0/SUCCESS);         : 13209 (mysql-systemd-s)
   CGroup: /user.slice/user-0.slice/session-2395.scope/system.slice/mysqld.service
           └─control
             ├─13209 /bin/bash /usr/bin/mysql-systemd-start post
             └─14357 sleep 1

Nov 09 09:20:56 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 09:20:56 mysql2.mytv.bg mysqld_safe[13208]: 181109 09:20:56 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 09:20:56 mysql2.mytv.bg mysqld_safe[13208]: 181109 09:20:56 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
[root@mysql2 ~]# systemctl start mysqld
Job for mysqld.service failed because a timeout was exceeded. See "systemctl status mysqld.service" and "journalctl -xe" for details.
[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2018-11-09 09:30:59 UTC; 2s ago
  Process: 15656 ExecStartPost=/usr/bin/mysql-systemd-start post (code=exited, status=0/SUCCESS)
  Process: 15643 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 15655 (mysqld_safe)
   CGroup: /user.slice/user-0.slice/session-2395.scope/system.slice/mysqld.service
           ├─15655 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
           └─16243 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=/var/log/mysqld.log --open-files-limit=10000...

Nov 09 09:30:56 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 09:30:57 mysql2.mytv.bg mysqld_safe[15655]: 181109 09:30:57 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 09:30:57 mysql2.mytv.bg mysqld_safe[15655]: 181109 09:30:57 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Nov 09 09:30:59 mysql2.mytv.bg systemd[1]: Started MySQL Community Server.

As you can see we even received error again that the service cannot be started and immediately after that the service status is in normal “active (running)” state! And we waited for around 10 minutes! You can see the times in the logs above.
So to summarize it up:

If you have a service in “activating (start-post)” the service cannot be started because of an error, check and fix the problem and then issue “stop and start”:

[root@mysql2 ~]# systemctl start mysqld
Job for mysqld.service failed because a timeout was exceeded. See "systemctl status mysqld.service" and "journalctl -xe" for details.
[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: activating (start-post) since Fri 2018-11-09 10:05:20 UTC; 2min 17s ago
  Process: 23601 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
  Process: 23589 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 23601 (code=exited, status=0/SUCCESS);         : 23602 (mysql-systemd-s)
   CGroup: /user.slice/user-0.slice/session-2395.scope/system.slice/mysqld.service
           └─control
             ├─23602 /bin/bash /usr/bin/mysql-systemd-start post
             └─24646 sleep 1

Nov 09 10:05:20 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 10:05:21 mysql2.mytv.bg mysqld_safe[23601]: 181109 10:05:21 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 10:05:21 mysql2.mytv.bg mysqld_safe[23601]: 181109 10:05:21 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
[root@mysql2 ~]# systemctl stop mysqld
[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Fri 2018-11-09 10:07:52 UTC; 4s ago
  Process: 23602 ExecStartPost=/usr/bin/mysql-systemd-start post (code=killed, signal=TERM)
  Process: 23601 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
  Process: 23589 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 23601 (code=exited, status=0/SUCCESS)

Nov 09 10:05:20 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 10:05:21 mysql2.mytv.bg mysqld_safe[23601]: 181109 10:05:21 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 10:05:21 mysql2.mytv.bg mysqld_safe[23601]: 181109 10:05:21 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Nov 09 10:07:52 mysql2.mytv.bg systemd[1]: Stopped MySQL Community Server.
[root@mysql2 ~]# systemctl start mysqld
[root@mysql2 ~]# systemctl status mysqld
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2018-11-09 10:08:06 UTC; 3s ago
  Process: 24711 ExecStartPost=/usr/bin/mysql-systemd-start post (code=exited, status=0/SUCCESS)
  Process: 24698 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 24710 (mysqld_safe)
   CGroup: /user.slice/user-0.slice/session-2395.scope/system.slice/mysqld.service
           ├─24710 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
           └─25298 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=/var/log/mysqld.log --open-files-limit=10000...

Nov 09 10:08:04 mysql2.mytv.bg systemd[1]: Starting MySQL Community Server...
Nov 09 10:08:04 mysql2.mytv.bg mysqld_safe[24710]: 181109 10:08:04 mysqld_safe Logging to '/var/log/mysqld.log'.
Nov 09 10:08:04 mysql2.mytv.bg mysqld_safe[24710]: 181109 10:08:04 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Nov 09 10:08:06 mysql2.mytv.bg systemd[1]: Started MySQL Community Server.

mysql slave requested master to start replication from position greater than the file size

If you happen to reset your master mysql server without shutting down the mysql process (probably because of your super collocation cut the power!!!) you could have a slave server, which have the following error:

Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from position > file size; the first event 'mysql-bin.005173' at 644642725, the last event read from './mysql-bin.005173' at 4, the last byte read from './mysql-bin.005173' at 4.'

The slave server wants to read a position in the master, which does not exist probably because it was not committed to the file.

It is fairly easy (and in most cases safe) to just correct the position and restart the replication! Take the Master_Log_File and Read_Master_Log_Pos on the slave with:

[root@mysql1 ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 50497
Server version: 5.6.38-log MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 10.10.10.10
                  Master_User: replusr
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.005173
          Read_Master_Log_Pos: 644642725
               Relay_Log_File: mysqld-relay-bin.000479
                Relay_Log_Pos: 644642888
        Relay_Master_Log_File: mysql-bin.005173
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 644642725
              Relay_Log_Space: 644643109
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from position > file size; the first event 'mysql-bin.005173' at 644642725, the last event read from './mysql-bin.005173' at 4, the last byte read from './mysql-bin.005173' at 4.'
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 2
                  Master_UUID: ce8a6c29-cf8e-11e5-9d39-000000000001
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 180525 16:27:22
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
1 row in set (0.00 sec)

As you can see the highlighted variables are important, that is the file and the position you should check in your master mysql server. So go to your master datadir (or the binlog directory, which could be changed with “binlog-dir”) and check whether there is Master_Log_File: “mysql-bin.005173” with position Read_Master_Log_Pos: “644642725”

srv@local-master ~ # cd /var/lib/mysql/
srv@local-master mysql # ls -altr mysql-bin.005173
-rw-rw---- 1 mysql mysql 644639026 24 may 12,22 mysql-bin.005173
srv@local-master mysql # mysqlbinlog mysql-bin.005173|tail -n 15
# at 644638722
#180524 12:03:31 server id 2  end_log_pos 644638930 CRC32 0xfeabe1ab    Query   thread_id=88263388      exec_time=0     error_code=0
SET TIMESTAMP=1527152611/*!*/;
UPDATE `group_desc` SET `id` = '153357',`name` = 'Test Group Name',`tags` = '|Test Group |' WHERE  `id` = '153357'
/*!*/;
# at 644638930
#180524 12:03:31 server id 2  end_log_pos 644639026 CRC32 0x5ca1b693    Query   thread_id=88263388      exec_time=0     error_code=0
SET TIMESTAMP=1527152611/*!*/;
COMMIT
/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
srv@local-master ~ # 

And yes, there is no position Read_Master_Log_Pos: “644642725” as the slave requested! The position number starts with

# at

for example

# at 644638930

As you can see from the bash commands above we got the last 15 lines of our binlog mysql file and the last position was 644638930. Here is what is going on: slave requested to continue from master’s position at 644642725, but master has last position 644638930:

644642725 > 644638930

To fix it just use the next binlog file and position 1 and your slave will continue normally. Let’s say there is a possibility your master missed to write the last commands to the master’s binlog because of the reset and in this situation your slave could be out of sync and in this case you should recover your slave from a full mysql dump and import in the slave. But in most cases it is fairly safe to continue.

[root@mysql1 ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 50506
Server version: 5.6.38-log MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.12 sec)

mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.005174',MASTER_LOG_POS=1;
Query OK, 0 rows affected (0.47 sec)

mysql> START SLAVE;
Query OK, 0 rows affected (0.04 sec)

And if everything is OK your slave will continue with no errors (just with a big delay – behind your master):

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.10.10.10
                  Master_User: replusr
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.005175
          Read_Master_Log_Pos: 328685690
               Relay_Log_File: mysqld-relay-bin.000003
                Relay_Log_Pos: 36999
        Relay_Master_Log_File: mysql-bin.005175
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 36836
              Relay_Log_Space: 334719008
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 103216
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 2
                  Master_UUID: ce8a6c29-cf8e-11e5-9d39-000000000001
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Repair by sorting
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
1 row in set (0.00 sec)

But if you get:

Duplicate entry

Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '422987' for key 'PRIMARY'' on query. Default database: 'mydb'. Query: 'INSERT INTO `spec_cookie` (`userid`, `userip`, `cookie`, `added`) VALUES ('96201', '2591115382', 'f3b81be45a484c652d38a2c70f8c44c30d4d04d1293918c9071e052ffd9c76f7', NOW())'

You might get into troubles if you continue, be careful!!! Examine the query, select the data in the slave and in the master, if they are equal you can skip it the error with:

[srv@local-slave ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 50506
Server version: 5.6.38-log MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.09 sec)

mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
Query OK, 0 rows affected (0.00 sec)

mysql> START SLAVE;
Query OK, 0 rows affected (0.12 sec)

mysql>

If they are not equal you can change them manually and continue again. More about Duplicate entry you can check here (similar mysql binlog problems): mysql slave reset and fixing relay log read failure

mariadb mysql slave cannot stop and then crashes on start up with segmentation fault or abort

Let’s show you what means

“That’s specifics of InnoDB data storage.”

Here is the what this specific case will teach you:

  1. basic debug with strace
  2. MySQL could hangs in a infinite loop during a shutdown (could be seen with strace)
  3. InnoDB files could get corrupted without a hardware issue, but with a kill -KILL or probably it was corrupted before the shutdown and the kill?
  4. list the opened files (and the IDs = file descriptors) of a mysql process
  5. start a mysql slave without a starting the replication on start with “skip-slave-start”
  6. Remove a corrupted innodb file, which causes database (mysql process) crash leaving your with no database at all – replace a innodb file, start the server, then drop the innodb table with sql command and recreate it to continue using healthy table
  7. 2 rows (4 INTEGER columns) could eat up a lot of space – 12G and probably infinite! === “That’s specifics of InnoDB data storage.”
    This table file is 12 Gigabytes in size!!!

    MariaDB [mysql]> select * from gtid_slave_pos;
    +-----------+----------+-----------+-------------+
    | domain_id | sub_id   | server_id | seq_no      |
    +-----------+----------+-----------+-------------+
    |         0 | 16983943 |       101 | 45790450502 |
    |         0 | 16983944 |       101 | 45790450503 |
    +-----------+----------+-----------+-------------+
    2 rows in set (0.00 sec)
    

Here is presented a specific case with replication, but in your case you may not use replication, your problem table could be another (and your mariadb/mysql server crashes on start up or selecting from a specific table or on shutdown?), so find the problem table and remove it, here we show you how to do it! BACKUP the mysql datadir BEFORE any intervention!

Here we have a situation: a mariadb (mysql) server running as slave to a really busy master server, so you could expect 10 000 update/insert/delete queries. Everything was working till the time we wanted to shutdown the mysql process, which occurred to be impossible.

STEP 1) We tried everything from “systemctl stop mysql” to kill -TERM multiple times

5 hour the mysql process was running with 2000 opened file descriptors to multiple table files. The strace showed this:

[pid 14824] <... io_getevents resumed> []{0, 500000000}) = 0
[pid 14815] <... io_getevents resumed> []{0, 500000000}) = 0
[pid 14824] io_getevents(139723876253696, 1, 256,  <unfinished ...>
[pid 14815] io_getevents(139723876356096, 1, 256,  <unfinished ...>
[pid 14825] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid 14825] futex(0x55d16885deb0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14825] futex(0x55d16885dedc, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 3576261, {1525268239, 312230000}, ffffffff <unfinished ...>
[pid 14852] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid 14852] futex(0x55d16885dd30, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14852] futex(0x55d16885dd5c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2775717, {1525268240, 308225000}, ffffffff <unfinished ...>
[pid 14825] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid 14825] futex(0x55d16885deb0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14825] futex(0x55d16885dedc, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 3576263, {1525268240, 304889000}, ffffffff <unfinished ...>
[pid 14862] <... nanosleep resumed> NULL) = 0
[pid 14862] nanosleep({1, 0},  <unfinished ...>
[pid 14821] <... io_getevents resumed> []{0, 500000000}) = 0
[pid 14821] io_getevents(139723876315136, 1, 256,  <unfinished ...>
[pid 14820] <... io_getevents resumed> []{0, 500000000}) = 0

And this has been repeating many times for hours without any disk activity on flushing any IO…so no use to wait for something, which apparently won’t finish at all.

STEP 2) So a

kill -9

was used to stop the mysql process. What could go wrong??? InnoDB is awesome and cannot corrupt if the hardware is OK, right? Yeahhh right…
When the start command was executed, the mysql process started, the innodb engine recovery completed successfully and after 5 minutes without listening socket and heavy IO reading there is the segmentation fault crash and YOU have no database….
So here is one of the crashes taken from the log:

2018-05-02 19:51:54 139990018914496 [ERROR] Plugin 'innodb' already installed
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Uses event mutexes
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Compressed tables use zlib 1.2.8
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Using Linux native AIO
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Number of pools: 1
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Using SSE2 crc32 instructions
2018-05-02 19:51:54 139990018914496 [Note] InnoDB: Initializing buffer pool, total size = 64G, instances = 8, chunk size = 128M
2018-05-02 19:51:56 139990018914496 [Note] InnoDB: Completed initialization of buffer pool
2018-05-02 19:51:56 139913709942528 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2018-05-02 19:51:56 139990018914496 [Note] InnoDB: Highest supported file format is Barracuda.
2018-05-02 19:51:56 139990018914496 [Note] InnoDB: Starting crash recovery from checkpoint LSN=311205983427362
2018-05-02 19:51:57 139990018914496 [Note] InnoDB: Last binlog file '/var/log/mysql-binlog/mysql-bin.227009', position 38590879
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: 128 out of 128 rollback segments are active.
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: Creating shared tablespace for temporary tables
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: Waiting for purge to start
2018-05-02 19:52:12 139990018914496 [Note] InnoDB: 5.7.21 started; log sequence number 311205983427371
2018-05-02 19:52:12 139915446597376 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2018-05-02 19:52:12 139990018914496 [Note] Plugin 'FEEDBACK' is disabled.
2018-05-02 19:52:12 139990018914496 [Note] Recovering after a crash using tc.log
2018-05-02 19:52:12 139990018914496 [Note] Starting crash recovery...
2018-05-02 19:52:12 139990018914496 [Note] Crash recovery finished.
2018-05-02 19:52:12 139990018914496 [Note] Server socket created on IP: '0.0.0.0'.
2018-05-02 19:52:12 139990018914496 [Warning] 'user' entry 'root@srvdns2' ignored in --skip-name-resolve mode.
2018-05-02 19:52:12 139990018914496 [Warning] 'proxies_priv' entry '@% root@srvdns1' ignored in --skip-name-resolve mode.
2018-05-02 19:53:52 139915446597376 [Note] InnoDB: Buffer pool(s) load completed at 180502 19:53:52
180502 19:57:12 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

Server version: 10.2.13-MariaDB-10.2.13+maria~xenial
key_buffer_size=2147483648
read_buffer_size=262144
max_used_connections=0
max_threads=10002
thread_count=6
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6871600 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f40700009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f515c4fecf8 thread_stack 0x49000
*** buffer overflow detected ***: /usr/sbin/mysqld terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f51f54eb7e5]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f51f558d15c]
/lib/x86_64-linux-gnu/libc.so.6(+0x117160)[0x7f51f558b160]
/lib/x86_64-linux-gnu/libc.so.6(+0x1190a7)[0x7f51f558d0a7]
/usr/sbin/mysqld(my_addr_resolve+0xde)[0x55a9e9f1832e]
/usr/sbin/mysqld(my_print_stacktrace+0x1e2)[0x55a9e9eff2e2]
/usr/sbin/mysqld(handle_fatal_signal+0x345)[0x55a9e9999b95]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f51f5eda390]
/lib/x86_64-linux-gnu/libc.so.6(+0x9f849)[0x7f51f5513849]
/usr/sbin/mysqld(insert_dynamic+0x2a)[0x55a9e9ed3a4a]
/usr/sbin/mysqld(_Z25rpl_load_gtid_slave_stateP3THD+0x424)[0x55a9e98c43a4]
/usr/sbin/mysqld(handle_slave_background+0xe7)[0x55a9e97834e7]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f51f5ed06ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f51f557b41d]
======= Memory map: ========
55a9e933e000-55a9ea423000 r-xp 00000000 08:03 148565                     /usr/sbin/mysqld
55a9ea622000-55a9ea6f4000 r--p 010e4000 08:03 148565                     /usr/sbin/mysqld
55a9ea6f4000-55a9ea7aa000 rw-p 011b6000 08:03 148565                     /usr/sbin/mysqld
55a9ea7aa000-55a9eb03d000 rw-p 00000000 00:00 0 
55a9eba46000-55aa4e67b000 rw-p 00000000 00:00 0                          [heap]

STEP 3) So the default way of repairing the InnoDB is to use

innodb_force_recovery

in your my.cnf configuration file:

[mysqld]
innodb_force_recovery=1

But again and again crashes even enabling all the options of innodb_force_recovery=1,2,3,4,5 and last 6. But when using “innodb_force_recovery=4” and 5 and 6 we have some strange additional error:

2018-05-02 21:43:34 139667439187712 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`gtid_slave_pos` in the cache. Attempting to load the tablespace with space id 63584
2018-05-02 21:43:34 139667439187712 [Warning] InnoDB: Allocated tablespace ID 63584 for mysql/gtid_slave_pos, old maximum was 0

The innodb_force_recovery did not help we still cannot start our MySQL database, but there were two things:

  1. No errors were reported by the InnoDB Engine – “InnoDB: Buffer pool(s) load completed” and Crash recovery always finished without errors
  2. The MySQL starts successfully, but not listening socket and then after 3~5 minutes of extensive IO reading from the disk by the mysql process it crashes

It was like something big was loading just on the start and in fact needed on the very beginning, so what start immediately after successful load of the mysql process?

REPLICATION

Remember this is a slave!

STEP 4) So trying to prevent the start of the replication on start of the mysql process with:

[mysqld]
skip-slave-start

STEP 5) strace and file descritor to find the offender table file.

The skip-slave-start option DID the trick – NO Segmentation fault

the mysql processes immediately after successful engines load began to listen on sockets. So the big offender was probably something connected with the replication! A big table used in the replication? How to find it, ok remove the “skip-slave-start”, then start the mysql process and wait for the time the IO read kicks in, then strace the mysql process (find the ID of the process with ps – the first number of the following command is the ID of the mysql process):

srv@local ~ # ps axuf|grep mysql|grep -v grep
mysql    6969  239  8.3 78561320 10983848 ?   Ssl  22:53 108:59 /usr/sbin/mysqld
srv@local ~ # strace -f -p 6969
strace -f -p 6969
strace: Process 6969 attached with 27 threads
[pid  6996] rt_sigtimedwait([HUP QUIT ALRM TERM TSTP], NULL, NULL, 8 <unfinished ...>
[pid  6994] futex(0x55941cb5d1ec, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  6995] futex(0x7f93c6de1d24, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  6993] restart_syscall(<... resuming interrupted nanosleep ...> <unfinished ...>
[pid  6992] futex(0x55941cb5d0ec, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  6991] futex(0x55941cb5cd6c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  6990] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6997] mremap(0x7f9298b76000, 2210828288, 2210836480, MREMAP_MAYMOVE <unfinished ...>
[pid  6989] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6988] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6986] futex(0x55947cfe941c, FUTEX_WAIT_PRIVATE, 33, NULL <unfinished ...>
[pid  6985] futex(0x55947cfe941c, FUTEX_WAIT_PRIVATE, 31, NULL <unfinished ...>
[pid  6984] futex(0x55947cfe941c, FUTEX_WAIT_PRIVATE, 32, NULL <unfinished ...>
[pid  6997] <... mremap resumed> )      = 0x7f9298b76000
[pid  6984] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid  6983] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6984] futex(0x55947cfe941c, FUTEX_WAIT_PRIVATE, 33, NULL <unfinished ...>
[pid  6982] io_getevents(140347217702912, 1, 256,  <unfinished ...>
[pid  6981] io_getevents(140347217723392, 1, 256,  <unfinished ...>
[pid  6985] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid  6980] io_getevents(140347217743872, 1, 256,  <unfinished ...>
[pid  6979] io_getevents(140347217764352, 1, 256,  <unfinished ...>
[pid  6978] io_getevents(140347217825792, 1, 256,  <unfinished ...>
[pid  6977] io_getevents(140347217846272, 1, 256,  <unfinished ...>
[pid  6985] futex(0x55947cfe941c, FUTEX_WAIT_PRIVATE, 33, NULL <unfinished ...>
[pid  6976] io_getevents(140347217866752, 1, 256,  <unfinished ...>
[pid  6975] io_getevents(140347219357696, 1, 256,  <unfinished ...>
[pid  6974] io_getevents(140347217784832, 1, 256,  <unfinished ...>
[pid  6973] io_getevents(140347217805312, 1, 256,  <unfinished ...>
[pid  6971] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6970] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6969] futex(0x55941a4be944, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  6997] pread64(38, "\201L\321\365\0\5\315\3\0\5\315\0\0\5\315\6\0\1\30\307o\303\365]E\277\0\0\0\0\0\0"..., 16384, 6228590592) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210836480, 2210844672, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\210\271\277\343\0\5\315\6\0\5\315\3\0\5\315\7\0\1\30\307o\343\362~E\277\0\0\0\0\0\0"..., 16384, 6228639744) = 16384
[pid  6997] pread64(38, "\305\17\303k\0\5\315\7\0\5\315\6\0\5\315\10\0\1\30\307o\377\224gE\277\0\0\0\0\0\0"..., 16384, 6228656128) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210844672, 2210852864, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\217\351V~\0\5\315\10\0\5\315\7\0\5\315\n\0\1\30\307p\34\31\363E\277\0\0\0\0\0\0"..., 16384, 6228672512) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210852864, 2210861056, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\367\321\330\232\0\5\315\n\0\5\315\10\0\5\315\v\0\1\30\307p>\316\360E\277\0\0\0\0\0\0"..., 16384, 6228705280) = 16384
[pid  6997] pread64(38, "\374v\202\177\0\5\315\v\0\5\315\n\0\5\315\r\0\1\30\307p\\alE\277\0\0\0\0\0\0"..., 16384, 6228721664) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210861056, 2210869248, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "s\261\0A\0\5\315\r\0\5\315\v\0\5\315\16\0\1\30\307p\212\1AE\277\0\0\0\0\0\0"..., 16384, 6228754432) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210869248, 2210877440, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\214\t\7\244\0\5\315\16\0\5\315\r\0\5\315\21\0\1\30\307p\242H\37E\277\0\0\0\0\0\0"..., 16384, 6228770816) = 16384
[pid  6997] pread64(38, "<\n\272\"\0\5\315\21\0\5\315\16\0\5\315\24\0\1\30\307p\311i\313E\277\0\0\0\0\0\0"..., 16384, 6228819968) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210877440, 2210885632, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] mremap(0x7f9298b76000, 2210885632, 2210893824, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "n|I\n\0\5\315\24\0\5\315\21\0\5\315\25\0\1\30\307p\360EqE\277\0\0\0\0\0\0"..., 16384, 6228869120) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210893824, 2210902016, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\270\206\326\207\0\5\315\25\0\5\315\24\0\5\315\27\0\1\30\307q\1a\251E\277\0\0\0\0\0\0"..., 16384, 6228885504) = 16384
[pid  6997] pread64(38, "{l\226S\0\5\315\27\0\5\315\25\0\5\315\31\0\1\30\307q&\205!E\277\0\0\0\0\0\0"..., 16384, 6228918272) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210902016, 2210910208, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\201\265]&\0\5\315\31\0\5\315\27\0\5\315\32\0\1\30\307qK\365\212E\277\0\0\0\0\0\0"..., 16384, 6228951040) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210910208, 2210918400, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\352?\236\327\0\5\315\32\0\5\315\31\0\5\315\33\0\1\30\307qob:E\277\0\0\0\0\0\0"..., 16384, 6228967424) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210918400, 2210926592, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] mremap(0x7f9298b76000, 2210926592, 2210934784, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "3c\33\301\0\5\315\33\0\5\315\32\0\5\315\36\0\1\30\307q\2236%E\277\0\0\0\0\0\0"..., 16384, 6228983808) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210934784, 2210942976, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "Zi\343\177\0\5\315\36\0\5\315\33\0\5\315\37\0\1\30\307q\325\27LE\277\0\0\0\0\0\0"..., 16384, 6229032960) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210942976, 2210951168, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "v5h%\0\5\315\37\0\5\315\36\0\5\315!\0\1\30\307r\f\325\364E\277\0\0\0\0\0\0"..., 16384, 6229049344) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210951168, 2210959360, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "0\6Yn\0\5\315!\0\5\315\37\0\5\315\"\0\1\30\307sCX@E\277\0\0\0\0\0\0"..., 16384, 6229082112) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210959360, 2210967552, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\203\243K\203\0\5\315\"\0\5\315!\0\5\315$\0\1\30\307t;\234\302E\277\0\0\0\0\0\0"..., 16384, 6229098496) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210967552, 2210975744, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "9\264\332j\0\5\315$\0\5\315\"\0\5\315%\0\1\30\307\177\370#\371E\277\0\0\0\0\0\0"..., 16384, 6229131264) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210975744, 2210983936, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, ":\200+\337\0\5\315%\0\5\315$\0\5\315&\0\1\30\307\200\20U-E\277\0\0\0\0\0\0"..., 16384, 6229147648) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210983936, 2210992128, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\35-th\0\5\315&\0\5\315%\0\5\315(\0\1\30\307\200+\333CE\277\0\0\0\0\0\0"..., 16384, 6229164032) = 16384
[pid  6997] pread64(38, "\246Y\305\351\0\5\315(\0\5\315&\0\5\315)\0\1\30\307\200[\324\305E\277\0\0\0\0\0\0"..., 16384, 6229196800) = 16384
[pid  6997] mremap(0x7f9298b76000, 2210992128, 2211000320, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\340(\2\350\0\5\315)\0\5\315(\0\5\315+\0\1\30\307\200\214L\365E\277\0\0\0\0\0\0"..., 16384, 6229213184) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211000320, 2211008512, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "A\374\37\23\0\5\315+\0\5\315)\0\5\315,\0\1\30\307\200\252\30\373E\277\0\0\0\0\0\0"..., 16384, 6229245952) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211008512, 2211016704, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "iW\274\365\0\5\315,\0\5\315+\0\5\315.\0\1\30\307\200\332\266\256E\277\0\0\0\0\0\0"..., 16384, 6229262336) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211016704, 2211024896, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "\343c\340\347\0\5\315.\0\5\315,\0\5\315/\0\1\30\307\200\356\272YE\277\0\0\0\0\0\0"..., 16384, 6229295104) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211024896, 2211033088, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "}\35\32i\0\5\315/\0\5\315.\0\5\3150\0\1\30\307\201\5\226\256E\277\0\0\0\0\0\0"..., 16384, 6229311488) = 16384
[pid  6997] pread64(38, "\322\237\206\377\0\5\3150\0\5\315/\0\5\3151\0\1\30\307\201\257\241\272E\277\0\0\0\0\0\0"..., 16384, 6229327872) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211033088, 2211041280, MREMAP_MAYMOVE) = 0x7f9298b76000
[pid  6997] pread64(38, "/\250\21!\0\5\3151\0\5\3150\0\5\3152\0\1\30\307\203\3\237\2E\277\0\0\0\0\0\0"..., 16384, 6229344256) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211041280, 2211049472, MREMAP_MAYMOVE) = 0x7f9298b76000

You see multiple lines (thousands) with:

[pid  6997] pread64(38, "/\250\21!\0\5\3151\0\5\3150\0\5\3152\0\1\30\307\203\3\237\2E\277\0\0\0\0\0\0"..., 16384, 6229344256) = 16384
[pid  6997] mremap(0x7f9298b76000, 2211041280, 2211049472, MREMAP_MAYMOVE) = 0x7f9298b76000

The pread64 – reads from file descriptor with ID=38, then you can list all file descriptors for the mysql process ID with:

srv@local mysql # ls -altr /proc/6969/fd
total 0
dr-xr-xr-x 9 mysql mysql  0 May  2 21:32 ..
l-wx------ 1 root  root  64 May  2 21:35 2 -> /var/log/mysql/error.log
dr-x------ 2 root  root   0 May  2 21:35 .
lrwx------ 1 root  root  64 May  2 21:35 9 -> /tmp/ibgyw4AI (deleted)
lrwx------ 1 root  root  64 May  2 21:35 8 -> /tmp/ibuaprWP (deleted)
lrwx------ 1 root  root  64 May  2 21:35 7 -> /tmp/ibfXwsBW (deleted)
lrwx------ 1 root  root  64 May  2 21:35 6 -> /var/lib/mysql/ibdata1
lrwx------ 1 root  root  64 May  2 21:35 5 -> /var/lib/mysql/aria_log.00000001
lr-x------ 1 root  root  64 May  2 21:35 4 -> /var/lib/mysql
lrwx------ 1 root  root  64 May  2 21:35 38 -> /var/lib/mysql/mysql/gtid_slave_pos.ibd
lrwx------ 1 root  root  64 May  2 21:35 37 -> /var/lib/mysql/mysql/event.MYD
lrwx------ 1 root  root  64 May  2 21:35 36 -> /var/lib/mysql/mysql/event.MYI
lrwx------ 1 root  root  64 May  2 21:35 35 -> /var/lib/mysql/mysql/procs_priv.MYD
lrwx------ 1 root  root  64 May  2 21:35 34 -> /var/lib/mysql/mysql/procs_priv.MYI
lrwx------ 1 root  root  64 May  2 21:35 33 -> /var/lib/mysql/mysql/columns_priv.MYD
lrwx------ 1 root  root  64 May  2 21:35 32 -> /var/lib/mysql/mysql/columns_priv.MYI
lrwx------ 1 root  root  64 May  2 21:35 31 -> /var/lib/mysql/mysql/tables_priv.MYD
lrwx------ 1 root  root  64 May  2 21:35 30 -> /var/lib/mysql/mysql/tables_priv.MYI
lrwx------ 1 root  root  64 May  2 21:35 3 -> /var/lib/mysql/aria_log_control
lrwx------ 1 root  root  64 May  2 21:35 29 -> /var/lib/mysql/mysql/roles_mapping.MYD
lrwx------ 1 root  root  64 May  2 21:35 28 -> /var/lib/mysql/mysql/roles_mapping.MYI
lrwx------ 1 root  root  64 May  2 21:35 27 -> /var/lib/mysql/mysql/proxies_priv.MYD
lrwx------ 1 root  root  64 May  2 21:35 26 -> /var/lib/mysql/mysql/proxies_priv.MYI
lrwx------ 1 root  root  64 May  2 21:35 25 -> /var/lib/mysql/mysql/host.MYD
lrwx------ 1 root  root  64 May  2 21:35 24 -> /var/lib/mysql/mysql/host.MYI
lrwx------ 1 root  root  64 May  2 21:35 23 -> /var/lib/mysql/mysql/db.MYD
lrwx------ 1 root  root  64 May  2 21:35 22 -> /var/lib/mysql/mysql/db.MYI
lrwx------ 1 root  root  64 May  2 21:35 21 -> /var/lib/mysql/mysql/user.MYD
lrwx------ 1 root  root  64 May  2 21:35 20 -> /var/lib/mysql/mysql/user.MYI
lrwx------ 1 root  root  64 May  2 21:35 19 -> socket:[49519]
lrwx------ 1 root  root  64 May  2 21:35 18 -> socket:[49518]
lrwx------ 1 root  root  64 May  2 21:35 17 -> /var/lib/mysql/mysql/servers.MYD
lrwx------ 1 root  root  64 May  2 21:35 16 -> /var/lib/mysql/mysql/servers.MYI
lrwx------ 1 root  root  64 May  2 21:35 15 -> /var/lib/mysql/tc.log
lrwx------ 1 root  root  64 May  2 21:35 13 -> /tmp/ib281FZH (deleted)
lrwx------ 1 root  root  64 May  2 21:35 12 -> /var/lib/mysql/ibtmp1
lrwx------ 1 root  root  64 May  2 21:35 11 -> /var/lib/mysql/ib_logfile1
lrwx------ 1 root  root  64 May  2 21:35 10 -> /var/lib/mysql/ib_logfile0
l-wx------ 1 root  root  64 May  2 21:35 1 -> /var/log/mysql/error.log
lrwx------ 1 root  root  64 May  2 21:35 0 -> /dev/pts/3

And again file descriptor with ID=38 is:

/var/lib/mysql/mysql/gtid_slave_pos.ibd

And when you check the size – 12G…..12G for this table? Why? What is used for?

STEP 6) And from the manual:

The mysql.gtid_slave_pos table is used in replication by slave servers to keep track of their current position (the global transaction ID of the last transaction applied). Using the table allows the slave to maintain a consistent value for the gtid_slave_pos system variable across server restarts. See Global Transaction ID.

You should never attempt to modify the table directly. If you do need to change the global gtid_slave_pos value, use SET GLOBAL gtid_slave_pos = ... instead.

The table is updated with the new position as part of each transaction committed during replication. This makes it preferable that the table is using the same storage engine as the tables otherwise being modified in the transaction, since otherwise a multi-engine transaction is needed that can reduce performance.

Hmm “The table is updated with the new position as part of each transaction committed during replication” and 12G table – could it track minillions of transactions? Probably not, in fact this table has only 2-3-4 rows at a given time!!! But the size is 12G, well as we said this is:
“That’s specifics of InnoDB data storage. Did you try to run OPTIMIZE TABLE mysql.gtid_slave_pos? It should allow to reclaim the disk space.” (taken from: https://jira.mariadb.org/browse/MDEV-12318).

STEP 7) Remove a corrupted innodb file, which causes database (mysql process) crash leaving your with no database at all and then recreate the table

Restart your mysql server with “skip-slave-start” again to be able to start it (look above).
So a table probably with couple of rows takes 12G and MariaDB is using it for the replication on start up, but the replication position is kept on another place “master.info”, can we delete this offender file “gtid_slave_pos.ibd”? Yes we can, move the file out of its place (mysql datadir) and then create a second table with:

MariaDB [mysql]> use mysql
MariaDB [mysql]> CREATE TABLE `gtid_slave_pos1` (
    ->   `domain_id` int(10) unsigned NOT NULL,
    ->   `sub_id` bigint(20) unsigned NOT NULL,
    ->   `server_id` int(10) unsigned NOT NULL,
    ->   `seq_no` bigint(20) unsigned NOT NULL,
    ->   PRIMARY KEY (`domain_id`,`sub_id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Replication slave GTID position';
Query OK, 0 rows affected (0.01 sec)

Copy gtid_slave_pos1.ibd to gtid_slave_pos.ibd and restart the mysql process, it will report an error for a table mysql.gtid_slave_pos but you will be able to drop the table and then create it with the same name: “gtid_slave_pos” (you could drop the temporary one “gtid_slave_pos1”)

MariaDB [mysql]> use mysql
MariaDB [mysql]> DROP TABLE `gtid_slave_pos`;
Query OK, 0 rows affected (0.01 sec)
MariaDB [mysql]> CREATE TABLE `gtid_slave_pos` (
    ->   `domain_id` int(10) unsigned NOT NULL,
    ->   `sub_id` bigint(20) unsigned NOT NULL,
    ->   `server_id` int(10) unsigned NOT NULL,
    ->   `seq_no` bigint(20) unsigned NOT NULL,
    ->   PRIMARY KEY (`domain_id`,`sub_id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Replication slave GTID position';
Query OK, 0 rows affected (0.01 sec)
MariaDB [mysql]> DROP TABLE `gtid_slave_pos1`;
Query OK, 0 rows affected (0.01 sec)

The create table statement is taken from another server probably it is a good idea to do it yourself, login on a healthy mysql server and issue: “show create table mysql.gtid_slave_pos;”.
So now you have a healthy mysql.gtid_slave_posto be used for your replication. Restart your mysql server removing “skip-slave-start” from your configuration file and here it is the replication is OK and running:

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.10.10.10
                  Master_User: replusr
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.071301
          Read_Master_Log_Pos: 64980976
               Relay_Log_File: mysqld-relay-bin.230012
                Relay_Log_Pos: 80704376
        Relay_Master_Log_File: mysql-bin.090129
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: test
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 80704077
              Relay_Log_Space: 113209377078
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 30944
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 101
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: conservative
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Waiting for room in worker thread event queue
1 row in set (0.00 sec)

Speeding up the receiving of mysql binlog when replication is configured

Have you aver set up a mysql replication between two host with more than 100ms latency between. Let’s say one of your server is in US and the other is in Europe. The latency between the servers could reach more than 100ms and your MySQL replication (or galera cluster, or master-master configuration) which could easily turn into problems when you have really busy servers. In our case we have really busy servers with more than 7G binlog of every minute, which means 7G/minute transfer between the servers with a latency of 100ms and it could get worse your connectivity just hiccup for a minute or two and you would have a replication behind with more than 20 binlog files, which could take up to an hour to advance. There is a good chance to get the binlog files a lot faster – between 5-8 times faster

just compress them!

You can enable compression between your servers so they will transfer the binlog in a compressed format, the compression algo is really fast and almost no CPU consuming you’ll not observe any penalties (even if you enable it between servers on “the same switch”) and your traffic will decrease with almost 5-8 times it depends on your queries but in general it is around such values! So you now 1G binlog could be transferred 5-8 times faster using only 150Mbytes of your connection!
Here is how to enable this option

  1. Enter MySQL console
  2. set the compression to true
  3. verify the compression is set
  4. STOP and then START the slave – the compression won’t happen between the servers if you DO NOT DO this!!!
srv@local ~ # mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 55856
Server version: 10.1.30-MariaDB-1~xenial mariadb.org binary distribution

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> set global slave_compressed_protocol=1;
Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> show global variables like 'slave_compressed_protocol';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| slave_compressed_protocol | ON    |
+---------------------------+-------+
1 row in set (0.00 sec)

MariaDB [(none)]> STOP SLAVE;
Query OK, 0 rows affected (0.15 sec)

MariaDB [(none)]> START SLAVE;
Query OK, 0 rows affected (0.01 sec)

Then check your network:

main menu
Network Statistics

main menu
Network Packets