Several days after another
The slave status was weird, the
- “Relay_Master_Log_File: mysql-bin.999999” and
- “Exec_Master_Log_Pos: 104858214” were not changing,
which is essential for the slave server! In fact without these two values we do not know the real position, which is executed in the slave! And we cannot recover the slave if anything happened.
MariaDB [(none)]> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
Master_Host: srv-master1.local
Master_User: rep-user
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.1007039
Read_Master_Log_Pos: 52820624
Relay_Log_File: mysqld-relay-bin.712675
Relay_Log_Pos: 52807349
Relay_Master_Log_File: mysql-bin.999999
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 104858214
Relay_Log_Space: 7602615098
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 101
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Reading event from the relay log
1 row in set (0.00 sec)
The SHOW SLAVE STATUS showed:
- Replication was definitely running
Slave_IO_Running: Yes Slave_SQL_Running: Yes - The “show processlist;” command showed INSERT transactions executing from the slave workers:
| 920146 | system user | | database1 | Slave_worker | 0 | update | INSERT INTO `mytable1` (`id`, `username`, `key`, `data`, `bytes | 0.000 | | 920149 | system user | | database2 | Slave_worker | 0 | update | INSERT INTO `mytable1` (`id`, `username`, `key`, `data`, `bytes | 0.000 |
- As you know even you use the old style replication based on master bin file name and position in it the mariadb(mysql) server internally uses Global Transaction IDs (GTID) to tracking down the replication and for crash recovery issues! Even it is not enabled it (Using_Gtid: No) still it is engaged in the process!
And you can check it and see how the transaction ID is changing, so really the replication is running even the old style variables are stuck (Relay_Master_Log_File: mysql-bin.999999″ and “Exec_Master_Log_Pos: 104858214):MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-101-78054462996 | +-------------------------+ 1 row in set (0.00 sec) MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-102-44185395354 | +-------------------------+ 1 row in set (0.00 sec) MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-101-78054463301 | +-------------------------+ 1 row in set (0.00 sec)
- No rollback transactions
- The relay bin logs were filing with data and were rotated as usual:
srv-slave2 mysql # ls -altr|grep relay -rw-rw---- 1 mysql mysql 1640 Jan 20 10:06 mysqld-relay-bin.713309 -rw-rw---- 1 mysql mysql 358 Jan 20 10:06 mysqld-relay-bin.713310 -rw-rw---- 1 mysql mysql 78 Jan 20 10:06 mysqld-relay-bin.index -rw-rw---- 1 mysql mysql 1690 Jan 20 10:06 relay-log.info -rw-rw---- 1 mysql mysql 64727863 Jan 20 10:06 mysqld-relay-bin.713311 srv-slave2 mysql # ls -altr|grep relay -rw-rw---- 1 mysql mysql 593 Jan 20 10:06 mysqld-relay-bin.713312 -rw-rw---- 1 mysql mysql 358 Jan 20 10:06 mysqld-relay-bin.713313 -rw-rw---- 1 mysql mysql 78 Jan 20 10:06 mysqld-relay-bin.index -rw-rw---- 1 mysql mysql 1690 Jan 20 10:06 relay-log.info -rw-rw---- 1 mysql mysql 67442357 Jan 20 10:06 mysqld-relay-bin.713314
So to summarize:
The replication was running and the replication was consistent, but we cannot find the real slave’s position, because the variables were wrong! They had never changed after: Relay_Master_Log_File: “mysql-bin.999999” and “Exec_Master_Log_Pos: 104858214”
We tried “STOP/START SLAVE;” with no success, the Relay_Master_Log_File: “mysql-bin.999999” remained and Exec_Master_Log_Pos changed a little bit incremented “104859320”, which was again the last position in the master’s bin-log file:
srv-master1 mysql-binlog # mysqlbinlog mysql-bin.999999|tail -n 10 # at 104858183 #180413 9:51:01 server id 102 end_log_pos 104858214 CRC32 0x634cb3d4 Xid = 238913678385 COMMIT/*!*/; # at 104858214 #180413 9:51:01 server id 101 end_log_pos 104858262 CRC32 0x1af29770 Rotate to mysql-bin.1000000 pos: 4 DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
And the second option to “RESET MASTER”, which indeed would help, as with the problem with relay-bin-log case (here mentioned above) CANNOT BE USED, because we need exactly the two values (Relay_Master_Log_File and Exec_Master_Log_Pos) to recover after the RESET command, which two values are WRONG! If we make “RESET SLAVE” and then “CHANGE MASTER” command with these two values you are going to have replication inconsistency and probably your slave will stop with duplicate entry error.
The solution
Switch the replication to use Global Transaction ID – MariaDB and MySQL.
Here is how:
STOP SLAVE; CHANGE MASTER TO master_use_gtid=slave_pos; START SLAVE;
Executed in the mysql console.
And here is what happened (real output with changed names). You can see how the “SHOW SLAVE STATUS” changed. First there are:
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 0-101-78054700522
and
Relay_Master_Log_File immediately change from the wrong value mysql-bin.999999 to mysql-bin.1007691 and Exec_Master_Log_Pos began changing really fast (not showing only one value as before).
srv-slave2 # mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 920614
Server version: 10.2.17-MariaDB-10.2.17+maria~xenial mariadb.org binary distribution
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos;
+-------------------------+
| @@GLOBAL.gtid_slave_pos |
+-------------------------+
| 0-101-78054463301 |
+-------------------------+
1 row in set (0.00 sec)
MariaDB [(none)]> STOP SLAVE;
Query OK, 0 rows affected (1.68 sec)
MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos;
+-------------------------+
| @@GLOBAL.gtid_slave_pos |
+-------------------------+
| 0-101-78054463614 |
+-------------------------+
1 row in set (0.00 sec)
MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos;
+-------------------------+
| @@GLOBAL.gtid_slave_pos |
+-------------------------+
| 0-101-78054463614 |
+-------------------------+
1 row in set (0.00 sec)
MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos;
+-------------------------+
| @@GLOBAL.gtid_slave_pos |
+-------------------------+
| 0-101-78054463614 |
+-------------------------+
1 row in set (0.01 sec)
MariaDB [(none)]> CHANGE MASTER TO master_use_gtid=slave_pos;
Query OK, 0 rows affected (0.04 sec)
MariaDB [(none)]> START SLAVE;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos;
+-------------------------+
| @@GLOBAL.gtid_slave_pos |
+-------------------------+
| 0-104-78053919845 |
+-------------------------+
1 row in set (0.00 sec)
MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos;
+-------------------------+
| @@GLOBAL.gtid_slave_pos |
+-------------------------+
| 0-101-78054464262 |
+-------------------------+
1 row in set (0.00 sec)
MariaDB [(none)]> SHOW SLAVE STATUS\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: srv-master1.local
Master_User: rep-user
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.1007698
Read_Master_Log_Pos: 57861886
Relay_Log_File: mysqld-relay-bin.000002
Relay_Log_Pos: 9566981
Relay_Master_Log_File: mysql-bin.1007691
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 60692482
Relay_Log_Space: 740756853
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 111
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 101
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 0-101-78054700522
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Waiting for room in worker thread event queue
1 row in set (0.00 sec)
ERROR: No query specified
MariaDB [(none)]> SHOW SLAVE STATUS\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: srv-master1.local
Master_User: rep-user
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.1007704
Read_Master_Log_Pos: 85833415
Relay_Log_File: mysqld-relay-bin.000019
Relay_Log_Pos: 42510046
Relay_Master_Log_File: mysql-bin.1007697
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 42509746
Relay_Log_Space: 924712984
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 112
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 101
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 0-102-44185395356
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Waiting for room in worker thread event queue
1 row in set (0.00 sec)
ERROR: No query specified
MariaDB [(none)]> status
--------------
mysql Ver 15.1 Distrib 10.2.17-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
Connection id: 920614
Current database:
Current user: root@localhost
SSL: Not in use
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server: MariaDB
Server version: 10.2.17-MariaDB-10.2.17+maria~xenial mariadb.org binary distribution
Protocol version: 10
Connection: Localhost via UNIX socket
Server characterset: latin1
Db characterset: latin1
Client characterset: latin1
Conn. characterset: latin1
UNIX socket: /var/run/mysqld/mysqld.sock
Uptime: 40 days 2 hours 23 min 17 sec
Threads: 38 Questions: 69062735474 Slow queries: 0 Opens: 106005039 Flush tables: 1 Open tables: 2000 Queries per second avg: 19933.843
--------------
MariaDB [(none)]