Several days after another
The slave status was weird, the
- “Relay_Master_Log_File: mysql-bin.999999” and
- “Exec_Master_Log_Pos: 104858214” were not changing,
which is essential for the slave server! In fact without these two values we do not know the real position, which is executed in the slave! And we cannot recover the slave if anything happened.
MariaDB [(none)]> show slave status \G; *************************** 1. row *************************** Slave_IO_State: Queueing master event to the relay log Master_Host: srv-master1.local Master_User: rep-user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.1007039 Read_Master_Log_Pos: 52820624 Relay_Log_File: mysqld-relay-bin.712675 Relay_Log_Pos: 52807349 Relay_Master_Log_File: mysql-bin.999999 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: test Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 104858214 Relay_Log_Space: 7602615098 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 101 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: aggressive SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Reading event from the relay log 1 row in set (0.00 sec)
The SHOW SLAVE STATUS showed:
- Replication was definitely running
Slave_IO_Running: Yes Slave_SQL_Running: Yes
- The “show processlist;” command showed INSERT transactions executing from the slave workers:
| 920146 | system user | | database1 | Slave_worker | 0 | update | INSERT INTO `mytable1` (`id`, `username`, `key`, `data`, `bytes | 0.000 | | 920149 | system user | | database2 | Slave_worker | 0 | update | INSERT INTO `mytable1` (`id`, `username`, `key`, `data`, `bytes | 0.000 |
- As you know even you use the old style replication based on master bin file name and position in it the mariadb(mysql) server internally uses Global Transaction IDs (GTID) to tracking down the replication and for crash recovery issues! Even it is not enabled it (Using_Gtid: No) still it is engaged in the process!
And you can check it and see how the transaction ID is changing, so really the replication is running even the old style variables are stuck (Relay_Master_Log_File: mysql-bin.999999″ and “Exec_Master_Log_Pos: 104858214):MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-101-78054462996 | +-------------------------+ 1 row in set (0.00 sec) MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-102-44185395354 | +-------------------------+ 1 row in set (0.00 sec) MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-101-78054463301 | +-------------------------+ 1 row in set (0.00 sec)
- No rollback transactions
- The relay bin logs were filing with data and were rotated as usual:
srv-slave2 mysql # ls -altr|grep relay -rw-rw---- 1 mysql mysql 1640 Jan 20 10:06 mysqld-relay-bin.713309 -rw-rw---- 1 mysql mysql 358 Jan 20 10:06 mysqld-relay-bin.713310 -rw-rw---- 1 mysql mysql 78 Jan 20 10:06 mysqld-relay-bin.index -rw-rw---- 1 mysql mysql 1690 Jan 20 10:06 relay-log.info -rw-rw---- 1 mysql mysql 64727863 Jan 20 10:06 mysqld-relay-bin.713311 srv-slave2 mysql # ls -altr|grep relay -rw-rw---- 1 mysql mysql 593 Jan 20 10:06 mysqld-relay-bin.713312 -rw-rw---- 1 mysql mysql 358 Jan 20 10:06 mysqld-relay-bin.713313 -rw-rw---- 1 mysql mysql 78 Jan 20 10:06 mysqld-relay-bin.index -rw-rw---- 1 mysql mysql 1690 Jan 20 10:06 relay-log.info -rw-rw---- 1 mysql mysql 67442357 Jan 20 10:06 mysqld-relay-bin.713314
So to summarize:
The replication was running and the replication was consistent, but we cannot find the real slave’s position, because the variables were wrong! They had never changed after: Relay_Master_Log_File: “mysql-bin.999999” and “Exec_Master_Log_Pos: 104858214”
We tried “STOP/START SLAVE;” with no success, the Relay_Master_Log_File: “mysql-bin.999999” remained and Exec_Master_Log_Pos changed a little bit incremented “104859320”, which was again the last position in the master’s bin-log file:
srv-master1 mysql-binlog # mysqlbinlog mysql-bin.999999|tail -n 10 # at 104858183 #180413 9:51:01 server id 102 end_log_pos 104858214 CRC32 0x634cb3d4 Xid = 238913678385 COMMIT/*!*/; # at 104858214 #180413 9:51:01 server id 101 end_log_pos 104858262 CRC32 0x1af29770 Rotate to mysql-bin.1000000 pos: 4 DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
And the second option to “RESET MASTER”, which indeed would help, as with the problem with relay-bin-log case (here mentioned above) CANNOT BE USED, because we need exactly the two values (Relay_Master_Log_File and Exec_Master_Log_Pos) to recover after the RESET command, which two values are WRONG! If we make “RESET SLAVE” and then “CHANGE MASTER” command with these two values you are going to have replication inconsistency and probably your slave will stop with duplicate entry error.
The solution
Switch the replication to use Global Transaction ID – MariaDB and MySQL.
Here is how:
STOP SLAVE; CHANGE MASTER TO master_use_gtid=slave_pos; START SLAVE;
Executed in the mysql console.
And here is what happened (real output with changed names). You can see how the “SHOW SLAVE STATUS” changed. First there are:
Using_Gtid: Slave_Pos Gtid_IO_Pos: 0-101-78054700522
and
Relay_Master_Log_File immediately change from the wrong value mysql-bin.999999 to mysql-bin.1007691 and Exec_Master_Log_Pos began changing really fast (not showing only one value as before).
srv-slave2 # mysql Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 920614 Server version: 10.2.17-MariaDB-10.2.17+maria~xenial mariadb.org binary distribution Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-101-78054463301 | +-------------------------+ 1 row in set (0.00 sec) MariaDB [(none)]> STOP SLAVE; Query OK, 0 rows affected (1.68 sec) MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-101-78054463614 | +-------------------------+ 1 row in set (0.00 sec) MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-101-78054463614 | +-------------------------+ 1 row in set (0.00 sec) MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-101-78054463614 | +-------------------------+ 1 row in set (0.01 sec) MariaDB [(none)]> CHANGE MASTER TO master_use_gtid=slave_pos; Query OK, 0 rows affected (0.04 sec) MariaDB [(none)]> START SLAVE; Query OK, 0 rows affected (0.00 sec) MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-104-78053919845 | +-------------------------+ 1 row in set (0.00 sec) MariaDB [(none)]> SELECT @@GLOBAL.gtid_slave_pos; +-------------------------+ | @@GLOBAL.gtid_slave_pos | +-------------------------+ | 0-101-78054464262 | +-------------------------+ 1 row in set (0.00 sec) MariaDB [(none)]> SHOW SLAVE STATUS\G; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: srv-master1.local Master_User: rep-user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.1007698 Read_Master_Log_Pos: 57861886 Relay_Log_File: mysqld-relay-bin.000002 Relay_Log_Pos: 9566981 Relay_Master_Log_File: mysql-bin.1007691 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: test Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 60692482 Relay_Log_Space: 740756853 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 111 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 101 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: Slave_Pos Gtid_IO_Pos: 0-101-78054700522 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: aggressive SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Waiting for room in worker thread event queue 1 row in set (0.00 sec) ERROR: No query specified MariaDB [(none)]> SHOW SLAVE STATUS\G; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: srv-master1.local Master_User: rep-user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.1007704 Read_Master_Log_Pos: 85833415 Relay_Log_File: mysqld-relay-bin.000019 Relay_Log_Pos: 42510046 Relay_Master_Log_File: mysql-bin.1007697 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: test Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 42509746 Relay_Log_Space: 924712984 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 112 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 101 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: Slave_Pos Gtid_IO_Pos: 0-102-44185395356 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: aggressive SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Waiting for room in worker thread event queue 1 row in set (0.00 sec) ERROR: No query specified MariaDB [(none)]> status -------------- mysql Ver 15.1 Distrib 10.2.17-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2 Connection id: 920614 Current database: Current user: root@localhost SSL: Not in use Current pager: stdout Using outfile: '' Using delimiter: ; Server: MariaDB Server version: 10.2.17-MariaDB-10.2.17+maria~xenial mariadb.org binary distribution Protocol version: 10 Connection: Localhost via UNIX socket Server characterset: latin1 Db characterset: latin1 Client characterset: latin1 Conn. characterset: latin1 UNIX socket: /var/run/mysqld/mysqld.sock Uptime: 40 days 2 hours 23 min 17 sec Threads: 38 Questions: 69062735474 Slow queries: 0 Opens: 106005039 Flush tables: 1 Open tables: 2000 Queries per second avg: 19933.843 -------------- MariaDB [(none)]