We have a pretty big MariaDB master-slave setup, which is really under heavy load and suddenly one of our slaves stopped purging relay logs! And soon the free space got less and it was clear we were going to run out of space for the data partition.
Here is the problem we have only 49G left free space for our mysql datadir “/var/lib/mysql-datadir”. You can see the replication is OK and running (it’s not behind – “Seconds_Behind_Master: 0”), the
Relay_Log_File says mysqld-relay-bin.999999
but it is not true! You can see there are tones of more relay logs after “mysqld-relay-bin.999999” – in fact the current MySQL relay binlog file is “mysqld-relay-bin.1071413” (look below we included a listing of the datadir).
And “FLUSH RELAY LOGS” does not help, at all!
It just not freed any byte on the partition and no files were removed after “mysqld-relay-bin.999999” as before.
And for the record the “expire_logs_days” is set to 2 days and we have 5 days of relay binlogs and counting…
Here is the real output with changed names, the important lines were highlighted.
MariaDB [(none)]> status -------------- mysql Ver 15.1 Distrib 10.1.30-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2 Connection id: 312192 Current database: Current user: root@localhost SSL: Not in use Current pager: stdout Using outfile: '' Using delimiter: ; Server: MariaDB Server version: 10.1.30-MariaDB-1~xenial mariadb.org binary distribution Protocol version: 10 Connection: Localhost via UNIX socket Server characterset: latin1 Db characterset: latin1 Client characterset: latin1 Conn. characterset: latin1 UNIX socket: /var/run/mysqld/mysqld.sock Uptime: 83 days 9 hours 34 min 46 sec Threads: 2 Questions: 84357808218 Slow queries: 0 Opens: 1837378 Flush tables: 1 Open tables: 2000 Queries per second avg: 11707.116 -------------- MariaDB [(none)]> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: srv-db2.internal Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.1842317 Read_Master_Log_Pos: 15124639 Relay_Log_File: mysqld-relay-bin.999999 Relay_Log_Pos: 3755 Relay_Master_Log_File: mysql-bin.1842317 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: test Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 15018314 Relay_Log_Space: 2646478670996 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: aggressive 1 row in set (0.00 sec) MariaDB [(none)]> Bye srv ~ # df -h Filesystem Size Used Avail Use% Mounted on udev 63G 0 63G 0% /dev tmpfs 13G 7.2M 13G 1% /run /dev/sda4 26G 4.4G 20G 18% / tmpfs 63G 0 63G 0% /dev/shm tmpfs 5.0M 0 5.0M 0% /run/lock tmpfs 63G 0 63G 0% /sys/fs/cgroup /dev/sdd1 2.6T 2.6T 49G 99% /var/lib/mysql-datadir /dev/sde1 3.6T 2.3T 1.4T 64% /var/lib/mysql-bilog tmpfs 13G 0 13G 0% /run/user/0 srv mysql-datadir # ls -altr total 2438925472 drwx------ 2 mysql mysql 4096 Oct 23 13:37 performance_schema -rw-rw---- 1 mysql mysql 52 Oct 23 13:38 aria_log_control -rw-rw---- 1 mysql mysql 16384 Oct 23 13:38 aria_log.00000001 -rw-rw---- 1 mysql mysql 0 Oct 23 13:39 multi-master.info ..... ..... ..... drwx------ 2 mysql mysql 4096 Jan 9 08:07 database1 drwx------ 2 mysql mysql 4096 Jan 9 09:13 database2 drwx------ 2 mysql mysql 4096 Jan 9 23:54 database2 -rw-rw---- 1 mysql mysql 3755 Jan 10 15:43 mysqld-relay-bin.999999 -rw-rw---- 1 mysql mysql 344 Jan 10 15:43 mysqld-relay-bin.1000000 -rw-rw---- 1 mysql mysql 104857737 Jan 10 15:43 mysqld-relay-bin.1000001 -rw-rw---- 1 mysql mysql 344 Jan 10 15:43 mysqld-relay-bin.1000003 -rw-rw---- 1 mysql mysql 2174 Jan 10 15:43 mysqld-relay-bin.1000002 -rw-rw---- 1 mysql mysql 104857902 Jan 10 15:43 mysqld-relay-bin.1000004 -rw-rw---- 1 mysql mysql 344 Jan 10 15:43 mysqld-relay-bin.1000006 -rw-rw---- 1 mysql mysql 1118 Jan 10 15:43 mysqld-relay-bin.1000005 -rw-rw---- 1 mysql mysql 104857952 Jan 10 15:44 mysqld-relay-bin.1000007 ...... ...... ...... -rw-rw---- 1 mysql mysql 344 Jan 14 23:14 mysqld-relay-bin.1071412 -rw-rw---- 1 mysql mysql 1928204 Jan 14 23:14 mysqld-relay-bin.index drwxr-xr-x 1012 mysql mysql 3457024 Jan 14 23:14 . -rw-rw---- 1 mysql mysql 268435456 Jan 14 23:14 ib_logfile1 -rw-rw---- 1 mysql mysql 268435456 Jan 14 23:14 ib_logfile0 -rw-rw---- 1 mysql mysql 549453824 Jan 14 23:14 ibdata1 -rw-rw---- 1 mysql mysql 30744576 Jan 14 23:14 mysqld-relay-bin.1071413 -rw-rw---- 1 mysql mysql 1364 Jan 14 23:14 relay-log.info -rw-rw---- 1 mysql mysql 190 Jan 14 23:14 master.info srv ~ # mysql Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 313127 Server version: 10.1.30-MariaDB-1~xenial mariadb.org binary distribution Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MariaDB [(none)]> FLUSH RELAY LOGS; Query OK, 0 rows affected (0.07 sec) MariaDB [(none)]> Bye srv ~ # df -h Filesystem Size Used Avail Use% Mounted on udev 63G 0 63G 0% /dev tmpfs 13G 7.2M 13G 1% /run /dev/sda4 26G 4.4G 20G 18% / tmpfs 63G 0 63G 0% /dev/shm tmpfs 5.0M 0 5.0M 0% /run/lock tmpfs 63G 0 63G 0% /sys/fs/cgroup /dev/sdd1 2.6T 2.6T 49G 99% /var/lib/mysql-datadir /dev/sde1 3.6T 2.3T 1.4T 64% /var/log/mysql-binlog tmpfs 13G 0 13G 0% /run/user/0
The solution
Here are the steps you should do to safety delete the old relay mysql binlog files:
- Stop the replication – “STOP SLAVE;“
- Save the slave status after you stopped the replication – “show slave status\G;“
- Reset slave, it will just lose the current master info and resets the counters like the naming of the relay mysql logs – “RESET SLAVE;“
- Change the master binlog name and position with a change master command and the two values extract from (STEP 2) – “CHANGE MASTER TO MASTER_LOG_FILE=’mysql-bin.1842369′, MASTER_LOG_POS=90881016;”
Here is how your change master command is built the “Relay_Master_Log_File” and “Exec_Master_Log_Pos” are from (STEP 2) i.e. the “show slave status” after stopping the slave:CHANGE MASTER TO MASTER_LOG_FILE='<Relay_Master_Log_File>', MASTER_LOG_POS=<Exec_Master_Log_Pos>;
- Start the replication – “START SLAVE“
It depends how fast you write it probably can be done in under 10 seconds! So no real delay for the replication.
Here is the real output and how we manage to recover (with couple of more show slave commands to see if everything is OK):
srv ~ # mysql Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 313138 Server version: 10.1.30-MariaDB-1~xenial mariadb.org binary distribution Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MariaDB [(none)]> STOP SLAVE; Query OK, 0 rows affected (0.15 sec) MariaDB [(none)]> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Master_Host: srv-db2.internal Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.1842369 Read_Master_Log_Pos: 91255064 Relay_Log_File: mysqld-relay-bin.999999 Relay_Log_Pos: 3755 Relay_Master_Log_File: mysql-bin.1842369 Slave_IO_Running: No Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: test Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 90881016 Relay_Log_Space: 2652007529834 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: aggressive 1 row in set (0.00 sec) ERROR: No query specified MariaDB [(none)]> RESET SLAVE; Query OK, 0 rows affected (11.25 sec) MariaDB [(none)]> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Master_Host: srv-db2.internal Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: Read_Master_Log_Pos: 4 Relay_Log_File: Relay_Log_Pos: 3755 Relay_Master_Log_File: Slave_IO_Running: No Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: test Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 0 Relay_Log_Space: 2652007529834 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: aggressive 1 row in set (0.00 sec) ERROR: No query specified MariaDB [(none)]> CHANGE MASTER TO -> MASTER_LOG_FILE='mysql-bin.1842369', -> MASTER_LOG_POS=90881016; Query OK, 0 rows affected (0.01 sec) MariaDB [(none)]> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Master_Host: srv-db2.internal Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.1842369 Read_Master_Log_Pos: 90881016 Relay_Log_File: mysqld-relay-bin.000001 Relay_Log_Pos: 4 Relay_Master_Log_File: mysql-bin.1842369 Slave_IO_Running: No Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: test Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 90881016 Relay_Log_Space: 249 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: aggressive 1 row in set (0.00 sec) ERROR: No query specified MariaDB [(none)]> START SLAVE; Query OK, 0 rows affected (0.01 sec) MariaDB [(none)]> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Queueing master event to the relay log Master_Host: srv-db2.internal Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.1842371 Read_Master_Log_Pos: 1912119 Relay_Log_File: mysqld-relay-bin.000004 Relay_Log_Pos: 31891749 Relay_Master_Log_File: mysql-bin.1842370 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: test Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 31891460 Relay_Log_Space: 120749474 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 232 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: aggressive 1 row in set (0.00 sec) ERROR: No query specified MariaDB [(none)]> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Queueing master event to the relay log Master_Host: srv-db2.internal Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.1842372 Read_Master_Log_Pos: 51142186 Relay_Log_File: mysqld-relay-bin.000004 Relay_Log_Pos: 73441730 Relay_Master_Log_File: mysql-bin.1842370 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: test Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 73441441 Relay_Log_Space: 274838866 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 229 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: aggressive 1 row in set (0.00 sec) ERROR: No query specified MariaDB [(none)]> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Queueing master event to the relay log Master_Host: srv-db2.internal Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.1842373 Read_Master_Log_Pos: 8907663 Relay_Log_File: mysqld-relay-bin.000004 Relay_Log_Pos: 89442523 Relay_Master_Log_File: mysql-bin.1842370 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: test Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 89442234 Relay_Log_Space: 337463692 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 228 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: aggressive 1 row in set (0.00 sec) ERROR: No query specified MariaDB [(none)]> Bye srv ~ # df -h Filesystem Size Used Avail Use% Mounted on udev 63G 0 63G 0% /dev tmpfs 13G 7.2M 13G 1% /run /dev/sda4 26G 4.4G 20G 18% / tmpfs 63G 0 63G 0% /dev/shm tmpfs 5.0M 0 5.0M 0% /run/lock tmpfs 63G 0 63G 0% /sys/fs/cgroup /dev/sdd1 2.6T 111G 2.5T 5% /var/lib/mysql-datadir /dev/sde1 3.6T 2.3T 1.4T 64% /var/lib/mysql-bilog tmpfs 13G 0 13G 0% /run/user/0
There is bug was reported
The bug was reported on 2015-05-11 and at present is “Confirmed” and “Unresolved“, here is a link to the bug report https://jira.mariadb.org/browse/MDEV-8134 No solution or workaround is provided at the moment of writing this article!
One thought on “mariadb (MySQL) stopped flushing relay-bin log after mysqld-relay-bin.999999”