We have a pretty big MariaDB master-slave setup, which is really under heavy load and suddenly one of our slaves stopped purging relay logs! And soon the free space got less and it was clear we were going to run out of space for the data partition.
Here is the problem we have only 49G left free space for our mysql datadir “/var/lib/mysql-datadir”. You can see the replication is OK and running (it’s not behind – “Seconds_Behind_Master: 0”), the
Relay_Log_File says mysqld-relay-bin.999999
but it is not true! You can see there are tones of more relay logs after “mysqld-relay-bin.999999” – in fact the current MySQL relay binlog file is “mysqld-relay-bin.1071413” (look below we included a listing of the datadir).
And “FLUSH RELAY LOGS” does not help, at all!
It just not freed any byte on the partition and no files were removed after “mysqld-relay-bin.999999” as before.
And for the record the “expire_logs_days” is set to 2 days and we have 5 days of relay binlogs and counting…
Here is the real output with changed names, the important lines were highlighted.
MariaDB [(none)]> status
--------------
mysql Ver 15.1 Distrib 10.1.30-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
Connection id: 312192
Current database:
Current user: root@localhost
SSL: Not in use
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server: MariaDB
Server version: 10.1.30-MariaDB-1~xenial mariadb.org binary distribution
Protocol version: 10
Connection: Localhost via UNIX socket
Server characterset: latin1
Db characterset: latin1
Client characterset: latin1
Conn. characterset: latin1
UNIX socket: /var/run/mysqld/mysqld.sock
Uptime: 83 days 9 hours 34 min 46 sec
Threads: 2 Questions: 84357808218 Slow queries: 0 Opens: 1837378 Flush tables: 1 Open tables: 2000 Queries per second avg: 11707.116
--------------
MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: srv-db2.internal
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.1842317
Read_Master_Log_Pos: 15124639
Relay_Log_File: mysqld-relay-bin.999999
Relay_Log_Pos: 3755
Relay_Master_Log_File: mysql-bin.1842317
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 15018314
Relay_Log_Space: 2646478670996
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
1 row in set (0.00 sec)
MariaDB [(none)]> Bye
srv ~ # df -h
Filesystem Size Used Avail Use% Mounted on
udev 63G 0 63G 0% /dev
tmpfs 13G 7.2M 13G 1% /run
/dev/sda4 26G 4.4G 20G 18% /
tmpfs 63G 0 63G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 63G 0 63G 0% /sys/fs/cgroup
/dev/sdd1 2.6T 2.6T 49G 99% /var/lib/mysql-datadir
/dev/sde1 3.6T 2.3T 1.4T 64% /var/lib/mysql-bilog
tmpfs 13G 0 13G 0% /run/user/0
srv mysql-datadir # ls -altr
total 2438925472
drwx------ 2 mysql mysql 4096 Oct 23 13:37 performance_schema
-rw-rw---- 1 mysql mysql 52 Oct 23 13:38 aria_log_control
-rw-rw---- 1 mysql mysql 16384 Oct 23 13:38 aria_log.00000001
-rw-rw---- 1 mysql mysql 0 Oct 23 13:39 multi-master.info
.....
.....
.....
drwx------ 2 mysql mysql 4096 Jan 9 08:07 database1
drwx------ 2 mysql mysql 4096 Jan 9 09:13 database2
drwx------ 2 mysql mysql 4096 Jan 9 23:54 database2
-rw-rw---- 1 mysql mysql 3755 Jan 10 15:43 mysqld-relay-bin.999999
-rw-rw---- 1 mysql mysql 344 Jan 10 15:43 mysqld-relay-bin.1000000
-rw-rw---- 1 mysql mysql 104857737 Jan 10 15:43 mysqld-relay-bin.1000001
-rw-rw---- 1 mysql mysql 344 Jan 10 15:43 mysqld-relay-bin.1000003
-rw-rw---- 1 mysql mysql 2174 Jan 10 15:43 mysqld-relay-bin.1000002
-rw-rw---- 1 mysql mysql 104857902 Jan 10 15:43 mysqld-relay-bin.1000004
-rw-rw---- 1 mysql mysql 344 Jan 10 15:43 mysqld-relay-bin.1000006
-rw-rw---- 1 mysql mysql 1118 Jan 10 15:43 mysqld-relay-bin.1000005
-rw-rw---- 1 mysql mysql 104857952 Jan 10 15:44 mysqld-relay-bin.1000007
......
......
......
-rw-rw---- 1 mysql mysql 344 Jan 14 23:14 mysqld-relay-bin.1071412
-rw-rw---- 1 mysql mysql 1928204 Jan 14 23:14 mysqld-relay-bin.index
drwxr-xr-x 1012 mysql mysql 3457024 Jan 14 23:14 .
-rw-rw---- 1 mysql mysql 268435456 Jan 14 23:14 ib_logfile1
-rw-rw---- 1 mysql mysql 268435456 Jan 14 23:14 ib_logfile0
-rw-rw---- 1 mysql mysql 549453824 Jan 14 23:14 ibdata1
-rw-rw---- 1 mysql mysql 30744576 Jan 14 23:14 mysqld-relay-bin.1071413
-rw-rw---- 1 mysql mysql 1364 Jan 14 23:14 relay-log.info
-rw-rw---- 1 mysql mysql 190 Jan 14 23:14 master.info
srv ~ # mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 313127
Server version: 10.1.30-MariaDB-1~xenial mariadb.org binary distribution
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> FLUSH RELAY LOGS;
Query OK, 0 rows affected (0.07 sec)
MariaDB [(none)]> Bye
srv ~ # df -h
Filesystem Size Used Avail Use% Mounted on
udev 63G 0 63G 0% /dev
tmpfs 13G 7.2M 13G 1% /run
/dev/sda4 26G 4.4G 20G 18% /
tmpfs 63G 0 63G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 63G 0 63G 0% /sys/fs/cgroup
/dev/sdd1 2.6T 2.6T 49G 99% /var/lib/mysql-datadir
/dev/sde1 3.6T 2.3T 1.4T 64% /var/log/mysql-binlog
tmpfs 13G 0 13G 0% /run/user/0
The solution
Here are the steps you should do to safety delete the old relay mysql binlog files:
- Stop the replication – “STOP SLAVE;“
- Save the slave status after you stopped the replication – “show slave status\G;“
- Reset slave, it will just lose the current master info and resets the counters like the naming of the relay mysql logs – “RESET SLAVE;“
- Change the master binlog name and position with a change master command and the two values extract from (STEP 2) – “CHANGE MASTER TO MASTER_LOG_FILE=’mysql-bin.1842369′, MASTER_LOG_POS=90881016;”
Here is how your change master command is built the “Relay_Master_Log_File” and “Exec_Master_Log_Pos” are from (STEP 2) i.e. the “show slave status” after stopping the slave:CHANGE MASTER TO MASTER_LOG_FILE='<Relay_Master_Log_File>', MASTER_LOG_POS=<Exec_Master_Log_Pos>;
- Start the replication – “START SLAVE“
It depends how fast you write it probably can be done in under 10 seconds! So no real delay for the replication.
Here is the real output and how we manage to recover (with couple of more show slave commands to see if everything is OK):
srv ~ # mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 313138
Server version: 10.1.30-MariaDB-1~xenial mariadb.org binary distribution
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> STOP SLAVE;
Query OK, 0 rows affected (0.15 sec)
MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: srv-db2.internal
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.1842369
Read_Master_Log_Pos: 91255064
Relay_Log_File: mysqld-relay-bin.999999
Relay_Log_Pos: 3755
Relay_Master_Log_File: mysql-bin.1842369
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 90881016
Relay_Log_Space: 2652007529834
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
1 row in set (0.00 sec)
ERROR: No query specified
MariaDB [(none)]> RESET SLAVE;
Query OK, 0 rows affected (11.25 sec)
MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: srv-db2.internal
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File:
Read_Master_Log_Pos: 4
Relay_Log_File:
Relay_Log_Pos: 3755
Relay_Master_Log_File:
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 0
Relay_Log_Space: 2652007529834
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
1 row in set (0.00 sec)
ERROR: No query specified
MariaDB [(none)]> CHANGE MASTER TO
-> MASTER_LOG_FILE='mysql-bin.1842369',
-> MASTER_LOG_POS=90881016;
Query OK, 0 rows affected (0.01 sec)
MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: srv-db2.internal
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.1842369
Read_Master_Log_Pos: 90881016
Relay_Log_File: mysqld-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.1842369
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 90881016
Relay_Log_Space: 249
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
1 row in set (0.00 sec)
ERROR: No query specified
MariaDB [(none)]> START SLAVE;
Query OK, 0 rows affected (0.01 sec)
MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
Master_Host: srv-db2.internal
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.1842371
Read_Master_Log_Pos: 1912119
Relay_Log_File: mysqld-relay-bin.000004
Relay_Log_Pos: 31891749
Relay_Master_Log_File: mysql-bin.1842370
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 31891460
Relay_Log_Space: 120749474
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 232
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
1 row in set (0.00 sec)
ERROR: No query specified
MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
Master_Host: srv-db2.internal
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.1842372
Read_Master_Log_Pos: 51142186
Relay_Log_File: mysqld-relay-bin.000004
Relay_Log_Pos: 73441730
Relay_Master_Log_File: mysql-bin.1842370
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 73441441
Relay_Log_Space: 274838866
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 229
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
1 row in set (0.00 sec)
ERROR: No query specified
MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
Master_Host: srv-db2.internal
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.1842373
Read_Master_Log_Pos: 8907663
Relay_Log_File: mysqld-relay-bin.000004
Relay_Log_Pos: 89442523
Relay_Master_Log_File: mysql-bin.1842370
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 89442234
Relay_Log_Space: 337463692
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 228
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
1 row in set (0.00 sec)
ERROR: No query specified
MariaDB [(none)]> Bye
srv ~ # df -h
Filesystem Size Used Avail Use% Mounted on
udev 63G 0 63G 0% /dev
tmpfs 13G 7.2M 13G 1% /run
/dev/sda4 26G 4.4G 20G 18% /
tmpfs 63G 0 63G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 63G 0 63G 0% /sys/fs/cgroup
/dev/sdd1 2.6T 111G 2.5T 5% /var/lib/mysql-datadir
/dev/sde1 3.6T 2.3T 1.4T 64% /var/lib/mysql-bilog
tmpfs 13G 0 13G 0% /run/user/0
There is bug was reported
The bug was reported on 2015-05-11 and at present is “Confirmed” and “Unresolved“, here is a link to the bug report https://jira.mariadb.org/browse/MDEV-8134 No solution or workaround is provided at the moment of writing this article!
One thought on “mariadb (MySQL) stopped flushing relay-bin log after mysqld-relay-bin.999999”