mariadb (MySQL) stopped flushing relay-bin log after mysqld-relay-bin.999999

We have a pretty big MariaDB master-slave setup, which is really under heavy load and suddenly one of our slaves stopped purging relay logs! And soon the free space got less and it was clear we were going to run out of space for the data partition.

Here is the problem we have only 49G left free space for our mysql datadir “/var/lib/mysql-datadir”. You can see the replication is OK and running (it’s not behind – “Seconds_Behind_Master: 0”), the

Relay_Log_File says mysqld-relay-bin.999999

but it is not true! You can see there are tones of more relay logs after “mysqld-relay-bin.999999” – in fact the current MySQL relay binlog file is “mysqld-relay-bin.1071413” (look below we included a listing of the datadir).

And “FLUSH RELAY LOGS” does not help, at all!

It just not freed any byte on the partition and no files were removed after “mysqld-relay-bin.999999” as before.

And for the record the “expire_logs_days” is set to 2 days and we have 5 days of relay binlogs and counting…

Here is the real output with changed names, the important lines were highlighted.

MariaDB [(none)]> status
--------------
mysql  Ver 15.1 Distrib 10.1.30-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2

Connection id:          312192
Current database:
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server:                 MariaDB
Server version:         10.1.30-MariaDB-1~xenial mariadb.org binary distribution
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    latin1
Conn.  characterset:    latin1
UNIX socket:            /var/run/mysqld/mysqld.sock
Uptime:                 83 days 9 hours 34 min 46 sec

Threads: 2  Questions: 84357808218  Slow queries: 0  Opens: 1837378  Flush tables: 1  Open tables: 2000  Queries per second avg: 11707.116
--------------
MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: srv-db2.internal
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.1842317
          Read_Master_Log_Pos: 15124639
               Relay_Log_File: mysqld-relay-bin.999999
                Relay_Log_Pos: 3755
        Relay_Master_Log_File: mysql-bin.1842317
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: test
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 15018314
              Relay_Log_Space: 2646478670996
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 10
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: aggressive
1 row in set (0.00 sec)

MariaDB [(none)]> Bye
srv ~ # df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             63G     0   63G   0% /dev
tmpfs            13G  7.2M   13G   1% /run
/dev/sda4        26G  4.4G   20G  18% /
tmpfs            63G     0   63G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            63G     0   63G   0% /sys/fs/cgroup
/dev/sdd1       2.6T  2.6T   49G  99% /var/lib/mysql-datadir
/dev/sde1       3.6T  2.3T  1.4T  64% /var/lib/mysql-bilog
tmpfs            13G     0   13G   0% /run/user/0
srv mysql-datadir # ls -altr
total 2438925472
drwx------    2 mysql mysql      4096 Oct 23 13:37 performance_schema
-rw-rw----    1 mysql mysql        52 Oct 23 13:38 aria_log_control
-rw-rw----    1 mysql mysql     16384 Oct 23 13:38 aria_log.00000001
-rw-rw----    1 mysql mysql         0 Oct 23 13:39 multi-master.info
.....
.....
.....
drwx------    2 mysql mysql      4096 Jan  9 08:07 database1
drwx------    2 mysql mysql      4096 Jan  9 09:13 database2
drwx------    2 mysql mysql      4096 Jan  9 23:54 database2
-rw-rw----    1 mysql mysql      3755 Jan 10 15:43 mysqld-relay-bin.999999
-rw-rw----    1 mysql mysql       344 Jan 10 15:43 mysqld-relay-bin.1000000
-rw-rw----    1 mysql mysql 104857737 Jan 10 15:43 mysqld-relay-bin.1000001
-rw-rw----    1 mysql mysql       344 Jan 10 15:43 mysqld-relay-bin.1000003
-rw-rw----    1 mysql mysql      2174 Jan 10 15:43 mysqld-relay-bin.1000002
-rw-rw----    1 mysql mysql 104857902 Jan 10 15:43 mysqld-relay-bin.1000004
-rw-rw----    1 mysql mysql       344 Jan 10 15:43 mysqld-relay-bin.1000006
-rw-rw----    1 mysql mysql      1118 Jan 10 15:43 mysqld-relay-bin.1000005
-rw-rw----    1 mysql mysql 104857952 Jan 10 15:44 mysqld-relay-bin.1000007
......
......
......
-rw-rw----    1 mysql mysql       344 Jan 14 23:14 mysqld-relay-bin.1071412
-rw-rw----    1 mysql mysql   1928204 Jan 14 23:14 mysqld-relay-bin.index
drwxr-xr-x 1012 mysql mysql   3457024 Jan 14 23:14 .
-rw-rw----    1 mysql mysql 268435456 Jan 14 23:14 ib_logfile1
-rw-rw----    1 mysql mysql 268435456 Jan 14 23:14 ib_logfile0
-rw-rw----    1 mysql mysql 549453824 Jan 14 23:14 ibdata1
-rw-rw----    1 mysql mysql  30744576 Jan 14 23:14 mysqld-relay-bin.1071413
-rw-rw----    1 mysql mysql      1364 Jan 14 23:14 relay-log.info
-rw-rw----    1 mysql mysql       190 Jan 14 23:14 master.info

srv ~ # mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 313127
Server version: 10.1.30-MariaDB-1~xenial mariadb.org binary distribution

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> FLUSH RELAY LOGS;
Query OK, 0 rows affected (0.07 sec)

MariaDB [(none)]> Bye

srv ~ # df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             63G     0   63G   0% /dev
tmpfs            13G  7.2M   13G   1% /run
/dev/sda4        26G  4.4G   20G  18% /
tmpfs            63G     0   63G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            63G     0   63G   0% /sys/fs/cgroup
/dev/sdd1       2.6T  2.6T   49G  99% /var/lib/mysql-datadir
/dev/sde1       3.6T  2.3T  1.4T  64% /var/log/mysql-binlog
tmpfs            13G     0   13G   0% /run/user/0

The solution

Here are the steps you should do to safety delete the old relay mysql binlog files:

  1. Stop the replication – “STOP SLAVE;
  2. Save the slave status after you stopped the replication – “show slave status\G;
  3. Reset slave, it will just lose the current master info and resets the counters like the naming of the relay mysql logs – “RESET SLAVE;
  4. Change the master binlog name and position with a change master command and the two values extract from (STEP 2) – “CHANGE MASTER TO MASTER_LOG_FILE=’mysql-bin.1842369′, MASTER_LOG_POS=90881016;
    Here is how your change master command is built the “Relay_Master_Log_File” and “Exec_Master_Log_Pos” are from (STEP 2) i.e. the “show slave status” after stopping the slave:

    CHANGE MASTER TO MASTER_LOG_FILE='<Relay_Master_Log_File>', MASTER_LOG_POS=<Exec_Master_Log_Pos>;
    
  5. Start the replication – “START SLAVE

It depends how fast you write it probably can be done in under 10 seconds! So no real delay for the replication.
Here is the real output and how we manage to recover (with couple of more show slave commands to see if everything is OK):

srv ~ # mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 313138
Server version: 10.1.30-MariaDB-1~xenial mariadb.org binary distribution

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> STOP SLAVE;
Query OK, 0 rows affected (0.15 sec)

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: srv-db2.internal
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.1842369
          Read_Master_Log_Pos: 91255064
               Relay_Log_File: mysqld-relay-bin.999999
                Relay_Log_Pos: 3755
        Relay_Master_Log_File: mysql-bin.1842369
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: test
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 90881016
              Relay_Log_Space: 2652007529834
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 10
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: aggressive
1 row in set (0.00 sec)

ERROR: No query specified

MariaDB [(none)]> RESET SLAVE;
Query OK, 0 rows affected (11.25 sec)

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: srv-db2.internal
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: 
          Read_Master_Log_Pos: 4
               Relay_Log_File: 
                Relay_Log_Pos: 3755
        Relay_Master_Log_File: 
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: test
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 2652007529834
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 10
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: aggressive
1 row in set (0.00 sec)

ERROR: No query specified

MariaDB [(none)]> CHANGE MASTER TO
    ->          MASTER_LOG_FILE='mysql-bin.1842369',
    ->          MASTER_LOG_POS=90881016;
Query OK, 0 rows affected (0.01 sec)

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: srv-db2.internal
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.1842369
          Read_Master_Log_Pos: 90881016
               Relay_Log_File: mysqld-relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: mysql-bin.1842369
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: test
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 90881016
              Relay_Log_Space: 249
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 10
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: aggressive
1 row in set (0.00 sec)

ERROR: No query specified

MariaDB [(none)]> START SLAVE;
Query OK, 0 rows affected (0.01 sec)

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: srv-db2.internal
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.1842371
          Read_Master_Log_Pos: 1912119
               Relay_Log_File: mysqld-relay-bin.000004
                Relay_Log_Pos: 31891749
        Relay_Master_Log_File: mysql-bin.1842370
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: test
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 31891460
              Relay_Log_Space: 120749474
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 232
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 10
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: aggressive
1 row in set (0.00 sec)

ERROR: No query specified

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: srv-db2.internal
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.1842372
          Read_Master_Log_Pos: 51142186
               Relay_Log_File: mysqld-relay-bin.000004
                Relay_Log_Pos: 73441730
        Relay_Master_Log_File: mysql-bin.1842370
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: test
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 73441441
              Relay_Log_Space: 274838866
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 229
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 10
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: aggressive
1 row in set (0.00 sec)

ERROR: No query specified

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: srv-db2.internal
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.1842373
          Read_Master_Log_Pos: 8907663
               Relay_Log_File: mysqld-relay-bin.000004
                Relay_Log_Pos: 89442523
        Relay_Master_Log_File: mysql-bin.1842370
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: test
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 89442234
              Relay_Log_Space: 337463692
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 228
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 10
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
      Replicate_Do_Domain_Ids: 
  Replicate_Ignore_Domain_Ids: 
                Parallel_Mode: aggressive
1 row in set (0.00 sec)

ERROR: No query specified


MariaDB [(none)]> Bye
srv ~ # df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             63G     0   63G   0% /dev
tmpfs            13G  7.2M   13G   1% /run
/dev/sda4        26G  4.4G   20G  18% /
tmpfs            63G     0   63G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            63G     0   63G   0% /sys/fs/cgroup
/dev/sdd1       2.6T  111G  2.5T   5% /var/lib/mysql-datadir
/dev/sde1       3.6T  2.3T  1.4T  64% /var/lib/mysql-bilog
tmpfs            13G     0   13G   0% /run/user/0

There is bug was reported

The bug was reported on 2015-05-11 and at present is “Confirmed” and “Unresolved“, here is a link to the bug report https://jira.mariadb.org/browse/MDEV-8134 No solution or workaround is provided at the moment of writing this article!

One thought on “mariadb (MySQL) stopped flushing relay-bin log after mysqld-relay-bin.999999”

Leave a Reply

Your email address will not be published. Required fields are marked *