Recover MySQL InnoDB Cluster and Dba.rebootClusterFromCompleteOutage: Argument #2: Invalid options: primary (ArgumentError)

main menu
MySQL 8.0.28 version

Recent version of MySQL 8 implemented more options to the rebootClusterFromCompleteOutage function! Definitely check the link’s manual above and most of the handy second options are implemented in MySQL 8.0.30, so the user’s MySQL InnoDB Cluster crashed and if rebootClusterFromCompleteOutage should be used, but it outputs an error sort of:

 MySQL  db-cluster-1:33060+ ssl  JS > var cluster = dba.rebootClusterFromCompleteOutage()
Restoring the cluster 'mycluster1' from complete outage...

Dba.rebootClusterFromCompleteOutage: Target member is in state ERROR (RuntimeError)

And when trying to use the node, which was healthy before the crash with this function, there is an error, too:

 MySQL  db-cluster-1:33060+ ssl  JS > var cluster = dba.rebootClusterFromCompleteOutage("mycluster1", {primary: "db-cluster-1:3306"});
Dba.rebootClusterFromCompleteOutage: Argument #2: Invalid options: primary (ArgumentError)

So no cluster is available and the database and its data is inaccessible.
Indeed, the initial state of the cluster was really bad and before the restart, the two of three servers were missing or in bad state.

[root@db-cluster-1 ~]# mysqlsh
MySQL Shell 8.0.28

Copyright (c) 2016, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
 MySQL  JS > \connect clusteradmin@db-cluster-1
Creating a session to 'clusteradmin@db-cluster-1'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 241708346 (X protocol)
Server version: 8.0.28 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
 MySQL  db-cluster-1:33060+ ssl  JS > var cluster = dba.getCluster()
 MySQL  db-cluster-1:33060+ ssl  JS > cluster.status()
{
    "clusterName": "mycluster1", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "db-cluster-1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active.", 
        "topology": {
            "db-cluster-1:3306": {
                "address": "db-cluster-1:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-2:3306": {
                "address": "db-cluster-2:3306", 
                "instanceErrors": [
                    "NOTE: group_replication is stopped."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "OFFLINE", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "8.0.28"
            }, 
            "db-cluster-3:3306": {
                "address": "db-cluster-3:3306", 
                "instanceErrors": [
                    "ERROR: GR Recovery channel receiver stopped with an error: error connecting to master 'mysql_innodb_cluster_2324239842@db-cluster-1:3306' - retry-time: 60 retries: 1 message: Access denied for user 'mysql_innodb_cluster_2324239842'@'10.10.10.11' (using password: YES) (1045) at 2023-09-19 04:37:00.076960", 
                    "ERROR: group_replication has stopped with an error."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "ERROR", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "8.0.28"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "db-cluster-1:3306"
}

The problem here is the MySQL version is 8.0.28, but after MySQL 8.0.30 there are much more features, which can be used in the second argument of rebootClusterFromCompleteOutage including, which server should be considered primary therefore healthy. In fact, the updated rebootClusterFromCompleteOutage of MySQL 8.0.34 version even auto-detected the correct and healthy node and booted the MySQL InnoDB Cluster.
There were no problems with the update from MySQL 8.0.28 to MySQL 8.0.34 and after the MySQL 8.0.34 started, the rebootClusterFromCompleteOutage reconfigured and started the cluster with the right and healthy server auto-detected. In fact, it is safer to use the second argument and set the option, which is the healthy server “{primary: “db-cluster-1:3306″}”.
Keep on reading!

Switch to a new master (primary) in MySQL InnoDB Cluster 8

Switching to a new master (or new primary if to use the new naming) in a MySQL 8 InnoDB Cluster is simple with the MySQL Shell console and the function of the cluster variable – setPrimaryInstance.

main menu
MySQL Shell with setPrimaryInstance

Why would someone need to do it manually? One of the reasons may be because one of the nodes is on the same physical server and thus suppose a smaller latency.

First, get a cluster object of the cluster by connecting to the cluster API with MySQL Shell:

[root@db-cluster-1 ~]# mysqlsh
MySQL Shell 8.0.28

Copyright (c) 2016, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
 MySQL  JS > \connect clusteradmin@db-cluster-1
Creating a session to 'clusteradmin@db-cluster-1'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 166928419 (X protocol)
Server version: 8.0.28 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
 MySQL  db-cluster-1:33060+ ssl  JS > var cluster = dba.getCluster()

Second, show the status of the cluster to get the cluster topology and the exact nodes’ names, which will use as an argument of the setPrimaryInstance. Still, in the MySQL Shell Console:

 MySQL  db-cluster-1:33060+ ssl  JS > cluster.status()
{
    "clusterName": "mycluster1", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "db-cluster-2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "db-cluster-1:3306": {
                "address": "db-cluster-1:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-2:3306": {
                "address": "db-cluster-2:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-3:3306": {
                "address": "db-cluster-3:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "db-cluster-2:3306"
}

Keep on reading!

Viewing the progress of MySQL 8 Cluster InnoDB recovery

This article will show several handy MySQL commands for viewing the progress of MySQL 8 Cluster recovery and how the administrators may keep track of how much time will need a MySQL InnoDB Cluster node will need to complete the recovering procedure.

main menu
SHOW REPLICA STATUS CHANNEL ‘group_replication_recovery’

If the reader needs to recover from a node failure there is the other article – Recovery of MySQL 8 Cluster instance after server crash and corrupted data in log event. In this article, the MySQL commands are executed on a CentOS Stream 8 with MySQL InnoDB 8 Cluster (here is how it is installed – Install and deploy MySQL 8 InnoDB Cluster with 3 nodes under CentOS 8 and MySQL Router for HA), which one node had been offline for about several weeks. The node was powered off normally, so the MySQL instance on the server was shut down gracefully.
Initially, the cluster state after the power down was two nodes with a missing one.
Use MySQL Shell Console to view the MySQL InnoDB 8 Cluster status:

[root@db-cluster-1 ~]# mysqlsh
MySQL Shell 8.0.28

Copyright (c) 2016, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
 MySQL  JS > \connect clusteradmin@db-cluster-1
Creating a session to 'clusteradmin@db-cluster-1'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 158633505 (X protocol)
Server version: 8.0.28 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
 MySQL  db-cluster-1:33060+ ssl  JS > var cluster = dba.getCluster()
 MySQL  db-cluster-1:33060+ ssl  JS > cluster.status()
{
    "clusterName": "mycluster1", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "db-cluster-2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active.", 
        "topology": {
            "db-cluster-1:3306": {
                "address": "db-cluster-1:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-2:3306": {
                "address": "db-cluster-2:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-3:3306": {
                "address": "db-cluster-3:3306", 
                "memberRole": "SECONDARY", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003: Could not open connection to 'db-cluster-3:3306': Can't connect to MySQL server on 'db-cluster-3:3306' (111)", 
                "status": "(MISSING)"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "db-cluster-2:3306"
}
 MySQL  db-cluster-1:33060+ ssl  JS >

When the third server in the cluster is started (from a clean shutdown), the third node will be in a recovery state. Here is the same status command and the third node in recovery. Still, with the MySQL Shell Console:
Keep on reading!

Recover from Unable to fetch live group_replication member data from any server in cluster

main menu
MySQL Router Unable to fetch live group_replication

After multiple networking connectivity issues between MySQL InnoDB Cluster nodes, the cluster may break and the MySQL Router begins to log the following messages:

2022-10-11 15:20:48 metadata_cache ERROR [7f2d619fe640] Unable to fetch live group_replication member data from any server in cluster 'mycluster1'
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] Member db-cluster-2:3306 (05b6c7c7-f285-11ec-adfc-00163e0b38ff) defined in metadata not found in actual Group Replication
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] db-cluster-1:3306 is not part of quorum for cluster 'mycluster1'
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] Member db-cluster-1:3306 (8bf2c25f-90ae-11ec-93d1-00163e20a401) defined in metadata not found in actual Group Replication
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] db-cluster-3:3306 is not part of quorum for cluster 'mycluster1'
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] Member db-cluster-3:3306 (99856952-90ae-11ec-9a5f-fafd8f1acc17) defined in metadata not found in actual Group Replication
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] db-cluster-2:3306 is not part of quorum for cluster 'mycluster1'

And in MySQL nodes there are also the errors of unable to connect to 33061:

2022-10-11T15:16:25.728393Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-1:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.728714Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-3:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.729195Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-1:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.729569Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-3:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.730154Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-1:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.730474Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-3:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.730485Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error connecting to all peers. Member join failed. Local port: 33061'
2022-10-11T15:16:25.782015Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'

When a MySQL Cluster node node cannot connect to the 33061 (or 3306X), it may be a signal for a firewall issue or the group replication has not started on this node, which is the case here!

The group replication is not working and the cluster must be recovered. All nodes wait to join an exiting cluster, which is not available. It has not stared yet and it would not start alone even the administrator may restart all the nodes.
Keep on reading!

How To Install Linux, Apache, MySQL (MariaDB), PHP-FPM (LAMP) Stack on CentOS Stream 9

main menu
PHP Version 8.0.20

This article describes how to install a Web server with application back-end PHP and database back-end MySQL using MariaDB. In continuing the same topic, but with different software from the previous article – How To Install Linux, Nginx, MySQL (MariaDB), PHP-FPM (LEMP) Stack on CentOS Stream 9, where the Web server is Nginx with application back-end PHP-FPM, which is a sort of CGI (FastCGI). In this article, the Web server is Apache and the application is again PHP-FPM, because since the CentOS 8 the Apache mod_php is deprecated.
All the software installed throughout this article is from the CentOS Stream 9 official repositories including the EPEL repository. The machine is installed with a minimal installation of CentOS Stream 9 and there is a how-to here – Network installation of CentOS Stream 9 (20220606.0) – minimal server installation.
Here are the steps to perform:

  1. Install, configure and start the database MariaDB.
  2. Install, configure and start the PHP-FPM and PHP cli.
  3. Install, configure and start the Web server Apache 2.x.
  4. Configure the system – firewall and SELinux.
  5. Test the installation with a phpMyAdmin installation.
  6. Bonus – Apache HTTPS with SSL certificate – self-signed and letsencrypt.

STEP 1) Install, configure and start the database MariaDB.

First, install the MariaDB server by:

dnf install -y mariadb-server

To configure the MariaDB server, the main file is /etc/my.cnf, which just includes all files under the folder /etc/my.cnf.d/

[root@srv ~]# cat /etc/my.cnf
#
# This group is read both both by the client and the server
# use it for options that affect everything
#
[client-server]

#
# include all files from the config directory
#
!includedir /etc/my.cnf.d

[root@srv ~]# ls -altr /etc/my.cnf.d/
total 32
-rw-r--r--.  1 root root  295 Mar 25  2022 client.cnf
-rw-r--r--.  1 root root  120 May 18 07:55 spider.cnf
-rw-r--r--.  1 root root  232 May 18 07:55 mysql-clients.cnf
-rw-r--r--.  1 root root  763 May 18 07:55 enable_encryption.preset
-rw-r--r--.  1 root root 1458 Jun 13 13:24 mariadb-server.cnf
-rw-r--r--.  1 root root   42 Jun 13 13:29 auth_gssapi.cnf
drwxr-xr-x.  2 root root 4096 Oct  6 06:34 .
drwxr-xr-x. 81 root root 4096 Oct  6 06:34 ..

The most important file for the MariaDB server is /etc/my.cnf.d/mariadb-server.cnf, where all the server options are included. Under section “[mysqld]” add options to tune the MariaDB server. Supported options could be found here: https://mariadb.com/kb/en/mysqld-options/
Add the following options under “[mysqld]” in /etc/my.cnf.d/mariadb-server.cnf
Keep on reading!

How To Install Linux, Nginx, MySQL (MariaDB), PHP-FPM (LEMP) Stack on CentOS Stream 9

main menu
dnf mariadb

This article presents how to install a Web server with application back-end PHP and database back-end MySQL using MariaDB. All the software installed throughout this article is from the CentOS Stream 9 official repositories including the EPEL repository. The machine is installed with a minimal installation of CentOS Stream 9 and there is a how-to here – Network installation of CentOS Stream 9 (20220606.0) – minimal server installation.
Here are the steps to perform:

  1. Install, configure and start the database MariaDB.
  2. Install, configure and start the PHP-FPM and PHP cli.
  3. Install, configure and start the Web server Nginx.
  4. Configure the system – firewall and SELinux.
  5. Test the installation with a phpMyAdmin installation.
  6. Bonus – Nginx HTTPS with SSL certificate – self-signed and letsencrypt.

STEP 1) Install, configure and start the database MariaDB.

First, install the MariaDB server by:

dnf install -y mariadb-server

To configure the MariaDB server, the main file is /etc/my.cnf, which just includes all files under the folder /etc/my.cnf.d/

[root@srv ~]# cat /etc/my.cnf
#
# This group is read both both by the client and the server
# use it for options that affect everything
#
[client-server]

#
# include all files from the config directory
#
!includedir /etc/my.cnf.d

[root@srv ~]# ls -altr /etc/my.cnf.d/
total 32
-rw-r--r--.  1 root root  295 Mar 25  2022 client.cnf
-rw-r--r--.  1 root root  120 May 18 07:55 spider.cnf
-rw-r--r--.  1 root root  232 May 18 07:55 mysql-clients.cnf
-rw-r--r--.  1 root root  763 May 18 07:55 enable_encryption.preset
-rw-r--r--.  1 root root 1458 Jun 13 13:24 mariadb-server.cnf
-rw-r--r--.  1 root root   42 Jun 13 13:29 auth_gssapi.cnf
drwxr-xr-x.  2 root root 4096 Oct  6 06:34 .
drwxr-xr-x. 81 root root 4096 Oct  6 06:34 ..

The most important file for the MariaDB server is /etc/my.cnf.d/mariadb-server.cnf, where all the server options are included. Under section “[mysqld]” add options to tune the MariaDB server. Supported options could be found here: https://mariadb.com/kb/en/mysqld-options/
Add the following options under “[mysqld]” in /etc/my.cnf.d/mariadb-server.cnf
Keep on reading!

Recovery of MySQL 8 Cluster instance after server crash and corrupted data in log event

There is a MySQL 8 Cluster InnoDB of three servers and one of the server crashed with a bad RAM. The same setup is described here – Install and deploy MySQL 8 InnoDB Cluster with 3 nodes under CentOS 8 and MySQL Router for HA. The failed server got restarted without clean shutdown and after booting up the MySQL Cluster node tried to recover automatically, but the recover process failed and the node left the group of the three server:

2022-05-31T04:00:00.322469Z 24 [ERROR] [MY-011620] [Repl] Plugin group_replication reported: 'Fatal error during the incremental recovery process of Group Replication. The server will leave the group.'
2022-05-31T04:00:00.322489Z 24 [Warning] [MY-011645] [Repl] Plugin group_replication reported: 'Skipping leave operation: concurrent attempt to leave the group is on-going.'
2022-05-31T04:00:00.322500Z 24 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2022-05-31T04:00:03.448475Z 0 [System] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'

The recovery process proposed here follows these steps

  1. Connect with mysqlsh (MySQL Shell) to a MySQL instance, which is currently a part of the cluster group. The member, which left the group is not part any more, though the MySQL Cluster status shows it is part of the cluster topology, but with error.
  2. Remove the bad instance from the MySQL Cluster with removeInstance
  3. Add the instance with addInstance and the recovery process will kick in. The type of the recovery process will be chosen by the setup if not specified. In this case, the setup chooses the Incremental state recovery over (full) clone mode.
  4. Initiate the cluster rescan operation to recovery the group replication and the MySQL Cluster.

mysql

Summery of the recovery process

  • The recovery process was successful.
  • The distributed recovery with Incremental state recovery has finished for 24 hours for 200Mbyte database, which is really strange and the speed was really bad. The instance uses ordinary disks, not SSDs and a 1Gbps network.
  • No need to change or manage the MySQL Router in any of the steps or the recovery stages. It handled the situation from the very beginning by removing the bad instance and then adding it again only after the recovery process had finished successfully.
  • MySQL Shell should be connected to an healthy instance currently a part of the Cluster.

In the console output logs all commands and important lines are highlighted.

STEP 1) Remove the bad instance from the cluster.

The status of the cluster with the bad instance.

[root@db-cluster-3 ~]# mysqlsh
MySQL Shell 8.0.28

Copyright (c) 2016, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
 MySQL  JS > \connect clusteradmin@db-cluster-1
Creating a session to 'clusteradmin@db-cluster-1'
Fetching schema names for autocompletion... Press ^C to stop.
Closing old connection...
Your MySQL connection id is 39806649 (X protocol)
Server version: 8.0.28 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
 MySQL  db-cluster-1:33060+ ssl  JS > var cluster = dba.getCluster()
 MySQL  db-cluster-1:33060+ ssl  JS > cluster.status()
{
    "clusterName": "mycluster1", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "db-cluster-1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active.", 
        "topology": {
            "db-cluster-1:3306": {
                "address": "db-cluster-1:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-2:3306": {
                "address": "db-cluster-2:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-3:3306": {
                "address": "db-cluster-3:3306", 
                "instanceErrors": [
                    "ERROR: group_replication has stopped with an error."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "ERROR", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "8.0.28"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "db-cluster-1:3306"
}

Keep on reading!

Install and deploy MySQL 8 InnoDB Cluster with 3 nodes under CentOS 8 and MySQL Router for HA

This article is going to show how to install a MySQL server and deploy a MySQL 8 InnoDB Cluster with three nodes behind a MySQL router to archive a high availability with MySQL database back-end.

In really simple words, MySQL 8.0 InnoDB Cluster is just MySQL replication on steroids – i.e. a little more additional work between the servers in the group before committing the transactions. It uses MySQL Group Replication plugin, which allows the group to operate in two different modes:

  1. a single-primary mode with automatic primary election. Only one server gets the updates.
  2. a multi-master mode – all servers accept the updates. For advanced setups.

Group Replication is bi-directional, the servers communicate with each other and use row replication to replicate the data. The main limitation is that only the MySQL InnoDB engine is supported, because of the transactions support. So the performance (and most features and caveats) of MySQL InnoDB is not impacted by cluster setup and overhead compared to the MySQL in replication mode (or a single server setups) from the previous MySQL versions. Still, all read-write transactions commit only after they have been approved by the group – a verification process providing consensus between the servers. In fact, most of the features like GUIDs, row-based replication (i.e. different replication modes) are developed and available to older versions. The new part is handled by Group Communication System (GCS) protocols, which provide a failure detection mechanism, a group membership service, and a safe and completely ordered message delivery (more on the subject here https://dev.mysql.com/doc/refman/8.0/en/group-replication-background.html).
In addition to the group replication, MySQL Router 8.0 provides the HAhigh availability. The program, which redirects, fails over, balances to the right server in the group is the MySQL Router. Clients may connect directly to the servers in the group, but only if the clients connect using MySQL router will have HA because Group Replication does not have a built-in method for it. It is worth noting, there could be many MySQL Routers in different servers, they do not need to communicate or synchronize anything with each other. So the router could be installed in the same place, where the application is installed or on a separate dedicated server, or on every MySQL server in the group.

Key points in this article of MySQL InnoDB Cluster deployment:

  • CentOS 8 Stream is used for the operating system
  • SELinux tuning to allow MySQL process to connect the network.
  • CentOS 8 firewall tuning to unblock the nodes traffic between them.
  • Disable mysql package system module to use the official MySQL repository.
  • Three MySQL 8.0.28 server nodes will be installed
  • To create and manage the cluster MySQL Shell 8.0 and dba object in it are used.
  • Three MySQL routers on each MySQL node will be installed.
  • Each server will have the domains of the all three servers in /etc/hosts file – db-cluster-1, db-cluster-2, db-cluster-3.
  • The cluster is in group replication with one primary (i.e. master) and two secondary nodes (i.e. slaves)

STEP 1) Install CentOS 8 Stream.

There is an article with the CentOS 8 – How to do a network installation of CentOS 8 (8.0.1950) – minimal server installation, which installation is essentially the same as CentOS 8 Stream.

STEP 2) Prepare the CentOS 8 Stream to install MySQL 8 server.

At present, the latest MySQL Community edition is 8.0.28. The preferred way to install the MySQL server is to download the RPM repository file from MySQL web site – https://dev.mysql.com/downloads/repo/yum/
Keep on reading!

alter table error 1215 (HY000): Cannot add foreign key constraint

Adding a foreign key may result in the following error:

mysql> ALTER TABLE `table1` ADD CONSTRAINT FK_table2_id FOREIGN KEY (`table2_id`) REFERENCES table2(`ID`);
ERROR 1215 (HY000): Cannot add foreign key constraint

This error can occur because in couple of cases and it is not very informative, but basically, it says there is some kind of compatibility issue between the table1.table_id and table2.ID. And the three most common problems, besides the tables and columns exists, are:

  1. One of the two table are not Innodb.
    SHOW CREATE TABLE `table1`;
    
  2. There is a value (or values) in table1.table_id, which does not exists in table2.ID.
  3. SELECT `table1`.`table2_id`, `table2`.`ID` FROM `table1` LEFT JOIN `table2` ON `table1`.`table2_id`=`table2`.`ID`;
    

    Execute a left join to check whether there are NULL values in the table1.table2_id. If NULL values exist a record with the same missing IDs should be inserted in table2.ID. this kind of check may be stopped temporarily with:

    SET FOREIGN_KEY_CHECKS=0;
    

    Change it back to 1 after the ALTER clause.

  4. The type and the attributes of the columns are different. The attributes such as UNSINGED may cause a problem, too!
    ALTER TABLE `table1` CHANGE `table2_id` `table2_id` INT(11) UNSIGNED NOT NULL; 
    

    In this case, the column types are the same, but the attributes are different and the MySQL server throws the error! Change the signed int to unsigned with the above command. And after the table1.table2_id and table2.ID are of the same type and they have the same attributes, the alter command will be executed successfully.

Here are the initial structure of above tables:

CREATE TABLE `table1` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `table2_id` int(11) NOT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=220 DEFAULT CHARSET=utf8
CREATE TABLE `table2` (
  `ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `col1` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=642 DEFAULT CHARSET=utf8 

And after a successful ALTER statement, the table1 will looks like:

ALTER TABLE `table1` ADD CONSTRAINT `FK_table2_id` FOREIGN KEY (`table2_id`) REFERENCES `table1`(`ID`);
Query OK, 216 rows affected (0.92 sec)
Records: 216  Duplicates: 0  Warnings: 0

CREATE TABLE `table1` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `table2_id` int(11) NOT NULL,
  PRIMARY KEY (`ID`),
  KEY `FK_table2_id` (`table1_id`),
  CONSTRAINT `FK_table2_id` FOREIGN KEY (`table2_id`) REFERENCES `table2` (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=220 DEFAULT CHARSET=utf8

MySQL slave upgrade: Slave failed to initialize relay log info structure from the repository

MySQL slave after upgrade from 5.6.x to 5.7.x may throw the following error:

mysql> START SLAVE;
ERROR 1872 (HY000): Slave failed to initialize relay log info structure from the repository

The best solution for this error is to:

  • Master server – mysqldump the database with –master-data=1 –single-transaction
  • On the slave server issue command “RESET SLAVE;”
  • On the slave server import the dump sql file and issue “CHANGE MASTER” command with the meta data written in the sql dump
  • On the slave server issue START SLAVE to start the replication.

Here is an a real world example:
First, mysqldump in the master with

root@master ~ # mysqldump --master-data=1 --single-transaction mydb > /root/mydb.sql
root@master ~ # grep "CHANGE MASTER" media.sql 
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.023283', MASTER_LOG_POS=537774724;

And then copy the dump file to the slave server and import it and issue several specific slave commands:

root@slave ~ # mysql < /root/mydb.sql
root@slave ~ # mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 23
Server version: 5.7.31-log Gentoo Linux mysql-5.7.31

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> reset slave;
Query OK, 0 rows affected (0.01 sec)
mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.023283', MASTER_LOG_POS=537774724;
Query OK, 0 rows affected (0.00 sec)
mysql> START SLAVE;
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: 10.10.10.10
                  Master_User: ruser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.023283
          Read_Master_Log_Pos: 641769286
               Relay_Log_File: slave-relay-bin.000002
                Relay_Log_Pos: 90874706
        Relay_Master_Log_File: mysql-bin.023283
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: mydb.%
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 628649113
              Relay_Log_Space: 103995088
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 2395
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 101
                  Master_UUID: cd1bcebb-cc27-11e8-90c9-801844f2c4d8
             Master_Info_File: /mnt/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Reading event from the relay log
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

The replication is advancing. It is 2395 seconds behind the master.
Keep on reading!