Recover from Unable to fetch live group_replication member data from any server in cluster

main menu
MySQL Router Unable to fetch live group_replication

After multiple networking connectivity issues between MySQL InnoDB Cluster nodes, the cluster may break and the MySQL Router begins to log the following messages:

2022-10-11 15:20:48 metadata_cache ERROR [7f2d619fe640] Unable to fetch live group_replication member data from any server in cluster 'mycluster1'
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] Member db-cluster-2:3306 (05b6c7c7-f285-11ec-adfc-00163e0b38ff) defined in metadata not found in actual Group Replication
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] db-cluster-1:3306 is not part of quorum for cluster 'mycluster1'
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] Member db-cluster-1:3306 (8bf2c25f-90ae-11ec-93d1-00163e20a401) defined in metadata not found in actual Group Replication
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] db-cluster-3:3306 is not part of quorum for cluster 'mycluster1'
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] Member db-cluster-3:3306 (99856952-90ae-11ec-9a5f-fafd8f1acc17) defined in metadata not found in actual Group Replication
2022-10-11 15:20:49 metadata_cache WARNING [7f2d619fe640] db-cluster-2:3306 is not part of quorum for cluster 'mycluster1'

And in MySQL nodes there are also the errors of unable to connect to 33061:

2022-10-11T15:16:25.728393Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-1:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.728714Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-3:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.729195Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-1:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.729569Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-3:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.730154Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-1:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.730474Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db-cluster-3:33061 when joining a group. My local port is: 33061.'
2022-10-11T15:16:25.730485Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error connecting to all peers. Member join failed. Local port: 33061'
2022-10-11T15:16:25.782015Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'

When a MySQL Cluster node node cannot connect to the 33061 (or 3306X), it may be a signal for a firewall issue or the group replication has not started on this node, which is the case here!

The group replication is not working and the cluster must be recovered. All nodes wait to join an exiting cluster, which is not available. It has not stared yet and it would not start alone even the administrator may restart all the nodes.
Keep on reading!

Recovery of MySQL 8 Cluster instance after server crash and corrupted data in log event

There is a MySQL 8 Cluster InnoDB of three servers and one of the server crashed with a bad RAM. The same setup is described here – Install and deploy MySQL 8 InnoDB Cluster with 3 nodes under CentOS 8 and MySQL Router for HA. The failed server got restarted without clean shutdown and after booting up the MySQL Cluster node tried to recover automatically, but the recover process failed and the node left the group of the three server:

2022-05-31T04:00:00.322469Z 24 [ERROR] [MY-011620] [Repl] Plugin group_replication reported: 'Fatal error during the incremental recovery process of Group Replication. The server will leave the group.'
2022-05-31T04:00:00.322489Z 24 [Warning] [MY-011645] [Repl] Plugin group_replication reported: 'Skipping leave operation: concurrent attempt to leave the group is on-going.'
2022-05-31T04:00:00.322500Z 24 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2022-05-31T04:00:03.448475Z 0 [System] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'

The recovery process proposed here follows these steps

  1. Connect with mysqlsh (MySQL Shell) to a MySQL instance, which is currently a part of the cluster group. The member, which left the group is not part any more, though the MySQL Cluster status shows it is part of the cluster topology, but with error.
  2. Remove the bad instance from the MySQL Cluster with removeInstance
  3. Add the instance with addInstance and the recovery process will kick in. The type of the recovery process will be chosen by the setup if not specified. In this case, the setup chooses the Incremental state recovery over (full) clone mode.
  4. Initiate the cluster rescan operation to recovery the group replication and the MySQL Cluster.

mysql

Summery of the recovery process

  • The recovery process was successful.
  • The distributed recovery with Incremental state recovery has finished for 24 hours for 200Mbyte database, which is really strange and the speed was really bad. The instance uses ordinary disks, not SSDs and a 1Gbps network.
  • No need to change or manage the MySQL Router in any of the steps or the recovery stages. It handled the situation from the very beginning by removing the bad instance and then adding it again only after the recovery process had finished successfully.
  • MySQL Shell should be connected to an healthy instance currently a part of the Cluster.

In the console output logs all commands and important lines are highlighted.

STEP 1) Remove the bad instance from the cluster.

The status of the cluster with the bad instance.

[root@db-cluster-3 ~]# mysqlsh
MySQL Shell 8.0.28

Copyright (c) 2016, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
 MySQL  JS > \connect clusteradmin@db-cluster-1
Creating a session to 'clusteradmin@db-cluster-1'
Fetching schema names for autocompletion... Press ^C to stop.
Closing old connection...
Your MySQL connection id is 39806649 (X protocol)
Server version: 8.0.28 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
 MySQL  db-cluster-1:33060+ ssl  JS > var cluster = dba.getCluster()
 MySQL  db-cluster-1:33060+ ssl  JS > cluster.status()
{
    "clusterName": "mycluster1", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "db-cluster-1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active.", 
        "topology": {
            "db-cluster-1:3306": {
                "address": "db-cluster-1:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-2:3306": {
                "address": "db-cluster-2:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": null, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.28"
            }, 
            "db-cluster-3:3306": {
                "address": "db-cluster-3:3306", 
                "instanceErrors": [
                    "ERROR: group_replication has stopped with an error."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "ERROR", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "8.0.28"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "db-cluster-1:3306"
}

Keep on reading!

Install and deploy MySQL 8 InnoDB Cluster with 3 nodes under CentOS 8 and MySQL Router for HA

This article is going to show how to install a MySQL server and deploy a MySQL 8 InnoDB Cluster with three nodes behind a MySQL router to archive a high availability with MySQL database back-end.

In really simple words, MySQL 8.0 InnoDB Cluster is just MySQL replication on steroids – i.e. a little more additional work between the servers in the group before committing the transactions. It uses MySQL Group Replication plugin, which allows the group to operate in two different modes:

  1. a single-primary mode with automatic primary election. Only one server gets the updates.
  2. a multi-master mode – all servers accept the updates. For advanced setups.

Group Replication is bi-directional, the servers communicate with each other and use row replication to replicate the data. The main limitation is that only the MySQL InnoDB engine is supported, because of the transactions support. So the performance (and most features and caveats) of MySQL InnoDB is not impacted by cluster setup and overhead compared to the MySQL in replication mode (or a single server setups) from the previous MySQL versions. Still, all read-write transactions commit only after they have been approved by the group – a verification process providing consensus between the servers. In fact, most of the features like GUIDs, row-based replication (i.e. different replication modes) are developed and available to older versions. The new part is handled by Group Communication System (GCS) protocols, which provide a failure detection mechanism, a group membership service, and a safe and completely ordered message delivery (more on the subject here https://dev.mysql.com/doc/refman/8.0/en/group-replication-background.html).
In addition to the group replication, MySQL Router 8.0 provides the HAhigh availability. The program, which redirects, fails over, balances to the right server in the group is the MySQL Router. Clients may connect directly to the servers in the group, but only if the clients connect using MySQL router will have HA because Group Replication does not have a built-in method for it. It is worth noting, there could be many MySQL Routers in different servers, they do not need to communicate or synchronize anything with each other. So the router could be installed in the same place, where the application is installed or on a separate dedicated server, or on every MySQL server in the group.

Key points in this article of MySQL InnoDB Cluster deployment:

  • CentOS 8 Stream is used for the operating system
  • SELinux tuning to allow MySQL process to connect the network.
  • CentOS 8 firewall tuning to unblock the nodes traffic between them.
  • Disable mysql package system module to use the official MySQL repository.
  • Three MySQL 8.0.28 server nodes will be installed
  • To create and manage the cluster MySQL Shell 8.0 and dba object in it are used.
  • Three MySQL routers on each MySQL node will be installed.
  • Each server will have the domains of the all three servers in /etc/hosts file – db-cluster-1, db-cluster-2, db-cluster-3.
  • The cluster is in group replication with one primary (i.e. master) and two secondary nodes (i.e. slaves)

STEP 1) Install CentOS 8 Stream.

There is an article with the CentOS 8 – How to do a network installation of CentOS 8 (8.0.1950) – minimal server installation, which installation is essentially the same as CentOS 8 Stream.

STEP 2) Prepare the CentOS 8 Stream to install MySQL 8 server.

At present, the latest MySQL Community edition is 8.0.28. The preferred way to install the MySQL server is to download the RPM repository file from MySQL web site – https://dev.mysql.com/downloads/repo/yum/
Keep on reading!