Delete a GlusterFS volume when a peer is down – failed: Some of the peers are down

Deleting a GlusterFS volume may fail with an error pointing out that some of the peers are down, i.e. disconnected. Even when all the peers hosting bricks of the volume the user is trying to delete are available, the error still appears and the volume cannot be deleted.
That is because GlusterFS, by design, stores the volume configuration on every peer, whether or not that peer hosts a brick or arbiter of the volume. If a peer is part of a GlusterFS cluster, it must be online (Connected in the peer status) for a volume delete to succeed.
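The replicated configuration can be inspected on any node: glusterd keeps one directory per volume, even on peers that host none of that volume's bricks (the path below assumes a default installation):

```shell
# On any peer in the cluster: every volume has a configuration directory
# here, regardless of whether this node hosts one of its bricks.
ls /var/lib/glusterd/vols/
# Expected on the setup from this article: VOL1  VOL2
```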
If the user still wants to delete the volume, the steps are:

  1. Force remove the brick hosted on the detached peer, if there is one.
  2. Detach the disconnected peer from the cluster.
  3. Delete the volume.
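The three steps above map onto the following command sketch. The host `ng3` and the brick path are taken from the examples below; the `replica 2` count after removal is an assumption for a 1 x (2 + 1) volume losing one data brick, so adjust it to the actual layout:

```shell
# 1. Force-remove the brick hosted on the dead peer, if any.
#    For replicated volumes the new replica count must be given.
gluster volume remove-brick VOL1 replica 2 ng3:/mnt/storage1/glusterfs/brick1 force

# 2. Detach the disconnected peer from the cluster
#    (append 'force' if glusterd refuses the detach).
gluster peer detach ng3

# 3. Stop and delete the volume.
gluster volume stop VOL1
gluster volume delete VOL1
```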

Here are real examples with and without a brick on the unavailable peer.
The initial volumes and peers configuration:

[root@srv1 ~]# gluster volume info
 
Volume Name: VOL1
Type: Replicate
Volume ID: 02ff2995-7307-4f3d-aa24-862edda7ce81
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ng1:/mnt/storage1/glusterfs/brick1
Brick2: ng3:/mnt/storage1/glusterfs/brick1
Brick3: ng1:/mnt/storage1/glusterfs/arbiter1 (arbiter)
Options Reconfigured:
features.scrub: Active
features.bitrot: on
cluster.self-heal-daemon: enable
storage.linux-io_uring: off
client.event-threads: 4
performance.cache-max-file-size: 50MB
performance.parallel-readdir: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.cache-size: 2048MB
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
 
Volume Name: VOL2
Type: Replicate
Volume ID: fc2e82e4-2576-4bb1-b9bf-c6b2aff10ef0
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ng1:/mnt/storage1/glusterfs/brick2
Brick2: ng2:/mnt/storage1/glusterfs/brick2
Brick3: ng1:/mnt/storage1/glusterfs/arbiter2 (arbiter)
Options Reconfigured:
features.scrub: Active
features.bitrot: on
cluster.self-heal-daemon: enable
storage.linux-io_uring: off
performance.parallel-readdir: on
network.compression: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
features.cache-invalidation: on

[root@srv ~]# gluster peer status
Number of Peers: 2

Hostname: ng1
Uuid: 7953514b-b52c-4a5c-be03-763c3e24eb4e
State: Peer in Cluster (Connected)

Hostname: ng3
Uuid: 3d273834-eca6-4997-871f-1a282ca90fb0
State: Peer in Cluster (Disconnected)

Delete a GlusterFS volume – all peers hosting the volume's bricks are available, but another peer in the cluster is not.

First, the error that appears while the disconnected peer is still in the peer status list.

[root@srv ~]# gluster volume stop VOL2
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: VOL2: success
[root@srv ~]# gluster volume delete VOL2
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: VOL2: failed: Some of the peers are down
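Since the disconnected peer ng3 hosts no brick of VOL2, a sketch of the remaining steps would be just the detach and the delete (no brick removal is needed in this case):

```shell
# ng3 holds no VOL2 brick, so step 1 (remove-brick) is skipped.
gluster peer detach ng3          # append 'force' if glusterd refuses
gluster volume delete VOL2
```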

Keep on reading!