Deleting a GlusterFS volume may fail with an error pointing out that some of the peers are down, i.e. disconnected. Even when all peers hosting bricks of the volume the user is trying to delete are available, the error still appears and the volume cannot be deleted.
That’s because GlusterFS by design spreads the volume configuration across all peers – whether or not they host a brick or arbiter of the volume. If a peer is part of the GlusterFS trusted storage pool, it must be available and connected in the peer status for a volume delete to succeed.
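A quick way to confirm whether this is the problem is to check the peer state before deleting; both commands below are standard gluster CLI calls and can be run on any node of the pool:

# Any peer that is not "Peer in Cluster (Connected)" will make
# "gluster volume delete" fail with "Some of the peers are down".
gluster peer status
# A more compact view of the same information.
gluster pool list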
If the user still wants to delete the volume:
- Force remove the brick hosted on the disconnected peer, if any.
- Detach the disconnected peer from the trusted storage pool.
- Delete the volume (a command sketch follows this list).
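A minimal sketch of the whole sequence, assuming a volume named VOLNAME and a disconnected peer DOWNPEER (placeholder names, not the real ones from the examples below); the exact remove-brick form and the number of remove-brick commands depend on the volume layout, as the examples show:

# Stop the volume - a started volume cannot be deleted.
gluster volume stop VOLNAME
# Only if DOWNPEER hosts a brick of the volume: force-remove that brick,
# lowering the replica count accordingly (here from replica 3 to 2).
gluster volume remove-brick VOLNAME replica 2 DOWNPEER:/path/to/brick force
# Remove the disconnected peer from the trusted storage pool.
gluster peer detach DOWNPEER
# The delete now succeeds, because every remaining peer is connected.
gluster volume delete VOLNAME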
Here are real examples with and without a brick on the unavailable peer.
The initial volume and peer configuration:
[root@srv1 ~]# gluster volume info

Volume Name: VOL1
Type: Replicate
Volume ID: 02ff2995-7307-4f3d-aa24-862edda7ce81
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ng1:/mnt/storage1/glusterfs/brick1
Brick2: ng3:/mnt/storage1/glusterfs/brick1
Brick3: ng1:/mnt/storage1/glusterfs/arbiter1 (arbiter)
Options Reconfigured:
features.scrub: Active
features.bitrot: on
cluster.self-heal-daemon: enable
storage.linux-io_uring: off
client.event-threads: 4
performance.cache-max-file-size: 50MB
performance.parallel-readdir: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.cache-size: 2048MB
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet

Volume Name: VOL2
Type: Replicate
Volume ID: fc2e82e4-2576-4bb1-b9bf-c6b2aff10ef0
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ng1:/mnt/storage1/glusterfs/brick2
Brick2: ng2:/mnt/storage1/glusterfs/brick2
Brick3: ng1:/mnt/storage1/glusterfs/arbiter2 (arbiter)
Options Reconfigured:
features.scrub: Active
features.bitrot: on
cluster.self-heal-daemon: enable
storage.linux-io_uring: off
performance.parallel-readdir: on
network.compression: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
features.cache-invalidation: on

[root@srv ~]# gluster peer status
Number of Peers: 2

Hostname: ng1
Uuid: 7953514b-b52c-4a5c-be03-763c3e24eb4e
State: Peer in Cluster (Connected)

Hostname: ng3
Uuid: 3d273834-eca6-4997-871f-1a282ca90fb0
State: Peer in Cluster (Disconnected)
Delete a GlusterFS volume – all bricks and bricks’ peers are available, but another peer is not.
First, the error when the disconnected peer is still in the peer status list:
[root@srv ~]# gluster volume stop VOL2
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: VOL2: success
[root@srv ~]# gluster volume delete VOL2
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: VOL2: failed: Some of the peers are down
To be deleted, the volume must be stopped first. The delete command then fails because of the disconnected peer “ng3”.
Detach the peer “ng3” and issue the command again:
[root@srv ~]# gluster peer detach ng3
All clients mounted through the peer which is getting detached need to be remounted using one of the other active peers in the trusted storage pool to ensure client gets notification on any changes done on the gluster configuration and if the same has been done do you want to proceed? (y/n) y
peer detach: success
[root@srv ~]# gluster peer status
Number of Peers: 1

Hostname: ng1
Uuid: 7953514b-b52c-4a5c-be03-763c3e24eb4e
State: Peer in Cluster (Connected)
[root@srv ~]# gluster volume delete VOL2
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: VOL2: success
Delete a GlusterFS volume – a brick and the brick’s peer are not available.
[root@srv ~]# gluster volume remove-brick VOL1 replica 2 ng1:/mnt/storage1/glusterfs/arbiter1 force
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume. Do you want to continue? (y/n) y
volume remove-brick commit force: success
[root@srv ~]# gluster volume remove-brick VOL1 replica 1 ng3:/mnt/storage1/glusterfs/brick1 force
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume. Do you want to continue? (y/n) y
volume remove-brick commit force: success
[root@srv ~]# gluster volume info

Volume Name: VOL1
Type: Distribute
Volume ID: 02ff2995-7307-4f3d-aa24-862edda7ce81
Status: Stopped
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: ng1:/mnt/storage1/glusterfs/brick1
Options Reconfigured:
features.scrub: Active
features.bitrot: on
cluster.self-heal-daemon: enable
storage.linux-io_uring: off
client.event-threads: 4
performance.cache-max-file-size: 50MB
performance.parallel-readdir: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.cache-size: 2048MB
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet

[root@srv ~]# gluster peer detach ng3
All clients mounted through the peer which is getting detached need to be remounted using one of the other active peers in the trusted storage pool to ensure client gets notification on any changes done on the gluster configuration and if the same has been done do you want to proceed? (y/n) y
peer detach: success
[root@srv ~]# gluster volume delete VOL1
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: VOL1: success
There is a catch in this setup – the volume has 2 bricks and 1 arbiter. One brick is missing, so to delete the volume VOL1, the user should remove the arbiter first, then the unavailable brick, and only then delete the volume.
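In short, the order that works for this 2 bricks plus 1 arbiter volume is the one from the output above (same volume name, hostnames and brick paths):

# 1. Remove the arbiter, converting the volume to a plain replica 2.
gluster volume remove-brick VOL1 replica 2 ng1:/mnt/storage1/glusterfs/arbiter1 force
# 2. Remove the brick on the disconnected peer, going from replica 2 to a single brick.
gluster volume remove-brick VOL1 replica 1 ng3:/mnt/storage1/glusterfs/brick1 force
# 3. Detach the disconnected peer and delete the volume.
gluster peer detach ng3
gluster volume delete VOL1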
Here are some wrong command executions and the errors they produce:
[root@srv ~]# gluster volume delete VOL1
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: VOL1: failed: Some of the peers are down
[root@srv ~]# gluster volume remove-brick VOL1 replica 1 ng3:/mnt/storage1/glusterfs/brick1 force
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume. Do you want to continue? (y/n) y
volume remove-brick commit force: failed: need 2(xN) bricks for reducing replica count of the volume from 3 to 1
[root@srv ~]# gluster volume remove-brick VOL1 ng3:/mnt/storage1/glusterfs/brick1 force
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume. Do you want to continue? (y/n) y
volume remove-brick commit force: failed: Removing bricks from replicate configuration is not allowed without reducing replica count explicitly.
[root@srv ~]# gluster volume remove-brick VOL1 replica 2 ng3:/mnt/storage1/glusterfs/brick1 force
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume. Do you want to continue? (y/n) y
volume remove-brick commit force: failed: Remove arbiter brick(s) only when converting from arbiter to replica 2 subvolume.
[root@srv ~]# gluster volume remove-brick VOL1 replica 2 lsrv3.stoev.eu:/mnt/storage1/glusterfs/brick1 start
Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to avoid this. See: http://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/.
Do you still want to continue? (y/n) y
It is recommended that remove-brick be run with cluster.force-migration option disabled to prevent possible data corruption. Doing so will ensure that files that receive writes during migration will not be migrated and will need to be manually copied after the remove-brick commit operation. Please check the value of the option and update accordingly.
Do you want to continue with your current cluster.force-migration settings? (y/n) y
volume remove-brick start: failed: Remove arbiter brick(s) only when converting from arbiter to replica 2 subvolume.
[root@srv ~]# gluster volume remove-brick VOL1 replica 1 ng3:/mnt/storage1/glusterfs/brick1 start
It is recommended that remove-brick be run with cluster.force-migration option disabled to prevent possible data corruption. Doing so will ensure that files that receive writes during migration will not be migrated and will need to be manually copied after the remove-brick commit operation. Please check the value of the option and update accordingly.
Do you want to continue with your current cluster.force-migration settings? (y/n) y
volume remove-brick start: failed: need 2(xN) bricks for reducing replica count of the volume from 3 to 1
[root@srv ~]# gluster volume remove-brick VOL1 replica 3 ng3:/mnt/storage1/glusterfs/brick1 force
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume. Do you want to continue? (y/n) y
volume remove-brick commit force: failed: number of bricks provided (1) is not valid. need at least 3 (or 3xN)
[root@srv ~]# gluster volume remove-brick VOL1 replica 2 ng3:/mnt/storage1/glusterfs/brick1 ng1:/mnt/storage1/glusterfs/arbiter1 force
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume. Do you want to continue? (y/n) y
volume remove-brick commit force: failed: Remove arbiter brick(s) only when converting from arbiter to replica 2 subvolume.
[root@srv ~]# gluster volume remove-brick VOL1 replica 1 ng1:/mnt/storage1/glusterfs/arbiter1 force
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume. Do you want to continue? (y/n) y
volume remove-brick commit force: failed: need 2(xN) bricks for reducing replica count of the volume from 3 to 1
It is a little more complicated when the replica 3 volume is built from 2 bricks and 1 arbiter. It would be simpler if the volume consisted of 3 full bricks; then a single remove-brick command would be enough to downgrade the volume from replica 3 to replica 2. When one brick must be removed from a 2 bricks plus 1 arbiter setup, two commands are needed – first remove the arbiter, then the disconnected brick – which is exactly what the successful sequence above shows.
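For comparison, a sketch of the simpler case with 3 full bricks, assuming a hypothetical volume VOL3 with one brick per peer and ng3 being the disconnected one (the volume name and brick paths are illustrative only):

gluster volume stop VOL3
# One remove-brick is enough here: drop the brick on the down peer and
# reduce the replica count from 3 to 2 in a single step.
gluster volume remove-brick VOL3 replica 2 ng3:/mnt/storage1/glusterfs/brick3 force
# Then detach the down peer and delete the volume as before.
gluster peer detach ng3
gluster volume delete VOL3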
Check out how to create such GlusterFS volumes – glusterfs with localhost (127.0.0.1) nodes on different servers – glusterfs volume with 3 replicas