Delete a GlusterFS volume when a peer is down – failed: Some of the peers are down

Deleting a GlusterFS volume may fail with an error pointing out that some of the peers are down, i.e. they are disconnected. Even when all the peers hosting bricks of the volume the user is trying to delete are available, the error still appears and it is not possible to delete the volume.
That's because GlusterFS by design spreads the volume configuration to all peers, no matter whether they host a brick/arbiter of the volume or not. If a peer is part of a GlusterFS setup, it must be available and connected in the peer status for a volume delete to succeed.
If the user still wants to delete the volume, the steps are (a command sketch follows the list):

  1. Force remove the brick(s) hosted on the disconnected peer, if any.
  2. Detach the disconnected peer from the cluster.
  3. Delete the volume.
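
A minimal command sketch of the three steps above, assuming the disconnected peer is named ng3; VOLNAME and the brick path are placeholders and must be replaced with the real ones, and for a replicated volume the replica count given to remove-brick depends on the actual volume layout:

# 1) force remove the brick hosted on the disconnected peer (only if it hosts one);
#    for a replicated volume lower the replica count accordingly
gluster volume remove-brick VOLNAME replica 2 ng3:/path/to/brick force
# 2) detach the disconnected peer from the cluster
gluster peer detach ng3 force
# 3) stop and delete the volume
gluster volume stop VOLNAME
gluster volume delete VOLNAME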

Here are real examples with and without a brick on the unavailable peer.
The initial configuration of the volumes and the peers:

[root@srv1 ~]# gluster volume info
 
Volume Name: VOL1
Type: Replicate
Volume ID: 02ff2995-7307-4f3d-aa24-862edda7ce81
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ng1:/mnt/storage1/glusterfs/brick1
Brick2: ng3:/mnt/storage1/glusterfs/brick1
Brick3: ng1:/mnt/storage1/glusterfs/arbiter1 (arbiter)
Options Reconfigured:
features.scrub: Active
features.bitrot: on
cluster.self-heal-daemon: enable
storage.linux-io_uring: off
client.event-threads: 4
performance.cache-max-file-size: 50MB
performance.parallel-readdir: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.cache-size: 2048MB
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
 
Volume Name: VOL2
Type: Replicate
Volume ID: fc2e82e4-2576-4bb1-b9bf-c6b2aff10ef0
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ng1:/mnt/storage1/glusterfs/brick2
Brick2: ng2:/mnt/storage1/glusterfs/brick2
Brick3: ng1:/mnt/storage1/glusterfs/arbiter2 (arbiter)
Options Reconfigured:
features.scrub: Active
features.bitrot: on
cluster.self-heal-daemon: enable
storage.linux-io_uring: off
performance.parallel-readdir: on
network.compression: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
features.cache-invalidation: on

[root@srv ~]# gluster peer status
Number of Peers: 2

Hostname: ng1
Uuid: 7953514b-b52c-4a5c-be03-763c3e24eb4e
State: Peer in Cluster (Connected)

Hostname: ng3
Uuid: 3d273834-eca6-4997-871f-1a282ca90fb0
State: Peer in Cluster (Disconnected)

Delete a GlusterFS volume – all of its bricks and their peers are available, but another peer in the cluster is not.

First, the error when the disconnected peer is still in the peer status list.

[root@srv ~]# gluster volume stop VOL2
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: VOL2: success
[root@srv ~]# gluster volume delete VOL2
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: VOL2: failed: Some of the peers are down
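
In this particular case none of the VOL2 bricks resides on the disconnected peer ng3, so a probable resolution (only a sketch, assuming ng3 really has to leave the cluster; it would have to be probed again later to rejoin) is to detach ng3 and retry the delete:

# remove the disconnected peer from the trusted pool
gluster peer detach ng3 force
# retry the delete; with all remaining peers connected it should now succeed
gluster volume delete VOL2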


Create and export a GlusterFS volume with NFS-Ganesha in CentOS 8

The GlusterFS built-in NFS server supports only NFS version 3. GlusterFS offers NFS exports through NFS-Ganesha, which supports both the NFS version 3 and version 4 protocols.
The NFS-Ganesha server is a user-mode file sharing server, which offers a GlusterFS plugin to export GlusterFS volumes. In the following article, NFS-Ganesha and GlusterFS are installed, a simple GlusterFS volume is created, and it is then exported over the NFS version 3 and 4 protocols.
The versions of the software used in this article:

  • CentOS Stream release 8 (25.04.2021)
  • GlusterFS 8.4
  • NFS-Ganesha 3.5

STEP 1) Install GlusterFS.

dnf install -y centos-release-gluster
dnf install -y glusterfs-server

The first line installs a new repository under the management of the CentOS Storage SIG – https://wiki.centos.org/SpecialInterestGroup/Storage. The second line installs the GlusterFS server.
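
A quick, optional check that the expected GlusterFS version (8.4 in this article) was pulled in from the SIG repository:

gluster --version
rpm -q glusterfs-server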

STEP 2) Install NFS-Ganesha.

dnf install -y centos-release-nfs-ganesha30
dnf install -y nfs-ganesha nfs-ganesha-gluster

The first line again installs a new repository under the SIG management, and the second line installs the NFS-Ganesha server along with the Gluster plugin.
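
Again, an optional check that both the NFS-Ganesha server and its Gluster plugin package are in place:

rpm -q nfs-ganesha nfs-ganesha-gluster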

STEP 3) Create a GlusterFS volume.

Start the GlusterFS server and create a simple replica 3 volume.
First, start glusterd on all three nodes and allow the GlusterFS communication between the three nodes using the firewall-cmd utility by executing the following commands:

systemctl start glusterd
firewall-cmd --permanent --new-zone=glusternodes
firewall-cmd --permanent --zone=glusternodes --add-source=192.168.0.200
firewall-cmd --permanent --zone=glusternodes --add-source=192.168.0.201
firewall-cmd --permanent --zone=glusternodes --add-source=192.168.0.202
firewall-cmd --permanent --zone=glusternodes --add-service=glusterfs
firewall-cmd --reload
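
To verify that the new zone has picked up the node addresses and the glusterfs service after the reload:

firewall-cmd --zone=glusternodes --list-all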

On the first node, create the GlusterFS volume. First, add glnode2 and glnode3 to the cluster:

gluster peer probe glnode2
gluster peer probe glnode3
gluster volume create VOL1 replica 3 transport tcp glnode1:/mnt/storage/gluster/brick glnode2:/mnt/storage/gluster/brick glnode3:/mnt/storage/gluster/brick
gluster volume start VOL1
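
For orientation, here is a minimal sketch of what an NFS-Ganesha export of VOL1 might look like, assuming the default configuration file /etc/ganesha/ganesha.conf and the node glnode1 from above; Export_Id, the Path/Pseudo paths, squashing and the security options have to be adjusted to the real setup:

# append a GLUSTER FSAL export for VOL1 to the NFS-Ganesha configuration
cat >> /etc/ganesha/ganesha.conf <<'EOF'
EXPORT {
    Export_Id = 1;
    Path = "/VOL1";
    Pseudo = "/VOL1";
    Access_Type = RW;
    Squash = No_root_squash;
    Protocols = "3","4";
    SecType = "sys";
    FSAL {
        Name = GLUSTER;
        Hostname = "glnode1";
        Volume = "VOL1";
    }
}
EOF
# start the NFS-Ganesha server
systemctl enable --now nfs-ganesha

A client could then mount the export over NFS version 3 or 4; /mnt/nfs is just an example mount point:

mount -t nfs -o vers=4 glnode1:/VOL1 /mnt/nfs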


Stopping the GlusterFS volume releases processes hung in disk sleep

A quick tip for GlusterFS volumes. There are multiple possible reasons for a Linux process to hang in the "Disk Sleep" (uninterruptible) state, which even kill -9 cannot interrupt:

  • a bug in GlusterFS
  • bad options turned on while the volume is online
  • another device relying on a GlusterFS volume, which has become unavailable.

[17294588.184470] INFO: task gdisk:12505 blocked for more than 120 seconds.
[17294588.184538] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[17294588.184628] gdisk           D ffff8ce01fb9acc0     0 12505  26866 0x00000080
[17294588.184780] Call Trace:
[17294588.184844]  [<ffffffffbaed3d81>] ? __wake_up_common_lock+0x91/0xc0
[17294588.184910]  [<ffffffffbb585da9>] schedule+0x29/0x70
[17294588.184974]  [<ffffffffbb5838b1>] schedule_timeout+0x221/0x2d0
[17294588.185037]  [<ffffffffbaed3dc3>] ? __wake_up+0x13/0x20
[17294588.185102]  [<ffffffffc0a05d2e>] ? loop_make_request+0x12e/0x210 [loop]
[17294588.185169]  [<ffffffffbaf06d32>] ? ktime_get_ts64+0x52/0xf0
[17294588.185232]  [<ffffffffbb58549d>] io_schedule_timeout+0xad/0x130
[17294588.185304]  [<ffffffffbb5863dd>] wait_for_completion_io+0xfd/0x140
[17294588.185369]  [<ffffffffbaedb990>] ? wake_up_state+0x20/0x20
[17294588.185468]  [<ffffffffbb157e64>] blkdev_issue_flush+0xb4/0x110
[17294588.185533]  [<ffffffffbb08d335>] blkdev_fsync+0x35/0x50
[17294588.185598]  [<ffffffffbb082f57>] do_fsync+0x67/0xb0
[17294588.185671]  [<ffffffffbb083240>] SyS_fsync+0x10/0x20
[17294588.185734]  [<ffffffffbb592ed2>] system_call_fastpath+0x25/0x2a
[17294708.187598] INFO: task gdisk:12505 blocked for more than 120 seconds.
[17294708.187664] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[17294708.187753] gdisk           D ffff8ce01fb9acc0     0 12505  26866 0x00000080
[17294708.187905] Call Trace:
[17294708.187968]  [<ffffffffbaed3d81>] ? __wake_up_common_lock+0x91/0xc0
[17294708.188033]  [<ffffffffbb585da9>] schedule+0x29/0x70
[17294708.188096]  [<ffffffffbb5838b1>] schedule_timeout+0x221/0x2d0
[17294708.188159]  [<ffffffffbaed3dc3>] ? __wake_up+0x13/0x20
[17294708.188223]  [<ffffffffc0a05d2e>] ? loop_make_request+0x12e/0x210 [loop]
[17294708.188289]  [<ffffffffbaf06d32>] ? ktime_get_ts64+0x52/0xf0
[17294708.188352]  [<ffffffffbb58549d>] io_schedule_timeout+0xad/0x130
[17294708.188416]  [<ffffffffbb5863dd>] wait_for_completion_io+0xfd/0x140
[17294708.188480]  [<ffffffffbaedb990>] ? wake_up_state+0x20/0x20
[17294708.188545]  [<ffffffffbb157e64>] blkdev_issue_flush+0xb4/0x110
[17294708.188624]  [<ffffffffbb08d335>] blkdev_fsync+0x35/0x50
[17294708.188690]  [<ffffffffbb082f57>] do_fsync+0x67/0xb0
[17294708.188754]  [<ffffffffbb083240>] SyS_fsync+0x10/0x20
[17294708.188828]  [<ffffffffbb592ed2>] system_call_fastpath+0x25/0x2a

The above example of a dmesg log shows the gdisk process stuck in the "Disk Sleep" state because of a loop device backed by a file on an unavailable GlusterFS volume! kill -9 won't help; the process will remain in this bad state, and even a reboot would be difficult to perform!
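
Before stopping the volume, it may help to confirm which processes are stuck in the uninterruptible (D) state and which loop devices are backed by files on the GlusterFS mount; a quick sketch with standard tools:

# list processes currently in uninterruptible (D) sleep and the kernel function they wait in
ps -eo pid,stat,wchan:32,cmd | awk 'NR==1 || $2 ~ /D/'
# show all configured loop devices and their backing files
losetup -a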

[root@srv1 ~]# gluster volume stop VOL2 
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: VOL2: success
[root@srv1 ~]# gluster volume start VOL2 
volume start: VOL2: success

The solution is to stop the GlusterFS volume, and all the processes blocked on bad devices such as the one above will be released. After the stop command is issued to the volume, the processes either carry on executing or end their execution. There is no problem starting the GlusterFS volume again immediately after the stop!
NOTE: executing the STOP command affects all servers using this volume. The volume becomes inaccessible for all of them!