Migrate from NFS Kernel Server to NFS-Ganesha server under CentOS Stream 9

This article shows how to migrate from the NFS kernel server to the NFS-Ganesha server under CentOS Stream 9. The most important questions when migrating from one program to another are how much downtime to expect and what the clients have to do. In this case, what do the clients need to do once NFS-Ganesha is serving the exports?

Here are the main points when migrating from NFS Kernel Server to the NFS-Ganesha:

  • The nfs-utils and nfs-ganesha packages can be installed on the same system without any problem. There are no conflicts when the NFS kernel server and the NFS-Ganesha server are installed side by side.
  • The clients do not need to do anything except remount the NFS mounts.
  • A new community repository must be added by installing the centos-release-nfs-ganesha5 package. The repository is maintained by a Special Interest Group (SIG) within the CentOS community.
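
The points above translate into a short command sequence on the server. The following is a minimal sketch, not necessarily the exact procedure of this article: it assumes the exports are served through the VFS FSAL (hence the nfs-ganesha-vfs package), that the Ganesha configuration already contains the exports before nfs-ganesha is started, and that both servers use the default NFS port 2049, so the kernel server is stopped first. The run helper, the migrate_to_ganesha function and the DRY_RUN variable are hypothetical names used for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Sketch of the switch-over; assumes the Ganesha exports are already configured.
migrate_to_ganesha() {
    # Enable the SIG repository first, then install the server and the VFS FSAL.
    run dnf install -y centos-release-nfs-ganesha5 || return 1
    run dnf install -y nfs-ganesha nfs-ganesha-vfs || return 1
    # Both servers use port 2049 by default, so stop one before starting the other.
    run systemctl disable --now nfs-server.service || return 1
    run systemctl enable --now nfs-ganesha.service
}
```

Running DRY_RUN=1 migrate_to_ganesha only prints the four commands, which makes it possible to review them before executing anything as root. The downtime is roughly the time between the last two commands plus the remount on the clients.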

For the installation of NFS-Ganesha and more detailed information, check out the older articles on the subject: Simple export of a ext4 directory with NFS Ganesha 3.5 server in CentOS 8 with SELinux enforcing; Simple export of a ext4 directory with NFS Ganesha 3.5 server in CentOS 8 without SELinux; and Create and export a GlusterFS volume with NFS-Ganesha in CentOS 8.

Prerequisite – NFS Kernel Configuration

The NFS kernel server is installed with the nfs-utils package (and its dependencies) and has the following simple configuration:

[root@srv ~]# cat /etc/exports
/mnt/storage           192.168.0.0/24(rw,sync,no_root_squash,no_subtree_check)
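
The same export has to be described in NFS-Ganesha's own configuration format. As a hedged sketch (the Ganesha configuration actually used is not shown in this part of the article), the helper below prints an EXPORT block roughly equivalent to the line above, assuming the VFS FSAL. The function name ganesha_export and the Export_Id value are illustrative choices. Pseudo is kept identical to Path with the intention that NFSv4 clients can keep the same mount path; the sync and no_subtree_check options are not mapped in this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a Ganesha EXPORT block for one exported path.
# usage: ganesha_export <export_id> <path> <client_cidr>
# Mapping assumed here: rw -> Access_Type = RW,
#                       no_root_squash -> Squash = No_Root_Squash.
ganesha_export() {
    local id="$1" path="$2" clients="$3"
    cat <<EOF
EXPORT {
    Export_Id = ${id};
    Path = ${path};
    Pseudo = ${path};
    SecType = sys;
    FSAL {
        Name = VFS;
    }
    CLIENT {
        Clients = ${clients};
        Access_Type = RW;
        Squash = No_Root_Squash;
    }
}
EOF
}
```

For the export above the call would be ganesha_export 1 /mnt/storage 192.168.0.0/24, with the output appended to the Ganesha configuration file (commonly /etc/ganesha/ganesha.conf, which is an assumption here).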

And here are the NFS services on the system:

[root@srv ~]# systemctl |grep nfs
  proc-fs-nfsd.mount                                         loaded active mounted   NFSD configuration filesystem
  var-lib-nfs-rpc_pipefs.mount                               loaded active mounted   RPC Pipe File System
  nfs-idmapd.service                                         loaded active running   NFSv4 ID-name mapping service
  nfs-mountd.service                                         loaded active running   NFS Mount Daemon
  nfs-server.service                                         loaded active exited    NFS server and services
  nfsdcld.service                                            loaded active running   NFSv4 Client Tracking Daemon
  nfs-client.target                                          loaded active active    NFS client services

The server's firewall has already been configured for the NFS kernel server, so there is no need to change anything in the firewall for the NFS-Ganesha server.

Close socket as if the remote closed the connection

If you have a hung process and it is in this state because of the network, for example your client or server program is stuck waiting in a read, you can use lsof and gdb to close the network socket, simulating that the other (remote) end closed it, and the process will continue operating normally.

In our case, a couple of nrpe processes hung in a read from a network socket:

[root@srv ~]# lsof -n -p 9948
COMMAND  PID USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
nrpe    9948 nrpe  cwd    DIR                9,2     4096       2 /
nrpe    9948 nrpe  rtd    DIR                9,2     4096       2 /
nrpe    9948 nrpe  txt    REG                9,2    69960 1053396 /usr/sbin/nrpe
nrpe    9948 nrpe  mem    REG                9,2    62184 1053312 /usr/lib64/libnss_files-2.17.so
nrpe    9948 nrpe  mem    REG                9,2   402384 1051943 /usr/lib64/libpcre.so.1.2.0
nrpe    9948 nrpe  mem    REG                9,2   155784 1057231 /usr/lib64/libselinux.so.1
nrpe    9948 nrpe  mem    REG                9,2   144792 1051919 /usr/lib64/libpthread-2.17.so
nrpe    9948 nrpe  mem    REG                9,2   106848 1053314 /usr/lib64/libresolv-2.17.so
nrpe    9948 nrpe  mem    REG                9,2    15688 1051678 /usr/lib64/libkeyutils.so.1.5
nrpe    9948 nrpe  mem    REG                9,2    58728 1051843 /usr/lib64/libkrb5support.so.0.1
nrpe    9948 nrpe  mem    REG                9,2    90664 1051808 /usr/lib64/libz.so.1.2.7
nrpe    9948 nrpe  mem    REG                9,2    19776 1053308 /usr/lib64/libdl-2.17.so
nrpe    9948 nrpe  mem    REG                9,2   210840 1051701 /usr/lib64/libk5crypto.so.3.1
nrpe    9948 nrpe  mem    REG                9,2    15920 1051682 /usr/lib64/libcom_err.so.2.1
nrpe    9948 nrpe  mem    REG                9,2   963576 1051755 /usr/lib64/libkrb5.so.3.3
nrpe    9948 nrpe  mem    REG                9,2   320408 1051956 /usr/lib64/libgssapi_krb5.so.2.2
nrpe    9948 nrpe  mem    REG                9,2  2173512 1051792 /usr/lib64/libc-2.17.so
nrpe    9948 nrpe  mem    REG                9,2    42520 1051997 /usr/lib64/libwrap.so.0.7.6
nrpe    9948 nrpe  mem    REG                9,2   117680 1053310 /usr/lib64/libnsl-2.17.so
nrpe    9948 nrpe  mem    REG                9,2  2512832 1051648 /usr/lib64/libcrypto.so.1.0.2k
nrpe    9948 nrpe  mem    REG                9,2   470360 1051690 /usr/lib64/libssl.so.1.0.2k
nrpe    9948 nrpe  mem    REG                9,2   164240 1049135 /usr/lib64/ld-2.17.so
nrpe    9948 nrpe    0r   CHR                1,3      0t0    1028 /dev/null
nrpe    9948 nrpe    1w   CHR                1,3      0t0    1028 /dev/null
nrpe    9948 nrpe    2w   CHR                1,3      0t0    1028 /dev/null
nrpe    9948 nrpe    3u  unix 0xffff961d48d37000      0t0   19091 socket
nrpe    9948 nrpe    6u  IPv4          261850576      0t0     TCP 10.10.10.10:5666->10.10.10.254:39056 (ESTABLISHED)
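
When a process has many open files, the socket line is easy to miss. A small sketch, assuming the default lsof column layout shown above: the hypothetical helper established_tcp_fds reads lsof output on standard input and prints only the numeric descriptors of established TCP connections.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lsof -n -p <pid>` output on stdin and print the
# numeric file descriptors of established TCP sockets, one per line.
# Assumes the default columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME.
established_tcp_fds() {
    awk '$8 == "TCP" && $NF == "(ESTABLISHED)" {
        fd = $4
        sub(/[^0-9]+$/, "", fd)   # strip the access-mode suffix, e.g. "6u" -> "6"
        print fd
    }'
}
```

It can be used as lsof -n -p 9948 | established_tcp_fds; for the listing above it prints 6.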

As you can see, the FD column shows the file descriptor number of the opened file (a network resource here; in 6u the number is 6 and the trailing u means the descriptor is open for reading and writing), and you can use it with gdb to simulate closing the network socket as if the remote end closed it, but from the same machine.

[root@srv ~]# gdb -p 9948
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-110.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Attaching to process 9948
Reading symbols from /usr/sbin/nrpe...Reading symbols from /usr/sbin/nrpe...(no debugging symbols found)...done.
(no debugging symbols found)...done.
....
....
Loaded symbols for /lib64/libpcre.so.1
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
0x00007f91e8295c70 in __read_nocancel () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install nrpe-3.2.0-6.el7.x86_64
(gdb) call shutdown(6, 0)
$1 = 0
(gdb) quit
A debugging session is active.

        Inferior 1 [process 9948] will be detached.

Quit anyway? (y or n) Y
Detaching from program: /usr/sbin/nrpe, process 9948

Just run

call shutdown(FileDescriptorID, 0)

and quit gdb. The second argument, 0, is SHUT_RD, which disables further receives on the socket. In our case the file descriptor is 6, so we executed

call shutdown(6, 0)

The blocked read then returns as if the remote end had closed the connection, so the nrpe process can continue its execution.
Of course, in your case you may have to look for a specific network connection among many others, but lsof is the tool to identify the connection and the right file descriptor number to use in gdb.
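
The interactive session can also be run non-interactively. A minimal sketch, assuming gdb's -batch and -ex options and a PID and descriptor already identified with lsof. The helper name shutdown_socket_cmd is hypothetical, and it only prints the command so it can be reviewed before being run as root. The (int) cast is an addition that is not part of the session above: newer gdb versions may refuse to call a function without debug information unless its return type is given.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the gdb command that shuts down the read side
# (SHUT_RD = 0) of file descriptor <fd> inside process <pid>.
# usage: shutdown_socket_cmd <pid> <fd>
shutdown_socket_cmd() {
    local pid="$1" fd="$2" n
    # Accept plain numbers only, so nothing else can be injected into gdb.
    for n in "$pid" "$fd"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "usage: shutdown_socket_cmd <pid> <fd>" >&2
                return 1
                ;;
        esac
    done
    printf 'gdb -p %s -batch -ex "call (int)shutdown(%s, 0)"\n' "$pid" "$fd"
}
```

For the example in this article, shutdown_socket_cmd 9948 6 prints gdb -p 9948 -batch -ex "call (int)shutdown(6, 0)". With -batch, gdb exits after executing the command, which detaches it from the process.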