After starting a new LXC container, the syslog program (Syslog-ng) began to throw thousands of errors with this kind of message:
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-6977140995289226736
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-6551465724643968476
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-5980833553552494142
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-8820947409424952637
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-8270463809263745561
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-7923279144252216900
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-6181977668994943343
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-7585065875445167421
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-7923279144252216900
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-5826517164673898101
Dec 1 10:50:36 srv kernel: SELinux: inode_doinit_use_xattr: getxattr returned 2 for dev=0:43 ino=-7585065875445167421
Dec 1 11:01:01 h3 rsyslogd[1147]: imjournal: 3871493 messages lost due to rate-limiting (20000 allowed within 600 seconds)
These messages were logged by the thousands. At the same time, the NFS statistics showed a strange peak in getattr calls: something was calling getattr thousands of times per second. Although there were no SELinux denials in audit.log, the dmesg output suggested that SELinux might be to blame.
The LXC container is an application container, which has a bind-mounted directory from the host server. The very same directory is a local NFS share (using NFS-Ganesha) of a GlusterFS volume, and the PHP files are located there.
So the LXC container reads the PHP files from this NFS share. There were no issues accessing the files and the application LXC worked just fine.
The problem disappeared when the NFS share was remounted with an explicit SELinux label, using the context mount option.
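A minimal sketch of such a remount, assuming a hypothetical NFS-Ganesha export 127.0.0.1:/vol1 mounted at /mnt/www (neither value comes from the original setup); only the label httpd_sys_rw_content_t is taken from the text. The helper only prints the commands, so they can be reviewed before being run as root:

```shell
# Hypothetical values; adjust to the real export and mount point.
NFS_EXPORT="127.0.0.1:/vol1"
MOUNT_POINT="/mnt/www"
# context= assigns one SELinux context to every file on the mount.
SELINUX_CTX="system_u:object_r:httpd_sys_rw_content_t:s0"

# Print (not execute) a full umount/mount cycle with the context option.
print_remount_cmds() {
    printf 'umount %s\n' "$MOUNT_POINT"
    printf 'mount -t nfs -o context="%s" %s %s\n' \
        "$SELINUX_CTX" "$NFS_EXPORT" "$MOUNT_POINT"
}

print_remount_cmds
```

The context option is set at mount time, which is why the sketch uses a full umount followed by a mount.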
All the files have the SELinux label httpd_sys_rw_content_t, and after restarting the LXC container there were no more SELinux lines in dmesg or in the syslog logs. The administrator should configure the right SELinux labels for the directories bind-mounted into LXC containers. More on why SELinux sometimes does not report blocks in audit.log here – Selinux permission denied and no log in audit.log.
This article shows how to migrate from the NFS kernel server to the NFS-Ganesha server under CentOS Stream 9. The most important questions when migrating from one program to another are how much downtime there will be and what the clients are expected to do. In this case, what do the clients need to do when NFS-Ganesha is used for the server?
Here are the main points when migrating from the NFS kernel server to NFS-Ganesha:
The nfs-utils and nfs-ganesha packages, and the two programs in general, can be installed on the same system without problems. There are no conflicts when the NFS kernel server and the NFS-Ganesha server are installed at the same time on the same system.
The clients do not need to do anything except remount the NFS mounts.
A new community repository must be added by installing the centos-release-nfs-ganesha5 package. The repository is maintained by a Special Interest Group (SIG) within the CentOS community.
[root@srv ~]# systemctl |grep nfs
proc-fs-nfsd.mount loaded active mounted NFSD configuration filesystem
var-lib-nfs-rpc_pipefs.mount loaded active mounted RPC Pipe File System
nfs-idmapd.service loaded active running NFSv4 ID-name mapping service
nfs-mountd.service loaded active running NFS Mount Daemon
nfs-server.service loaded active exited NFS server and services
nfsdcld.service loaded active running NFSv4 Client Tracking Daemon
nfs-client.target loaded active active NFS client services
The server’s firewall has already been tuned for the NFS kernel server, so there is no need to edit anything in the firewall for the NFS-Ganesha server.
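Before switching servers, it helps to confirm which NFS server is actually running. A minimal sketch that scans a systemctl listing like the one above for the kernel server unit; the function name kernel_nfs_active is hypothetical:

```shell
# Reads `systemctl` listing lines on stdin; succeeds if the kernel NFS
# server unit (nfs-server.service) is reported as loaded and active.
kernel_nfs_active() {
    awk '$1 == "nfs-server.service" && $2 == "loaded" && $3 == "active" { found = 1 }
         END { exit found ? 0 : 1 }'
}

# Two lines taken from the listing above, used as sample input.
sample='nfs-mountd.service loaded active running NFS Mount Daemon
nfs-server.service loaded active exited NFS server and services'

if printf '%s\n' "$sample" | kernel_nfs_active; then
    echo "kernel NFS server is active"
fi
```

On a live system the same check would be `systemctl | kernel_nfs_active`; `systemctl is-active nfs-server` answers the question directly for a single unit.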
This article is for those of you who do not want to install a whole new operating system only to discover some technical details about the default installation, like disk layout, packages included, software versions, and so on. Here we are going to review in several sections what it is like to have a default installation of Fedora Server 35 on a real, not virtual, machine!
The kernel is 5.14.10; it successfully detects the AMD Threadripper 1950X and the system is stable (we booted in UEFI mode).
The installation procedure uses the default options for all installation steps – a minimal network installation of Fedora 35 Server. Note that this is the Fedora Server install, not the minimal install. The server install includes the web console – cockpit version 254. There are 604 installed packages, occupying 1.7G of space:
[root@srv ~]# dnf list installed|wc -l
604
[root@srv ~]# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/fedora_fedora-root 15G 1.4G 14G 10% /
Here are quick steps to cache NFS mounts (it works with NFS-Ganesha servers, too):
Install the daemon tool cachefilesd.
Check the configuration file /etc/cachefilesd.conf. In most cases, there is no need to edit the file; just check that the disk limits suit your system.
Start the cachefilesd daemon.
Mount the network directories with the “fsc” option. Umount and mount them again if they are already mounted. The fsc option is mandatory to enable file caching of a network mount.
Check the stats to see if the file caching is working properly.
The example below is under CentOS 8, but it is almost the same in most Linux distributions.
STEP 1) Install the daemon tool cachefilesd
This is straightforward, just install it with the package manager:
[root@srv ~]# dnf install cachefilesd
Last metadata expiration check: 2:33:44 ago on Tue 08 Dec 2020 07:18:01 AM UTC.
Dependencies resolved.
=============================================================================================================================================================================================
Package Architecture Version Repository Size
=============================================================================================================================================================================================
Installing:
cachefilesd x86_64 0.10.10-4.el8 BaseOS 43 k
Transaction Summary
=============================================================================================================================================================================================
Install 1 Package
Total download size: 43 k
Installed size: 71 k
Is this ok [y/N]: y
Downloading Packages:
cachefilesd-0.10.10-4.el8.x86_64.rpm 3.1 MB/s | 43 kB 00:00
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 2.8 MB/s | 43 kB 00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : cachefilesd-0.10.10-4.el8.x86_64 1/1
Running scriptlet: cachefilesd-0.10.10-4.el8.x86_64 1/1
Verifying : cachefilesd-0.10.10-4.el8.x86_64 1/1
Installed:
cachefilesd-0.10.10-4.el8.x86_64
Complete!
STEP 2) Check the configuration file and tune for your system.
In most cases, the defaults in /etc/cachefilesd.conf are good to start with:
dir /var/cache/fscache
tag mycache
brun 10%
bcull 7%
bstop 3%
frun 10%
fcull 7%
fstop 3%
# Assuming you're using SELinux with the default security policy included in
# this package
secctx system_u:system_r:cachefiles_kernel_t:s0
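The dir line sets the directory where the cache will reside, and the lines with the percentages limit disk space usage. “brun 10%” means the cache can run freely as long as the free disk space stays above 10%. “bcull 7%” means culling of the cache starts when the free space drops below 7%, and “bstop 3%” means no new cache space is allocated while the free space is below 3%. The frun, fcull and fstop lines apply the same kind of limits to the number of free files. More in the man page (or https://linux.die.net/man/5/cachefilesd.conf).
So if the disk normally has less than 10% free space, the configuration file should be edited.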
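The thresholds can be read as a small state table. A minimal sketch, assuming integer free-space percentages and a hypothetical helper name cache_state; the defaults are the values from the configuration file above:

```shell
# Classify the cache state from the free-space percentage, following the
# brun/bcull/bstop semantics of cachefilesd.conf. Integer percentages only.
# Usage: cache_state FREE_PCT [BRUN] [BCULL] [BSTOP]
cache_state() {
    free=$1 brun=${2:-10} bcull=${3:-7} bstop=${4:-3}
    if [ "$free" -lt "$bstop" ]; then
        echo "stopped"      # no new cache allocation until space is freed
    elif [ "$free" -lt "$bcull" ]; then
        echo "culling"      # old cache objects are being discarded
    elif [ "$free" -lt "$brun" ]; then
        echo "between"      # culling, once started, continues until brun
    else
        echo "running"      # cache grows freely, culling is off
    fi
}

cache_state 25   # running
cache_state 5    # culling
cache_state 2    # stopped
```

The free-space percentage of the cache filesystem can be taken from the Use% column of `df -h /var/cache/fscache` (free = 100 minus used).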
STEP 3) Start the cachefilesd daemon.
And enable it to start automatically on boot.
[root@srv ~]# systemctl start cachefilesd
[root@srv ~]# systemctl enable cachefilesd
Created symlink /etc/systemd/system/multi-user.target.wants/cachefilesd.service → /usr/lib/systemd/system/cachefilesd.service.
[root@srv ~]# systemctl status cachefilesd
● cachefilesd.service - Local network file caching management daemon
Loaded: loaded (/usr/lib/systemd/system/cachefilesd.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2020-12-08 10:01:24 UTC; 11s ago
Main PID: 29786 (cachefilesd)
Tasks: 1 (limit: 408616)
Memory: 2.5M
CGroup: /system.slice/cachefilesd.service
└─29786 /usr/sbin/cachefilesd -n -f /etc/cachefilesd.conf
Dec 08 10:01:24 srv systemd[1]: Starting Local network file caching management daemon...
Dec 08 10:01:24 srv systemd[1]: Started Local network file caching management daemon.
Dec 08 10:01:24 srv cachefilesd[29786]: About to bind cache
Dec 08 10:01:24 srv cachefilesd[29786]: Bound cache
Dec 08 10:01:24 srv cachefilesd[29786]: Daemon Started
The status command shows the daemon cachefilesd is running. But does it cache?
STEP 4) Mount the network filesystems with option fsc
To make cachefilesd cache a network mount, the option fsc must be included in the mount options. A remount may not work correctly, so to be sure, a full umount/mount should be executed. Here is an example /etc/fstab file:
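As a minimal sketch of what such entries look like, and of a quick check for NFS entries that are missing the option: the server name, exports and mount points below are made up, and nfs_without_fsc is a hypothetical helper.

```shell
# Print the mount point of every NFS entry in an fstab-style file
# (given as $1) whose option list does not contain "fsc".
nfs_without_fsc() {
    awk '$1 !~ /^#/ && $3 ~ /^nfs4?$/ {
             n = split($4, opts, ",")
             has = 0
             for (i = 1; i <= n; i++) if (opts[i] == "fsc") has = 1
             if (!has) print $2
         }' "$1"
}

# Hypothetical fstab content written to a temporary file for the demo.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
nfs.example.com:/data  /mnt/data  nfs  defaults,fsc,_netdev  0 0
nfs.example.com:/logs  /mnt/logs  nfs  defaults,_netdev      0 0
EOF
nfs_without_fsc "$tmp"    # prints /mnt/logs
rm -f "$tmp"
```

Running `nfs_without_fsc /etc/fstab` on a real system lists the NFS mounts that would not be cached.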
And here is the cache directory filled with files. If there are no files, the FS cache is not being used; probably the mount was not mounted with fsc. Umount and mount the mounts again.
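A quick way to check whether the cache directory is filling up is to count the files in it. A minimal sketch, assuming the cache directory from the configuration above; count_cache_files is a hypothetical helper, and reading the real cache directory usually requires root:

```shell
# Count regular files under the cache directory (the default is the "dir"
# value from /etc/cachefilesd.conf shown earlier).
count_cache_files() {
    find "${1:-/var/cache/fscache}" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Demonstration on a throwaway directory instead of the real cache.
demo=$(mktemp -d)
mkdir -p "$demo/cache"
touch "$demo/cache/obj1" "$demo/cache/obj2"
count_cache_files "$demo"    # prints 2
rm -rf "$demo"
```

A count that stays at 0 while files are being read from the network mount points to a mount without fsc. If the kernel exposes it, /proc/fs/fscache/stats gives the kernel-side counters.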
The GlusterFS built-in NFS server supports only NFS version 3. GlusterFS can also offer NFS exports through NFS-Ganesha, which supports the NFS version 3 and 4 protocols. The NFS-Ganesha server is a user-mode file sharing server, which offers a GlusterFS plugin to export GlusterFS volumes. In the following article, NFS-Ganesha and GlusterFS are installed, and a simple GlusterFS volume is created and then exported through the NFS version 3 and 4 protocols.
The version of the software in this article:
The first line again installs a new repository under the SIG management and the second line installs the NFS-Ganesha server with Gluster plugin.
STEP 3) Create GlusterFS volume
Start the GlusterFS server and create a simple volume with 3 replicas:
Start GlusterFS on all three nodes and enable the GlusterFS communication between the three nodes using the firewall-cmd utility. So execute the following commands:
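A rough sketch of how these steps could look, under stated assumptions: the node names gl1, gl2 and gl3, the volume name vol1 and the brick directory are all hypothetical, and the helper only prints the commands so they can be reviewed before being run as root.

```shell
# Hypothetical names; adjust to the real nodes, volume and brick path.
VOLUME="vol1"
BRICK_DIR="/mnt/brick1/vol1"
NODES="gl1 gl2 gl3"

print_gluster_setup() {
    # On every node: open the firewall and start the GlusterFS daemon.
    echo "firewall-cmd --permanent --add-service=glusterfs"
    echo "firewall-cmd --reload"
    echo "systemctl enable --now glusterd"
    # On the first node only: add the other nodes to the trusted pool.
    first=1
    bricks=""
    for node in $NODES; do
        bricks="$bricks $node:$BRICK_DIR"
        if [ "$first" -eq 1 ]; then first=0; continue; fi
        echo "gluster peer probe $node"
    done
    # On the first node only: create and start the 3-replica volume.
    echo "gluster volume create $VOLUME replica 3$bricks"
    echo "gluster volume start $VOLUME"
}

print_gluster_setup
```

With three bricks and `replica 3`, each file is stored on all three nodes.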
This is a review of the netdata graphs. Here you can see what you can expect to have when you install netdata (version 1.10) on your server.
As you can see, many of the graphs have detailed explanations and some of them have hints about what to monitor and pay attention to.
CHART 1) System Overview and graphs that gather statistics from all parts of the system, like CPU, load, disk, RAM, swap, network, processes, idlejitter, interrupts, softirqs, softnet, entropy, IPC semaphores, uptime.
This is a fast view of the resources of the system and it presents summarized statistics, not detailed ones! For example, you can expect to see the total CPU usage, not per core or processor, and so on.
CHART 2) CPU and Load
1) Total CPU utilization, netdata Quotation: “Total CPU utilization (all cores). 100% here means there is no CPU idle time at all. You can get per core usage at the CPUs section and per application usage at the Applications Monitoring section. Keep an eye on iowait. If it is constantly high, your disks are a bottleneck and they slow your system down. Another important metric worth monitoring, is softirq. A constantly high percentage of softirq may indicate network driver issues.” and 2) System Load Average – netdata Quotation: “Current system load, i.e. the number of processes using CPU or waiting for system resources (usually CPU and disk). The 3 metrics refer to 1, 5 and 15 minute averages. Linux calculates this once every 5 seconds. Netdata reads them from /proc/loadavg.”
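The same three load averages can be read directly from /proc/loadavg. A minimal sketch; parse_loadavg is a hypothetical helper that reads one /proc/loadavg-style line on stdin:

```shell
# Print the 1, 5 and 15 minute load averages from a /proc/loadavg-style line.
parse_loadavg() {
    awk '{ printf "1min=%s 5min=%s 15min=%s\n", $1, $2, $3 }'
}

# Sample line in the /proc/loadavg format (values are made up).
echo "0.52 0.41 0.30 2/613 12345" | parse_loadavg
# prints: 1min=0.52 5min=0.41 15min=0.30
```

On a live Linux system: `parse_loadavg < /proc/loadavg`.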
CHART 3) Disk
1) Total Disk I/O for all disks from /proc/vmstat. You can easily see how much data is read from and written to the disks. 2) Memory paged from/to disk.
CHART 4) RAM
1) Read from /proc/meminfo. It shows the total RAM and how much is free, used, cached and in buffers. Together with the swap graph this is like the “free” Linux command in the browser. 2) Read from /proc/meminfo. It shows total, free and used swap memory. 3) Swap I/O – Read from /proc/vmstat. More interesting than the previous one, because here you can see how often your swap device is used. In fact, if you have ins and outs here, even a couple of them, you probably need more physical RAM or you have misconfigured a service or an application, which could be identified by the graphs in Applications->mem or User->mem, which show the applications’ and users’ RAM usage.
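The swap I/O counters behind this graph can be inspected by hand. A minimal sketch, assuming the pswpin and pswpout counters of /proc/vmstat (pages swapped in and out since boot); swap_counters is a hypothetical helper reading vmstat-style lines on stdin:

```shell
# Extract the swap-in/swap-out page counters from /proc/vmstat-style input.
swap_counters() {
    awk '$1 == "pswpin"  { pin = $2 }
         $1 == "pswpout" { pout = $2 }
         END { printf "pswpin=%d pswpout=%d\n", pin, pout }'
}

# Sample input in the /proc/vmstat format (values are made up).
printf '%s\n' 'pgpgin 100' 'pswpin 12' 'pswpout 34' | swap_counters
# prints: pswpin=12 pswpout=34
```

On a live system, run `swap_counters < /proc/vmstat` twice a few seconds apart; growing numbers mean the swap device is actively used.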
CHART 5) All network traffic on all interfaces – no virtual ones included, but it includes IPv4 and IPv6 traffic.
CHART 6) Processes
1) Read /proc/stat. It appears the Running are “processes in the CPU” and Blocked are in Disk sleep. netdata Quotation: “System processes, read from /proc/stat. Running are the processes in the CPU. Blocked are processes that are willing to enter the CPU, but they cannot, e.g. because they wait for disk activity.” 2) The number of new processes created per second. 3) All system processes – the total number for the given time.
CHART 7) Context Switches and idle
1) Context Switches – how many times the CPU is switching from one process, thread or task to another. 2) netdata Quotation: “idle jitter is calculated by netdata. A thread is spawned that requests to sleep for a few microseconds. When the system wakes it up, it measures how many microseconds have passed. The difference between the requested and the actual duration of the sleep, is the idle jitter. This number is useful in real-time environments, where CPU jitter can affect the quality of the service (like VoIP media gateways).”
CHART 8) Interrupts and softirqs
1) Total number of CPU interrupts, 2) System interrupts – hardware interrupts – which part of your hardware is generating the interrupts – you could identify a hardware abuser. 3) CPU softirqs in detail, read from /proc/softirqs – you could identify a software abuser – a service or a process.
CHART 9) softnet and entropy
1) netdata Quotation: “Statistics for CPUs SoftIRQs related to network receive work. Break down per CPU core can be found at CPU / softnet statistics. processed states the number of packets processed, dropped is the number packets dropped because the network device backlog was full (to fix them on Linux use sysctl to increase net.core.netdev_max_backlog), squeezed is the number of packets dropped because the network device budget ran out (to fix them on Linux use sysctl to increase net.core.netdev_budget).” 2) netdata Quotation: “Entropy, is a pool of random numbers (/dev/random) that is mainly used in cryptography. If the pool of entropy gets empty, processes requiring random numbers may run a lot slower (it depends on the interface each program uses), waiting for the pool to be replenished. Ideally a system with high entropy demands should have a hardware device for that purpose (TPM is one such device). There are also several software-only options you may install, like haveged, although these are generally useful only in servers.”
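The processed, dropped and squeezed counters come from /proc/net/softnet_stat, where each line holds hexadecimal fields for one CPU. A minimal sketch that decodes the first three fields; decode_softnet is a hypothetical helper, and using the line order as the CPU index is an approximation:

```shell
# Decode /proc/net/softnet_stat-style lines (hex fields, one line per CPU).
# Field 1 = processed, field 2 = dropped, field 3 = squeezed (time_squeeze).
decode_softnet() {
    cpu=0
    while read -r processed dropped squeezed _rest; do
        printf 'cpu%d processed=%d dropped=%d squeezed=%d\n' \
            "$cpu" "0x$processed" "0x$dropped" "0x$squeezed"
        cpu=$((cpu + 1))
    done
}

# Sample line (values are made up).
printf '%s\n' '0000a3f1 00000000 00000002 00000000' | decode_softnet
# prints: cpu0 processed=41969 dropped=0 squeezed=2
```

On a live system: `decode_softnet < /proc/net/softnet_stat`. The current limits can be read with `sysctl net.core.netdev_max_backlog net.core.netdev_budget`.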
CHART 10) IPC Semaphores and Uptime
1) The total IPC semaphores used in the system, 2) uptime of the system.
CHART 11) CPU
Utilization by core/logical processor. You can see what percentage of the CPU is spent in user, system, iowait (probably disk operations!) and softirq (mainly network, but it could also be a program with many threads and a lot of context switching between them). Here you can see the first core utilization graph has softirq of 6.0 and the others have none – this is because the network card is using only the first core/processor (more to follow on the subject).
CHART 12) Interrupts
Interrupts by core/logical processor. Hardware interrupts – enp3s0_28 (the network card), NMI, LOC, PMI, IWI, RES, CAL, TLB and so on. You can see the network interrupts are processed only by the first core/processor. You can change this by setting CPU affinity to split them across all CPUs. In most cases you do not need this, because the latency is better with one core/processor, but on a busy server the first core could easily reach 100% utilization and the network packet processing would suffer.
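CPU affinity for an interrupt is set by writing a hexadecimal CPU bitmask to /proc/irq/<N>/smp_affinity. A minimal sketch of building that mask from a list of CPU indexes; cpu_mask is a hypothetical helper:

```shell
# Build the hexadecimal CPU mask expected by /proc/irq/<N>/smp_affinity
# from a list of 0-based CPU indexes, e.g. "0 1 2 3".
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$((mask | (1 << cpu)))
    done
    printf '%x\n' "$mask"
}

cpu_mask 0          # 1  (first core only)
cpu_mask 0 1 2 3    # f  (first four cores)
```

The IRQ number <N> of the network card can be looked up in /proc/interrupts. Note that a running irqbalance service may overwrite a manually set mask.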
CHART 13) softirqs
Software interrupts – TIMER, NET_TX, NET_RX, TASKLET, SCHED, RCU – network, context switches synchronization and so on.
CHART 14) softnet
Quotation netdata: “Statistics for per CPUs core SoftIRQs related to network receive work. Total for all CPU cores can be found at System / softnet statistics. processed states the number of packets processed, dropped is the number packets dropped because the network device backlog was full (to fix them on Linux use sysctl to increase net.core.netdev_max_backlog), squeezed is the number of packets dropped because the network device budget ran out (to fix them on Linux use sysctl to increase net.core.netdev_budget).” You can see how many network-receive SoftIRQs each CPU handles. As you can see, the network is again processed by the first core/processor.
CHART 15) throttling and cpufreq
1) The throttling of the CPU cores, if any, and 2) CPU frequency changes. If your server is idle, you will probably see some cores/processors drop to a lower frequency more often.
CHART 16) C-state residency for each core/processor.
CHART 17) Memory
1) Total available RAM for applications, 2) Committed Memory is all the memory allocated by processes and 3) page faults – Quotation netdata: “A page fault is a type of interrupt, called trap, raised by computer hardware when a running program accesses a memory page that is mapped into the virtual address space, but not actually loaded into main memory. If the page is loaded in memory at the time the fault is generated, but is not marked in the memory management unit as being loaded in memory, then it is called a minor or soft page fault. A major page fault is generated when the system needs to load the memory page from disk or swap memory.”
CHART 18) Kernel and Swap memory
1) Quotation netdata: “Dirty is the amount of memory waiting to be written to disk. Writeback is how much memory is actively being written to disk.” – you can tune how much dirty memory the kernel holds. 2) Memory used by kernel – netdata Quotation: “The total amount of memory being used by the kernel. Slab is the amount of memory used by the kernel to cache data structures for its own use. KernelStack is the amount of memory allocated for each task done by the kernel. PageTables is the amount of memory dedicated to the lowest level of page tables (A page table is used to turn a virtual address into a physical memory address). VmallocUsed is the amount of memory being used as virtual address space.” 3) slab – netdata Quotation: “Reclaimable is the amount of memory which the kernel can reuse. Unreclaimable can not be reused even when the kernel is lacking memory.”
CHART 19) Hugepages
netdata Quotation: “Hugepages is a feature that allows the kernel to utilize the multiple page size capabilities of modern hardware architectures. The kernel creates multiple pages of virtual memory, mapped from both physical RAM and swap. There is a mechanism in the CPU architecture called “Translation Lookaside Buffers” (TLB) to manage the mapping of virtual memory pages to actual physical memory addresses. The TLB is a limited hardware resource, so utilizing a large amount of physical memory with the default page size consumes the TLB and adds processing overhead. By utilizing Huge Pages, the kernel is able to create pages of much larger sizes, each page consuming a single resource in the TLB. Huge Pages are pinned to physical RAM and cannot be swapped/paged out.”
CHART 20) deduper (ksm)
You can save some RAM with this feature. netdata Quotation: “Kernel Same-page Merging (KSM) performance monitoring, read from several files in /sys/kernel/mm/ksm/. KSM is a memory-saving de-duplication feature in the Linux kernel (since version 2.6.32). The KSM daemon ksmd periodically scans those areas of user memory which have been registered with it, looking for pages of identical content which can be replaced by a single write-protected page (which is automatically copied if a process later wants to update its content). KSM was originally developed for use with KVM (where it was known as Kernel Shared Memory), to fit more virtual machines into physical memory, by sharing the data common between them. But it can be useful to any application which generates many instances of the same data.”
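How much RAM is saved can be estimated from the pages_sharing counter in /sys/kernel/mm/ksm/ multiplied by the page size. A minimal sketch; ksm_saved_mib is a hypothetical helper and the 4096-byte page size is an assumed default:

```shell
# Estimate the memory saved by KSM in MiB: pages_sharing * page size.
# Usage: ksm_saved_mib PAGES_SHARING [PAGE_SIZE_BYTES]
ksm_saved_mib() {
    pages_sharing=$1
    page_size=${2:-4096}
    echo $(( pages_sharing * page_size / 1024 / 1024 ))
}

ksm_saved_mib 51200    # prints 200
```

On a live system: `ksm_saved_mib "$(cat /sys/kernel/mm/ksm/pages_sharing)" "$(getconf PAGESIZE)"`.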
CHART 21) Charts with the performance of the disks and disk devices like RAIDs – charts for every device in the system. The most important charts here are the disk utilization ones, where you can see how busy your device is!
1) The disk I/O Bandwidth – Amount of data transferred to and from disk – “md2”. 2) Disk Completed I/O operations – netdata Quotation: “Completed disk I/O operations. Keep in mind the number of operations requested might be higher, since the system is able to merge adjacent to each other (see merged operations chart).”
CHART 22) Disk I/O
1) The average I/O Operations size of device “md2”, 2) Disk space utilization of device “md2” and 3) inodes usage of device “md2”.
CHART 23) Disk I/O of md0
1) Disk I/O Bandwidth, 2) Disk Completed I/O Operations, 3) The average I/O Operations
CHART 24) Disk I/O of sda
1) Disk I/O Bandwidth, 2) Disk Completed I/O Operations, 3) Disk current I/O Operations
CHART 25) Disk I/O of sda 2
1) Backlog – netdata Quotation: “Backlog is an indication of the duration of pending disk operations. On every I/O event the system is multiplying the time spent doing I/O since the last update of this field with the number of pending operations. While not accurate, this metric can provide an indication of the expected completion time of the operations in progress.”, 2) Disk Utilization Time – one of the most important charts, you can see if your disk is saturated, netdata Quotation: “Disk Utilization measures the amount of time the disk was busy with something. This is not related to its performance. 100% means that the system always had an outstanding operation on the disk. Keep in mind that depending on the underlying technology of the disk, 100% here may or may not be an indication of congestion.” 3) Average Completed I/O Operation Time 4) Average Completed I/O Operation Time
CHART 26) Disk I/O of sda 3
1) netdata Quotation: “The average service time for completed I/O operations. This metric is calculated using the total busy time of the disk and the number of completed operations. If the disk is able to execute multiple parallel operations the reporting average service time will be misleading.” 2) netdata Quotation: “The number of merged disk operations. The system is able to merge adjacent I/O operations, for example two 4KB reads can become one 8KB read before given to disk.” 3) netdata Quotation: “The sum of the duration of all completed I/O operations. This number can exceed the interval if the disk is able to execute I/O operations in parallel.”
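Both the disk utilization and the average service time described above can be derived from two samples of /proc/diskstats. A minimal sketch, assuming the standard field layout (field 4 = reads completed, field 8 = writes completed, field 13 = milliseconds spent doing I/O); disk_util is a hypothetical helper and the sample values are made up:

```shell
# Given two /proc/diskstats lines for the same device taken interval_ms apart,
# print utilization (%) and average service time (ms per completed operation).
# Usage: disk_util FIRST_LINE SECOND_LINE INTERVAL_MS
disk_util() {
    printf '%s\n%s\n' "$1" "$2" | awk -v interval="$3" '
        NR == 1 { ops1 = $4 + $8; busy1 = $13 }
        NR == 2 {
            ops  = ($4 + $8) - ops1      # operations completed in the interval
            busy = $13 - busy1           # ms the disk was busy in the interval
            util = 100 * busy / interval
            svc  = (ops > 0) ? busy / ops : 0
            printf "util=%.1f%% svctm=%.2fms\n", util, svc
        }'
}

s1='8 0 sda 1000 10 80000 500 2000 20 160000 900 0 1200 1400'
s2='8 0 sda 1100 10 88000 550 2100 20 168000 950 0 1700 1500'
disk_util "$s1" "$s2" 1000    # prints: util=50.0% svctm=2.50ms
```

As the quotation notes, the service time is misleading for disks that execute operations in parallel, because the busy time is shared by overlapping operations.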
CHART 27) Performance statistics for a NFS client working on the system.
1) RPC – calls per second, 2) What kind of RPC calls and how many of them.