Enabling the Nginx plugin for collectd under CentOS (or any other system using SELinux) can be confusing for a newcomer. Most guides on the Internet simply install collectd-nginx:
yum install -y collectd-nginx
and configure it in nginx.conf and collectd.conf. Still, the statistics might not work as expected: collectd may be unable to gather any data from Nginx.
SELinux may prevent the collectd daemon (more precisely, its nginx plugin) from connecting to Nginx and reading the Nginx stats page.
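Before blaming SELinux, it may help to confirm that the Nginx stats page answers at all. Here is a minimal sketch; the function name check_stub_status is made up for this example, and the URL is an argument because the real address depends on your own nginx.conf and collectd.conf.

```shell
#!/bin/sh
# Return 0 if the given URL answers with an Nginx stub_status page.
# The URL is a parameter: the real address is whatever your nginx.conf
# exposes; nothing here is taken from the original setup.
check_stub_status() {
    url=$1
    # -f: fail on HTTP errors, -s: no progress meter, -S: still print errors,
    # -k: accept a self-signed certificate on a local HTTPS endpoint.
    curl -fsSk "$url" | grep -q '^Active connections:'
}
```

For example, `check_stub_status https://127.0.0.1/nginx_status && echo "status page OK"` (the URL is hypothetical). A root shell is usually not confined the way the collectd daemon is, so this check can succeed while collectd still fails, which is a good hint that SELinux is the culprit.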
Checking the status of the collectd service shows that it reports a problem:
[root@srv ~]# systemctl status collectd
● collectd.service - Collectd statistics daemon
   Loaded: loaded (/usr/lib/systemd/system/collectd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2020-04-26 02:43:20 UTC; 3min 5s ago
     Docs: man:collectd(1)
           man:collectd.conf(5)
 Main PID: 18521 (collectd)
    Tasks: 11 (limit: 26213)
   Memory: 5.3M
   CGroup: /system.slice/collectd.service
           └─18521 /usr/sbin/collectd

Apr 26 02:43:20 srv.local systemd[1]: Started Collectd statistics daemon.
Apr 26 02:43:20 srv.local collectd[18521]: Initialization complete, entering read-loop.
Apr 26 02:43:20 srv.local collectd[18521]: nginx plugin: curl_easy_perform failed:
Apr 26 02:43:20 srv.local collectd[18521]: read-function of plugin `nginx' failed. Will suspend it for 20.000 seconds.
Apr 26 02:43:40 srv.local collectd[18521]: nginx plugin: curl_easy_perform failed:
Apr 26 02:43:40 srv.local collectd[18521]: read-function of plugin `nginx' failed. Will suspend it for 40.000 seconds.
Apr 26 02:44:20 srv.local collectd[18521]: nginx plugin: curl_easy_perform failed:
Apr 26 02:44:20 srv.local collectd[18521]: read-function of plugin `nginx' failed. Will suspend it for 80.000 seconds.
Apr 26 02:45:40 srv.local collectd[18521]: nginx plugin: curl_easy_perform failed:
Apr 26 02:45:40 srv.local collectd[18521]: read-function of plugin `nginx' failed. Will suspend it for 160.000 seconds.
Not very informative: we already knew there were no statistics for Nginx. But /var/log/messages reveals the exact cause, a missing SELinux permission (the audit log, covered further down, records it as well):
Apr 26 02:44:20 srv collectd[18521]: nginx plugin: curl_easy_perform failed:
Apr 26 02:44:20 srv collectd[18521]: read-function of plugin `nginx' failed. Will suspend it for 80.000 seconds.
Apr 26 02:44:23 srv dbus-daemon[1402]: [system] Activating service name='org.fedoraproject.Setroubleshootd' requested by ':1.61' (uid=0 pid=1370 comm="/usr/sbin/sedispatch " label="system_u:system_r:auditd_t:s0") (using servicehelper)
Apr 26 02:44:23 srv dbus-daemon[1402]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Apr 26 02:44:23 srv setroubleshoot[18583]: SELinux is preventing /usr/sbin/collectd from name_connect access on the tcp_socket port 443. For complete SELinux messages run: sealert -l 3966ca03-1a2d-4957-a1c7-45ea7d125ef0
Apr 26 02:44:24 srv platform-python[18583]: SELinux is preventing /usr/sbin/collectd from name_connect access on the tcp_socket port 443.#012#012***** Plugin catchall_boolean (47.5 confidence) suggests ******************#012#012If you want to allow nis to enabled#012Then you must tell SELinux about this by enabling the 'nis_enabled' boolean.#012#012Do#012setsebool -P nis_enabled 1#012#012***** Plugin catchall_boolean (47.5 confidence) suggests ******************#012#012If you want to allow collectd to tcp network connect#012Then you must tell SELinux about this by enabling the 'collectd_tcp_network_connect' boolean.#012#012Do#012setsebool -P collectd_tcp_network_connect 1#012#012***** Plugin catchall (6.38 confidence) suggests **************************#012#012If you believe that collectd should be allowed name_connect access on the port 443 tcp_socket by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'reader#0' --raw | audit2allow -M my-reader0#012# semodule -X 300 -i my-reader0.pp#012
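The last log entry is hard to read because syslog writes every line break as the escape #012 (octal 012 is the newline character). A small sketch that expands those escapes again; the function name decode_syslog_newlines is made up for this example.

```shell
#!/bin/sh
# Expand the "#012" escapes that syslog writes in place of real line
# breaks. Reads log text on stdin and writes the expanded text to stdout.
decode_syslog_newlines() {
    # The pattern is given as a string, so "#" is not taken as an awk comment.
    awk '{ gsub("#012", "\n"); print }'
}
```

It could be used as `grep 'SELinux is preventing' /var/log/messages | decode_syslog_newlines`. The sealert -l command printed in the log entry itself shows the same report.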
And the fix is really easy:
setsebool -P collectd_tcp_network_connect=1
"-P" makes the setting persistent across reboots; without it, only the value in the running policy changes and the boolean reverts at the next boot.
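If you would rather check the current value first, here is a small sketch built on getsebool, which prints lines of the form "name --> on". The helper names sebool_state and ensure_sebool_on are made up for this example.

```shell
#!/bin/sh
# Print "on" or "off" for an SELinux boolean, taken from getsebool
# output of the form "name --> on".
sebool_state() {
    getsebool "$1" | awk '{ print $3 }'
}

# Enable the boolean persistently, but only if it is not already on.
ensure_sebool_on() {
    if [ "$(sebool_state "$1")" = "on" ]; then
        echo "$1 is already on"
    else
        setsebool -P "$1" 1
    fi
}
```

Run as root, `ensure_sebool_on collectd_tcp_network_connect` has the same effect as the command above and does nothing when the boolean is already enabled.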
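It is a good idea to restart the service afterwards, because the plugin is probably suspended for a long period and the next attempt to read the Nginx statistics could be hours away. How long depends on how much time passed between the last start of the collectd service and the fix, since the suspend interval doubles after every failed read (20, 40, 80, 160 seconds in the status output above).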
[root@srv ~]# setsebool -P collectd_tcp_network_connect=1
[root@srv ~]# systemctl restart collectd
[root@srv ~]# systemctl status collectd
● collectd.service - Collectd statistics daemon
   Loaded: loaded (/usr/lib/systemd/system/collectd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2020-04-26 02:53:26 UTC; 3s ago
     Docs: man:collectd(1)
           man:collectd.conf(5)
 Main PID: 18855 (collectd)
    Tasks: 11 (limit: 26213)
   Memory: 3.5M
   CGroup: /system.slice/collectd.service
           └─18855 /usr/sbin/collectd

Apr 26 02:53:26 srv.local collectd[18855]: plugin_load: plugin "cpu" successfully loaded.
Apr 26 02:53:26 srv.local collectd[18855]: plugin_load: plugin "disk" successfully loaded.
Apr 26 02:53:26 srv.local collectd[18855]: plugin_load: plugin "interface" successfully loaded.
Apr 26 02:53:26 srv.local collectd[18855]: plugin_load: plugin "load" successfully loaded.
Apr 26 02:53:26 srv.local collectd[18855]: plugin_load: plugin "memory" successfully loaded.
Apr 26 02:53:26 srv.local collectd[18855]: plugin_load: plugin "network" successfully loaded.
Apr 26 02:53:26 srv.local collectd[18855]: plugin_load: plugin "nginx" successfully loaded.
Apr 26 02:53:26 srv.local collectd[18855]: plugin_load: plugin "protocols" successfully loaded.
Apr 26 02:53:26 srv.local collectd[18855]: Systemd detected, trying to signal readiness.
Apr 26 02:53:26 srv.local collectd[18855]: Initialization complete, entering read-loop.
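To see why a restart is worth it, here is a rough sketch of how the suspend interval grows. It assumes the interval simply keeps doubling from the first value, which is what the first status output shows for the first four failures. The function name suspend_intervals is made up, and collectd may cap the interval (collectd.conf(5) describes a MaxReadInterval option for that), so treat the numbers as an upper-bound illustration.

```shell
#!/bin/sh
# Print the suspend interval (in seconds) after each consecutive failure,
# assuming it starts at $1 and doubles every time, for $2 failures.
suspend_intervals() {
    first=$1
    failures=$2
    interval=$first
    i=0
    while [ "$i" -lt "$failures" ]; do
        echo "$interval"
        interval=$((interval * 2))
        i=$((i + 1))
    done
}
```

`suspend_intervals 20 4` prints 20, 40, 80 and 160, matching the log. Under the same assumption the twelfth failure would suspend the plugin for 40960 seconds, which is more than 11 hours.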
The audit log
There are also records in the audit log (/var/log/audit/audit.log):
[root@srv ~]# ausearch -c 'reader#0' --raw
type=AVC msg=audit(1587867699.817:429761): avc: denied { name_connect } for pid=17888 comm="reader#0" dest=80 scontext=system_u:system_r:collectd_t:s0 tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket permissive=0
type=SYSCALL msg=audit(1587867699.817:429761): arch=c000003e syscall=42 success=no exit=-13 a0=3 a1=7f3e086447d0 a2=10 a3=2f59cdb87a6dbc items=0 ppid=1 pid=17888 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="reader#0" exe="/usr/sbin/collectd" subj=system_u:system_r:collectd_t:s0 key=(null)ARCH=x86_64 SYSCALL=connect AUID="unset" UID="root" GID="root" EUID="root" SUID="root" FSUID="root" EGID="root" SGID="root" FSGID="root"
type=PROCTITLE msg=audit(1587867699.817:429761): proctitle="/usr/sbin/collectd"
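The AVC record carries everything that matters: the denied permission, the command, the destination port and the two SELinux contexts. A minimal sketch that pulls just those fields out of raw audit records; the function name avc_summary is made up, and the pattern only covers network denials that carry a dest= field like the one above.

```shell
#!/bin/sh
# Summarise AVC denial records read from stdin. For every record that has
# a dest= field, print one line with the denied permission(s), the command
# name, the destination port and the source and target contexts.
# Records of other types (SYSCALL, PROCTITLE) are skipped.
avc_summary() {
    sed -n 's/.*avc: *denied *{ *\([^}]*[^ }]\) *}.*comm="\([^"]*\)".*dest=\([0-9]*\).*scontext=\([^ ]*\).*tcontext=\([^ ]*\).*/\1 comm=\2 dest=\3 \4 -> \5/p'
}
```

Fed with `ausearch -c 'reader#0' --raw | avc_summary`, the record above shrinks to a single line naming name_connect, port 80, collectd_t and http_port_t.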
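audit2allow translates the records into a human-readable policy rule. The comment in its output says the access is already allowed, presumably because the boolean had been switched on before this command was run: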
[root@srv ~]# ausearch -c 'reader#0' --raw | audit2allow

#============= collectd_t ==============

#!!!! This avc is allowed in the current policy
allow collectd_t http_port_t:tcp_socket name_connect;
You may want to build an SELinux policy module from these records and install it into the current policy:
ausearch -c 'reader#0' --raw | audit2allow -M my-collectd-module && semodule -X 300 -i my-collectd-module.pp
If any of the SELinux-related commands used above are missing, install the policycoreutils and policycoreutils-python-utils packages:
yum install -y policycoreutils-python-utils policycoreutils
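To find out quickly which commands are actually missing, something like the following sketch can help. The function name missing_tools is made up; it only checks whether each name is found on the PATH and does not know which package provides it.

```shell
#!/bin/sh
# Print every command from the argument list that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For the commands used in this post: `missing_tools ausearch audit2allow semodule setsebool sealert`. No output means everything is in place.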