Create graph for Linux Processes grouped by states using Grafana, InfluxDB and collectd

This article shows how to make a graph showing a Linux machine’s processes states. This plugin could gather the number of the processes grouped by their state or metadata per the selected process defined in the configuration (metadata includes process state, size of the resident segment size (RSS), system/user time used, and so on.). The purpose of this article is to make a graph with all the processes grouped by their state. Graphs per process data are not included here.

main menu
Processes states of a live web server.

The Linux machine is using collectd to gather the processes statistics and send them to the time series back-end – InfluxDB. Grafana is used to visualize the data stored in the time series back-end InfluxDB and organize the graphs in panels and dashboards. Check out the previous articles on the subject to install and configure such software to collect, store and visualize data – Monitor and analyze with Grafana, influxdb 1.8 and collectd under CentOS Stream 9, Monitor and analyze with Grafana, influxdb 1.8 and collectd under Ubuntu 22.04 LTS and Create graph for Linux CPU usage using Grafana, InfluxDB and collectd
The collectd daemon is used to gather data on the Linux system and to send it to the back-end InfluxDB.

Key knowledge for the Processes collectd plugin

  • The collectd plugin Processes official page – https://collectd.org/wiki/index.php/Plugin:Processes
  • The Processes plugin options – https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_processes
  • to enable the Processes plugin, load the plugin with the load directive in /etc/collectd.conf
    LoadPlugin processes
    
  • The Processes plugin collects data every 10 seconds.
  • processses_value – a single Gauge value – a metric, which value that can go up and down. It is used to count the number of processes in the different states (the state is saved in a tag value of one record). So there are multiple gauge values with different tags for the different process states at a given time.
    tag key tag value description
    host server hostname The name of the source this measurement was recorded.
    type cpu ps_state is the type, which will group the processes by states.
    type_instance processes’ states States are – blocked, paging, running, sleeping, stopped, zombies.
  • A Gauge value – a metric, which value that can go up and down. More on the topic – Data sources.

    A GAUGE value is simply stored as-is. This is the right choice for values which may increase as well as decrease, such as temperatures or the amount of memory used.

  • To cross check the value, the user can use the /proc/stat
    [root@srv ~]# cat /proc/stat 
    cpu  804 0 732 6240 198 106 25 0 0 0
    cpu0 444 0 345 3092 121 44 14 0 0 0
    cpu1 359 0 387 3147 76 62 11 0 0 0
    intr 72376 117 9 0 0 0 0 0 0 1 2 0 0 156 0 187 187 0 0 188 273 0 0 0 0 0 0 6574 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    ctxt 216350
    btime 1667997331
    processes 1359
    procs_running 2
    procs_blocked 0
    softirq 38704 2 5003 5 290 6565 0 74 5796 0 20969
    

    Some of the lines are pretty clear about what they mean by “procs_running“, “processes“, “procs_blocked” and so on.

Keep on reading!

Create graph for Linux CPU usage using Grafana, InfluxDB and collectd

This article shows how to make a graph showing a Linux machine’s CPU Usage.

main menu
example cpu usage

The Linux machine is using collectd to gather the load average and send it to the time series back-end – InfluxDB. Grafana is used to visualize the data stored in the time series back-end InfluxDB and organize the graphs in panels and dashboards. Check out the previous articles on the subject to install and configure such software to collect, store and visualize data – Monitor and analyze with Grafana, influxdb 1.8 and collectd under CentOS Stream 9 and Monitor and analyze with Grafana, influxdb 1.8 and collectd under Ubuntu 22.04 LTS.
The collectd daemon is used to gather data on the Linux system and to send it to the back-end InfluxDB.

Key knowledge for the cpu collectd plugin

  • The collectd plugin CPU official page – https://collectd.org/wiki/index.php/Plugin:CPU
  • The CPU plugin options – https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_cpu
  • to enable the CPU plugin, load the plugin with the load directive in /etc/collectd.conf
    LoadPlugin cpu
    
  • The CPU plugin collects data every 10 seconds.
  • cpu_value – 1 derive value is saved in the database. All values are in jiffies – the kernel unit of time. Showing just jiffers is not practical, that’s why all CPU graphs convert jiffers to CPU percentage usage.
    tag key tag value description
    host server hostname The name of the source this measurement was recorded.
    instance execution units number The execution unit this measurement was recorded. For example, systems with 8 cores will have 8 different execution units, so instances from 0 to 7. A graph representing the usage of a single CPU core is possible.
    type cpu The only type available is cpu.
    type_instance CPU usage metrics CPU metrics – idle, interrupt, nice, softirq, steal, system, user, wait.
  • DERIVE value – a metric, in which the change of the value is interesting. For example, it can go up indefinitely and it is important how fast it goes up, there are functions and queries, which will give the user the derivative value.

    These data sources assume that the change of the value is interesting, i.e. the derivative. Such data sources are very common with events that can be counted, for example, the number of emails that have been received per second by an MTA since it was started. The total number of emails is not interesting.

  • To cross check the value, the user can use the /proc/stat
    [root@srv ~]# cat /proc/stat 
    cpu  939 0 988 51486 200 261 56 0 0 0
    cpu0 483 0 473 25796 89 114 25 0 0 0
    cpu1 455 0 514 25690 110 147 31 0 0 0
    intr 123072 118 9 0 0 0 0 0 0 1 6 0 0 156 0 409 409 0 0 1184 501 0 0 0 0 0 0 6823 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    ctxt 279137
    btime 1666874114
    processes 1373
    procs_running 1
    procs_blocked 0
    softirq 64069 2 13685 7 544 6967 0 77 15801 0 26986
    

Keep on reading!

Create graph for Linux Load Average using Grafana, InfluxDB and collectd

This article shows how to make a graph showing a Linux machine’s load average.

main menu
A real load average graph of a web server

The Linux machine is using collectd to gather the load average and send it to the time series back-end – InfluxDB. Grafana is used to visualize the data stored in the time series back-end InfluxDB and organize the graphs in panels and dashboards. Check out the previous articles on the subject to install and configure such software to collect, store and visualize data – Monitor and analyze with Grafana, influxdb 1.8 and collectd under CentOS Stream 9 and Monitor and analyze with Grafana, influxdb 1.8 and collectd under Ubuntu 22.04 LTS.
The collectd daemon is used to gather data on the Linux system and to send it to the back-end InfluxDB.

Key knowledge for the load collectd plugin

  • The collectd plugin Load official page – https://collectd.org/wiki/index.php/Plugin:Load
  • The Load plugin options – https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_load
  • to enable the load plugin, load the plugin with the load directive in /etc/collectd.conf
    LoadPlugin load
    
  • The Load plugin collects data every 10 seconds.
  • load_longterm, load_midterm, load_shortterm – 3 gauge values are saved in the database.
  • Gauge value – a metric, which value that can go up and down.

    A GAUGE value is simply stored as-is. This is the right choice for values which may increase as well as decrease, such as temperatures or the amount of memory used.

  • To cross check the value, the user can use the uptime command under Linux or /proc/loadavg
    [root@srv ~]# uptime
     23:08:09 up 52 min,  2 users,  load average: 1.00, 0.77, 0.38
    [root@srv ~]# cat /proc/loadavg 
    1.00 0.77 0.38 2/176 1900
    

Keep on reading!

Monitor and analyze with Grafana, influxdb 1.8 and collectd under Ubuntu 22.04 LTS

This is an updated version of the previous version of this topic – Monitor and analyze with Grafana, influxdb 1.8 and collectd under CentOS Stream 9, but this time for Ubuntu 22.04 LTS. The article describes how to build modern analytic and monitoring solutions for system and application performance metrics. A solution, which may host all the server’s metrics and a sophisticated application, allows easy analyses of the data and powerful graphs to visualize the data.
A brief introduction to the main three software used to build the proposed solution:

  1. Grafana – an analytics and a web visualization tool. It supports dashboards, charts, graphs, alerts, and many more.
  2. influxdb – a time series database. Bleeding fast reads and writes and optimized for time.
  3. collectd – a data collection daemon, which obtain metrics from the host it is started and sends the metrics to the database (i.e. influxdb). It has around 170 plugins to collect metrics.

What is the task of each tool:

  1. collectd – gathers metrics and statistics using its plugins every 10 seconds on the host it runs and then sends the data over UDP to the influxdb using a simple text-based protocol.
  2. influxdb – listens on an open UDP port for data coming from multiple collectd instances installed on many different devices. In this case, a Linux server running Ubuntu 22.04 LTS.
  3. Grafana – an analytics and a web visualization tool. A web application, which connects to the InfluxDB and visualizes the time series metrics in graphs organized in dashboards. Graphs for CPU, memory, network, storage usage, and many more.
  4. nginx to enable SSL and proxy in front of the Grafana.

The whole solution uses the Ubuntu 22.04 LTS server edition distro. Installing the Ubuntu 22.04 LTS is a mandatory step to proceed further with this article – Installation of base Ubuntu server 22.04 LTS
The UDP influxdb port should be open per IP basis and web port of the web server (nginx) is up to the purpose of the solution – it can be behind a VPN or openly accessible by Internet.

STEP 1) Install additional repositories for Grafana, InfluxDB and collectd.

collectd is part of the Ubuntu official repositories. Grafana and InfluxDB maintain their official repositories. Here is how to install them.
Add the InfluxDB repository by first, importing the key of the InfluxDB repository and add the URL of the repository in /etc/apt/sources.list.

myuser@srv:~$ sudo curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
OK
echo 'deb https://repos.influxdata.com/debian stable main' > /etc/apt/sources.list.d/influxdata.list

Then, repeated the same procedure with the Grafana repository:

myuser@srv:~$ sudo curl -sL https://packages.grafana.com/gpg.key | sudo apt-key add -
Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
OK
echo 'deb https://packages.grafana.com/oss/deb stable main' > /etc/apt/sources.list.d/grafana.list

Execute apt update to include the available file packages from all repositories including the ones:

apt update

Keep on reading!

Add source InfluxDB 1.8 with basic authentication in Grafana using the web interface

This article shows how to add a new source in Grafana with screenshots. The source is InluxDB 1.8 with basic authentication enabled. The main purpose of this article is to give the user knowledge of how to:

  • Enable basic authentication in InfluxDB
  • Create users – administrative and ordinary ones in InfluxDB and give permissions for the database.
  • Add the InfluxDB source in Grafana using web interface. with basic authentication enabled with credentials created in the article.

It is supposed the InfluxDB is installed and running on the loopback 127.0.0.1, at least. If the InfluxDB service is not local for the Grafana service replace the 127.0.0.1 with the appropriate IP and adjust the firewall such that it accepts connections from the Grafana server IP. For installing InfluxDB with detailed information including firewall modifications there is another article here – Monitor and analyze with Grafana, InfluxDB 1.8 and collectd under CentOS Stream 9.
No installation information for InfluxDB or Grafana is included in this article and if they are needed check out the article above.

STEP 1) Create users in InfluxDB.

By default, InfluxDB authentication is disabled and no users are required to access and manage the service and the databases. That’s why, the first thing to do is to create an administrative user, which will manage the databases when the basic authentication will be enabled. At the same time, when creating the administrative user, ordinary users may be created, too.
To connect to the InfluxDB to manage the service the InfluxDB command-line tool influx will be used. influx connects to http://127.0.0.1:8086 – an HTTP interface to access the InfluxDB service.

[root@srv ~]# influx
Connected to http://localhost:8086 version 1.8.10
InfluxDB shell version: 1.8.10
> CREATE USER admin WITH PASSWORD 'aiqu8ohth9Cheeshai]c' WITH ALL PRIVILEGES
> SHOW USERS
user  admin
----  -----
admin true
> CREATE USER collectd WITH PASSWORD 'ohg|ahTh9Sa|quoh8zoh'
> GRANT READ ON "collectd" TO "collectd"
> SHOW USERS
user     admin
----     -----
admin    true
collectd false
>

First, the administrative user with admin name is created, and then the ordinary user with the collectd name. For the ordinary user, the access privileges are granted only for READ on the collectd database. It is typical to name the database and the user accessing it with the same name. The format of the GRANT command is the following:

GRANT "[PRIVILEGES]" ON "[database_name]" TO "[user_name]"

READ privileges are enough for Grafana to access the data.
Keep on reading!

Monitor and analyze with Grafana, influxdb 1.8 and collectd under CentOS Stream 9

This article describes how to build a modern analytic and monitoring solutions for system and application performance metrics. A solution, which may host all the server’s metrics and a sophisticated application, allows easy analyses of the data and powerful graphs to visualize the data.
A brief introduction to the main three software used to build the proposed solution:

  1. Grafana – an analytics and a web visualization tool. It supports dashboards, charts, graphs, alerts, and many more.
  2. influxdb – a time series database. Bleeding fast reads and writes and optimized for time.
  3. collectd – a data collection daemon, which obtain metrics from the host it is started and sends the metrics to the database (i.e. influxdb). It has around 170 plugins to collect metrics.

What is the task of each tool:

  1. collectd – gathers metrics and statistics using its plugins every 10 seconds on the host it runs and then sends the data over UDP to the influxdb using a simple text-based protocol.
  2. influxdb – listens on an open UDP port for data coming from multiple collectd instances installed on many different devices. In this case, a Linux server running CentOS Stream 9.
  3. Grafana – an analytics and a web visualization tool. A web application, which connects to the InfluxDB and visualizes the time series metrics in graphs organized in dashboards. Graphs for CPU, memory, network, storage usage, and many more.
  4. nginx to enable SSL and proxy in front of the Grafana.

The whole solution uses the CentOS Stream 9 Linux distro. Installing the CentOS Stream 9 is a mandatory step to proceed further with this article – Network installation of CentOS Stream 9 (20220606.0) – minimal server installation
The UDP influxdb port should be open per IP basis and web port of the web server (nginx) is up to the purpose of the solution – it can be behind a VPN or openly accessible by Internet.

STEP 1) Install additional repositories for Grafana, influxdb and collectd.

Install CentOS official EPEL and OpsTools repositories. EPEL provides additional packages to the base CentOS packages and OpsTools provides collectd and more collectd plugins than the ones included in the built-in repositories.

dnf install -y epel-release centos-release-opstools

Add the InfluxDB repository by creating a file in /etc/yum.repos.d/influxdb.repo

[influxdb]
name = InfluxDB Repository - RHEL $releasever
baseurl = https://repos.influxdata.com/centos/$releasever/$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key

Finally, add the Grafana repository in file /etc/yum.repos.d/grafana.repo

[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt

Keep on reading!

Install and use collectd-ping under CentOS 8 to monitor latency

Tracking the network latency of the servers’ network is not an easy job. Most monitoring software is capable to monitor the state of the server, but how to monitor the state of the connectivity and the network latency and even the Internet connectivity with some respectful addresses like 1.1.1.1 or 8.8.8.8? It should be easy to do it with ICMP and ping command but using the collectd daemon and one of its plugins offers collectd-ping from https://collectd.org/wiki/index.php/Plugin:Ping to save all the history in a time series back-end and using grafanahttps://grafana.com/ (or other graphs/histograms and etc software) to make graphs.
Using the collectd-ping plugin in conjunction with grafana may reach the similar effect as using the old and gold smokeping.
CentOS 7 included the collectd-ping plugin in its official repository, but in CentOS 8 the plugin is missing! Under Cent OS 8 the CentOS SIG OpsTools https://wiki.centos.org/SpecialInterestGroup/OpsTools includes the collectd-ping plugin in their repository. More on SIG and OpsTools may be obtained in the later page. In general, it is safe to use this repository it would not break user’s system.
Here is how to install and configure it. Real grafana examples are also included at the end.

The example here assumes there is a grafana server installed with influxdb backend.

STEP 1) Add OpsTools repository and install the collectd and collectd-ping.

The OpsTools repository is installed with centos-release-opstools package.
Here is what is going to install:

dnf install -y centos-release-opstools
dnf install -y collectd collectd-ping

Keep on reading!

Reset lost admin password of Grafana server (CentOS 7)

It could happen to lose your Grafana admin password and luckily there is a pretty easy way for the system administrator of the Grafana server to reset the admin password.
There is really good and simple page of how you can reset the admin password with grafana-cli in the documentation of Grafana website. Probably you can check it, too.

The problem is that most distributions like the one we use (CentOS 7) rearrange the home directory of the Grafana software and if you try to use grafana-cli you’ll end up with the error mentioned in the section below the solution we offer you here! And you really could mess it up and hurry to say this software is buggy!
First, “grafana-cli admin reset-admin-password ” needs to know where is your configuration file and Grafana home directory, because from the configuration it will extract what kind of back-end server you use like sqlite3 or MySQL (or other) and the path or login data if needed to the back-end, then it will reset the password with the given one. At the bottom of this article, there is another method of resetting the Grafana admin password without grafana-cli – manual reset in the database.
So first you should check with what command-line arguments was your instance of Grafana server started:
Keep on reading!