One of our big Nginx cache servers was recently upgraded to 70T of storage, which is pretty big storage for a proxy. In a hurry to put the new storage to use, we changed only the “max_size” option of the proxy_cache_path directive! After a week in production, the cache reached 23T and just stopped growing, for no apparent reason! The space and the Nginx max_size were fine: 75T of total space and 70T allowed for the proxy cache, yet Nginx had not added any more data for two days after reaching 23T of occupied space, which was impossible, because all files were kept for 5 years and around 200G of new cache was generated per day. There were no errors in the logs, and we even use the “virtual host traffic status module”, which shows live status information such as the used space and more for the Nginx proxy cache, but still no clue why the cache did not grow above this 23T threshold! And Nginx began to remove cached objects!
proxy_cache_path /mnt/cache levels=1:2 keys_zone=STATIC:900m inactive=42600h max_size=70000g;
It turned out we had exhausted the shared memory zone limit for the cache zone! And the Nginx cache just stopped growing.
According to the Nginx manual, “one megabyte zone can store about 8 thousand keys”. Apparently, after 23T of cached files we had passed 7 200 000 keys (900 x 8000) and exhausted the limit we configured in the proxy_cache_path line!
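A quick back-of-the-envelope check of that limit (the 8 thousand keys per megabyte figure comes from the Nginx documentation; the rest is simple arithmetic on our numbers):
[root@srv ~]# echo $((900 * 8000))
7200000
So the 900m zone tops out at around 7.2 million cached objects, which is right where our cache stalled.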
The solution is really simple: just increase the size of the shared memory zone for the cache zone.
proxy_cache_path /mnt/cache levels=1:2 keys_zone=STATIC:4000m inactive=42600h max_size=70000g;
In the past, with a smaller cache (15T), 900 Mbytes were enough for the cache’s shared memory zone. Now we set it to 4000 Mbytes, which can store approximately 32 000 000 keys. In our setup (yours may differ a lot!), 23T of cache exhausted the 900M of key shared memory, so 4000M, which is more than 4 times bigger, will probably be enough for the remaining free storage to be used to its full extent.
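As a rough sanity check of the 4000m value, here is a sizing sketch that assumes the average cached object stays around 3.5 MB (23T of data for roughly 7 million keys); these are estimates for our traffic, not exact figures. The first command estimates how many objects fit in 70T at about 3.5 MB each (70T expressed in megabytes, multiplied by 2/7), the second how many megabytes of keys_zone those objects need at about 8 thousand keys per megabyte:
[root@srv ~]# echo $((70 * 1024 * 1024 * 2 / 7))
20971520
[root@srv ~]# echo $((70 * 1024 * 1024 * 2 / 7 / 8000))
2621
So roughly 2600M is the bare minimum for a full 70T cache, and 4000m leaves comfortable headroom.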
Be careful: this operation will trigger the “Nginx cache loader” to load the cache index again and may produce IO while it runs!
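If the IO from the cache loader is a concern, the loader can be throttled with the loader_files, loader_sleep and loader_threshold parameters of proxy_cache_path. The values below are only an illustration of the syntax, not what we run in production:
proxy_cache_path /mnt/cache levels=1:2 keys_zone=STATIC:4000m inactive=42600h max_size=70000g loader_files=200 loader_sleep=100ms loader_threshold=300ms;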
Nginx shared memory zone size
Nginx workers use shared mappings (mmap), which are different from SysV and POSIX shared memory (so you cannot use the ipcs tool to check the shared memory). You should check how much memory the processes currently use. Here is how to get the size of the shared memory zone occupied by the Nginx processes; as you can see, each Nginx worker is around 900M in the “RSS” (Resident Set Size) column:
[root@srv ~]# ps -o rss,pid,comm,user,cmd -C nginx
   RSS   PID COMMAND         USER     CMD
904888  3979 nginx           nginx    nginx: worker process
905116  3980 nginx           nginx    nginx: worker process
904828  3981 nginx           nginx    nginx: worker process
905176  3982 nginx           nginx    nginx: worker process
905196  3983 nginx           nginx    nginx: worker process
905008  3984 nginx           nginx    nginx: worker process
904908  3985 nginx           nginx    nginx: worker process
905372  3986 nginx           nginx    nginx: worker process
905088  3987 nginx           nginx    nginx: worker process
902688  3988 nginx           nginx    nginx: worker process
904932  3989 nginx           nginx    nginx: worker process
905032  3990 nginx           nginx    nginx: worker process
 26452  3991 nginx           nginx    nginx: cache manager process
 33928  8148 nginx           root     nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
For a single Nginx process:
[root@srv ~]# cat /proc/3981/status |grep RssShmem
RssShmem:     894240 kB
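To check all workers at once, a simple loop over the worker PIDs does the job; this is just a convenience one-liner, not something Nginx provides, and each worker prints its own RssShmem line:
[root@srv ~]# for p in $(pgrep -f 'nginx: worker'); do grep RssShmem /proc/$p/status; done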
You can check the occupied inodes of your file system with df to get approximately how many files you have:
[root@srv ~]# df -i
Filesystem         Inodes   IUsed      IFree IUse% Mounted on
devtmpfs         16452656     715   16451941    1% /dev
tmpfs            16455999       1   16455998    1% /dev/shm
tmpfs            16455999    1153   16454846    1% /run
tmpfs            16455999      17   16455982    1% /sys/fs/cgroup
/dev/md1          2076704   39687    2037017    2% /
/dev/md3       1214685184 6897020 1207788164    1% /mnt/cache
tmpfs            16455999       5   16455994    1% /run/user/0
The used inodes are around 6 897 020 and have not grown for days. This number is very close to the maximum number of keys that a 900M key shared memory zone may store (about 7 200 000)!
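Note that df -i is only an approximation, because it counts every inode on the file system, including the levels=1:2 directories. If an exact count of the cached files is needed, a plain find works, though on millions of files it takes quite a while:
[root@srv ~]# find /mnt/cache -type f | wc -l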
Two days after changing the key shared memory zone limit to 4000 Mbytes:
proxy_cache_path /mnt/cache levels=1:2 keys_zone=STATIC:4000m inactive=42600h max_size=70000g;
The Nginx workers passed 900 Mbytes of RSS (Resident Set Size) and reached 1 Gbyte. The occupied cache size grew by 1T and continued to grow.
[root@srv ~]# ps -o rss,pid,comm,user,cmd -C nginx
    RSS   PID COMMAND         USER     CMD
  52256  8148 nginx           root     nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
1005624 16899 nginx           nginx    nginx: worker process
1005948 16900 nginx           nginx    nginx: worker process
1005936 16901 nginx           nginx    nginx: worker process
1005912 16902 nginx           nginx    nginx: worker process
1005832 16903 nginx           nginx    nginx: worker process
1005836 16904 nginx           nginx    nginx: worker process
1005868 16905 nginx           nginx    nginx: worker process
1005932 16906 nginx           nginx    nginx: worker process
1005796 16907 nginx           nginx    nginx: worker process
1005980 16908 nginx           nginx    nginx: worker process
1005848 16909 nginx           nginx    nginx: worker process
1005888 16910 nginx           nginx    nginx: worker process
  26328 16911 nginx           nginx    nginx: cache manager process
The occupied inodes also increased to 7 484 291, which means the cache added almost 600 000 new files.
[root@srv ~]# df -i
Filesystem         Inodes   IUsed      IFree IUse% Mounted on
devtmpfs         16452656     715   16451941    1% /dev
tmpfs            16455999       1   16455998    1% /dev/shm
tmpfs            16455999    1153   16454846    1% /run
tmpfs            16455999      17   16455982    1% /sys/fs/cgroup
/dev/md1          2076704   39690    2037014    2% /
/dev/md3       1214685184 7484582 1207200602    1% /mnt/cache
tmpfs            16455999       5   16455994    1% /run/user/0