Pass-through the NVIDIA card in a LXC container

Pass-through the NVIDIA card to be used in the LXC container is simple enough and there are three simple rules to watch for:

  • mount bind the NVIDIA devices in /dev to the LXC container’s /dev
  • Allow cgroup access for the bound /dev devices.
  • Install the same version of the NVIDIA driver/software under the host and the LXC container or there will be multiple errors of the sort – version mismatch

main menu
config

When using the LXC container pass-through, i.e. mount bind, the video card may be used simultaneously on the host and on all the LXC containers where it is mount bind. Multiple LXC containers share the video device(s).

This is a working LXC 4.0.12 configuration:

# Distribution configuration
lxc.include = /usr/share/lxc/config/common.conf
lxc.arch = x86_64

# Container specific configuration
lxc.rootfs.path = dir:/mnt/storage1/servers/gpu1u/rootfs
lxc.uts.name = gpu1u

# Network configuration
lxc.net.0.type = macvlan
lxc.net.0.link = enp1s0f1
lxc.net.0.macvlan.mode = bridge
lxc.net.0.flags = up
lxc.net.0.name = eth0
lxc.net.0.hwaddr = fe:77:3f:27:15:60

# Allow cgroup access
lxc.cgroup2.devices.allow = c 195:* rwm
lxc.cgroup2.devices.allow = c 234:* rwm
lxc.cgroup2.devices.allow = c 237:* rwm


# Pass through device files
lxc.mount.entry = /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry = /dev/nvidia1 dev/nvidia1 none bind,optional,create=file
lxc.mount.entry = /dev/nvidia2 dev/nvidia2 none bind,optional,create=file
lxc.mount.entry = /dev/nvidia3 dev/nvidia3 none bind,optional,create=file
lxc.mount.entry = /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry = /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry = /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry = /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry = /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=dir


# Autostart
lxc.group = onboot
lxc.start.auto = 1
lxc.start.delay = 10

Keep on reading!

Change the LXC container root folder under CentOS with SELinux

The default LXC container folder in CentOS (all versions – 7,8, Stream 8 and Stream 9) is /var/lib/lxc, which may resides in the root partition. When changing the lxc.rootfs or (the main directory /var/lib/lxc) to another place, the containers may still work without any additional SELinux permissions. Some tools like lxc-attach would definitely stop working with permission errors – lxc_attach_run_shell: 1333 Permission denied – failed to exec shell. This article will show how to use lxc-create and SELinux commands to properly change the LXC container’s rootfs.
For detailed information how to create a LXC container check out – Run LXC CentOS Stream 9 container with bridged network under CentOS Stream 9 or Run LXC Ubuntu 22.04 LTS container with bridged network under CentOS Stream 9.

Create LXC container with not default path

  • Change the rootfs only. To change only the LXC container root filesystem location use “–dir=” lxc-create option:
    lxc-create --template download -n mycontainer2 --dir=/mnt/storage/servers/mycontainer2 -- --dist centos --release 9-Stream --arch amd64
    

    It will place the files under /mnt/storage/servers/mycontainer2, but the configuration will still be located in /var/lib/lxc/mycontainer2/.

    [root@srv ~]# ls -altr /var/lib/lxc/mycontainer2/
    total 16
    drwxr-xr-x. 3 root root 4096 Oct 14 13:42 ..
    drwxr-xr-x. 2 root root 4096 Oct 14 13:42 rootfs
    -rw-r-----. 1 root root  775 Oct 14 13:42 config
    drwxrwx---. 3 root root 4096 Oct 14 13:42 .
    [root@srv ~]# ls -altr /var/lib/lxc/mycontainer2/rootfs/
    total 8
    drwxr-xr-x. 2 root root 4096 Oct 14 13:42 .
    drwxrwx---. 3 root root 4096 Oct 14 13:42 ..
    [root@srv ~]# ls -altr /mnt/storage/servers/mycontainer2/
    total 76
    drwxrwxrwt.  2 root root 4096 Aug  9  2021 tmp
    drwxr-xr-x.  2 root root 4096 Aug  9  2021 srv
    lrwxrwxrwx.  1 root root    8 Aug  9  2021 sbin -> usr/sbin
    drwxr-xr-x.  2 root root 4096 Aug  9  2021 opt
    drwxr-xr-x.  2 root root 4096 Aug  9  2021 mnt
    drwxr-xr-x.  2 root root 4096 Aug  9  2021 media
    lrwxrwxrwx.  1 root root    9 Aug  9  2021 lib64 -> usr/lib64
    lrwxrwxrwx.  1 root root    7 Aug  9  2021 lib -> usr/lib
    drwxr-xr-x.  2 root root 4096 Aug  9  2021 home
    dr-xr-xr-x.  2 root root 4096 Aug  9  2021 boot
    lrwxrwxrwx.  1 root root    7 Aug  9  2021 bin -> usr/bin
    dr-xr-xr-x.  2 root root 4096 Aug  9  2021 afs
    dr-xr-xr-x.  2 root root 4096 Oct 14 07:11 sys
    dr-xr-xr-x.  2 root root 4096 Oct 14 07:11 proc
    drwxr-xr-x. 12 root root 4096 Oct 14 07:11 usr
    drwxr-xr-x.  8 root root 4096 Oct 14 07:11 run
    drwxr-xr-x. 18 root root 4096 Oct 14 07:11 var
    dr-xr-x---.  2 root root 4096 Oct 14 07:12 root
    drwxr-xr-x.  2 root root 4096 Oct 14 07:12 selinux
    drwxr-xr-x. 19 root root 4096 Oct 14 07:15 .
    drwxr-xr-x.  4 root root 4096 Oct 14 13:41 ..
    drwxr-xr-x.  3 root root 4096 Oct 14 13:42 dev
    drwxr-xr-x. 63 root root 4096 Oct 14 13:42 etc
    
  • Change the LXC container path – the folder containing the configuration and the container’s root filesystems use “-P”
    lxc-create -P /mnt/storage/servers/ --template download -n mycontainer -- --dist centos --release 9-Stream --arch amd64
    

    All the LXC container configuration and root filesystem will be placed under /mnt/storage/servers/[container_name], which in the example above is /mnt/storage/servers/mycontainer

    [root@srv ~]# ls -al /mnt/storage/servers/mycontainer
    total 16
    drwxrwx---.  3 root root 4096 Oct 14 13:38 .
    drwxr-xr-x.  4 root root 4096 Oct 14 13:41 ..
    -rw-r-----.  1 root root  780 Oct 14 13:38 config
    drwxr-xr-x. 19 root root 4096 Oct 14 07:15 rootfs
    

It is better to use the “-P” and to change the LXC container location than only the filesystem path. In this case, a good practice is to make a symbolic link in /var/lib/lxc/[container-name] to the new location:

ln -s /mnt/storage/servers/mycontainer /var/lib/lxc/mycontainer

So all LXC tools will continue to work without explicitly adding an option for the new path of this container.

Change the SELinux file context to be container_var_lib_t of the LXC root filesystem

Add the file context container_var_lib_t to the container’s root filesystem path and change the SELinux labels.
First, verify all the needed tools are installed:

dnf install -y policycoreutils-python-utils container-selinux

Then, add a new file context to the path /mnt/storage/servers/mycontainer and run the restorecon to change the SELinux labels to container_var_lib_t

semanage fcontext -a -t container_var_lib_t '/mnt/storage/servers/mycontainer(/.*)?'
restorecon -Rv /mnt/storage/servers/mycontainer

The file context may be shown with:

[root@srv ~]# ls -alZ /mnt/storage/servers/mycontainer
total 16
drwxrwx---.  3 root root unconfined_u:object_r:container_var_lib_t:s0 4096 Oct 14 13:38 .
drwxr-xr-x.  4 root root unconfined_u:object_r:mnt_t:s0               4096 Oct 14 13:41 ..
-rw-r-----.  1 root root unconfined_u:object_r:container_var_lib_t:s0  780 Oct 14 13:38 config
drwxr-xr-x. 19 root root unconfined_u:object_r:container_var_lib_t:s0 4096 Oct 14 07:15 rootfs

Failing to set the proper SELinux labels may result to errors such as lxc_attach_run_shell: 1333 Permission denied – failed to exec shell

lxc_attach_run_shell: 1333 Permission denied – failed to exec shell

An annoying error when using the LXC container tools like lxc-attach, which is really simple to fix.

[root@srv ~]# lxc-attach -n db-cluster-3
lxc_container: attach.c: lxc_attach_run_shell: 1333 Permission denied - failed to exec shell
[root@srv ~]#

This error just reports the bash shell in the container cannot be started and the SELinux audit file adds some errors, too:

type=AVC msg=audit(1665745824.682:24229): avc:  denied  { entrypoint } for  pid=20646 comm="lxc-attach" path="/usr/bin/bash" dev="md3" ino=111806476 scontext=system_u:system_r:unconfined_service_t:s0 tcontext=unconfined_u:object_r:var_log_t:s0 tclass=file
type=SYSCALL msg=audit(1665745824.682:24229): arch=c000003e syscall=59 success=no exit=-13 a0=24412c6 a1=7ffe87c07170 a2=2443870 a3=7ffe87c08c60 items=0 ppid=20644 pid=20646 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts12 ses=3304 comm="lxc-attach" exe="/usr/bin/lxc-attach" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
type=PROCTITLE msg=audit(1665745824.682:24229): proctitle=6C78632D617474616368002D6E0064622D636C75737465722D33
type=AVC msg=audit(1665745824.682:24230): avc:  denied  { entrypoint } for  pid=20646 comm="lxc-attach" path="/usr/bin/bash" dev="md3" ino=111806476 scontext=system_u:system_r:unconfined_service_t:s0 tcontext=unconfined_u:object_r:var_log_t:s0 tclass=file
type=SYSCALL msg=audit(1665745824.682:24230): arch=c000003e syscall=59 success=no exit=-13 a0=7f08b5e579a0 a1=7ffe87c07170 a2=2443870 a3=7ffe87c08c60 items=0 ppid=20644 pid=20646 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts12 ses=3304 comm="lxc-attach" exe="/usr/bin/lxc-attach" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
type=PROCTITLE msg=audit(1665745824.682:24230): proctitle=6C78632D617474616368002D6E0064622D636C75737465722D33

So clearly, the problem is in SELinux, and turn it off temporarily with

setenforce 0

Turning off the SELinux is not the right thing! There are two aspects to the problem:

  • Missing SELinux rules, which are installed with a special package container-selinux
  • Wrong SELinux permissions for the LXC container’s root directory. In most cases, the user just changes the default /var/lib/lxc/[container] to something new and the LXC works, but it breaks some LXC parts.

Installing container-selinux is easy:

dnf install -y container-selinux

Or the old yum:

yum install -y container-selinux

Then check the SELinux attributes with:

[root@srv ~]# ls -altrZ /mnt/storage/servers/mycontainer/
drwxr-xr-x. root root unconfined_u:object_r:var_log_t:s0 ..
-rw-r--r--. root root unconfined_u:object_r:var_log_t:s0 config
drwxrwx---. root root unconfined_u:object_r:var_log_t:s0 .
drwxr-xr-x. root root unconfined_u:object_r:var_log_t:s0 rootfs

The problem is var_log_t, which is an SELinux file context and it should be container_var_lib_t. Stop the container and fix the permissions. If the default directory (/var/lib/lxc) were used, it would not have this problem. Adding the SELinux file context definition to the new directory is mandatory when changing the directory root of a container:

[root@srv ~]# semanage fcontext -a -t container_var_lib_t '/mnt/storage/servers/mycontainer(/.*)?'
[root@srv ~]# restorecon -Rv /mnt/storage/servers/mycontainer/
restorecon reset /mnt/storage/servers/mycontainer context unconfined_u:object_r:var_log_t:s0->unconfined_u:object_r:container_var_lib_t:s0
.....
.....
restorecon reset /mnt/storage/servers/mycontainer/config context unconfined_u:object_r:var_log_t:s0->unconfined_u:object_r:container_var_lib_t:s0

All files permissions under /mnt/storage/servers/mycontainer/ should be fixed with the restorecon. Start the LXC container and try to attach it with lxc-attach. Now, there should not be any errors:

[root@srv ~]# lxc-attach -n mycontainer
[root@mycontainer ~]#

The files’ context is the right one – container_var_lib_t:

[root@srv ~]# ls -altrZ /mnt/storage/servers/mycontainer/
drwxr-xr-x. root root unconfined_u:object_r:var_log_t:s0 ..
-rw-r--r--. root root unconfined_u:object_r:container_var_lib_t:s0 config
drwxrwx---. root root unconfined_u:object_r:container_var_lib_t:s0 .
drwxr-xr-x. root root unconfined_u:object_r:container_var_lib_t:s0 rootfs

More on LXC containershttps://ahelpme.com/category/software/lxc/.

Run LXC Ubuntu 22.04 LTS container with bridged network under CentOS Stream 9

In continuation of the previous article Run LXC CentOS Stream 9 container with bridged network under CentOS Stream 9, this time the LXC container will be Ubuntu 22.04 LTS Jammy Jellyfish.
To receive a better understanding why to use LXC or a much detailed information of some steps in this article it is better to visit the previously mention article and the original Run LXC CentOS 8 container with bridged network under CentOS 8.

STEP 1) Install the needed software EPEL repository and the LXC and its dependencies

To install LXC software the EPEL CentOS Stream 9 repository must be installed. At present, the LXC included in CentOS Stream 9 EPEL repository is 4.0.

dnf install -y epel-release
dnf install -y lxc lxc-templates container-selinux
dnf install -y wget tar

lxc-templates uses template “download” to download different Linux distribution images from http://images.linuxcontainers.org/, which now redirects to http://uk.lxd.images.canonical.com/ (an Ubuntu lxd images mirror).
The container-selinux should be installed only if the host, i.e. the CentOS Stream 9 install, is with enabled SELinux. The packages offers additional SELinux rules or for the LXC and LXC tools like lxc-attach and more.

STEP 2) Create a Ubuntu 22.04 LTS with the help of LXC templates

[root@srv ~]# lxc-create --template download -n mycontainer -- --dist centos --release 9-Stream --arch amd64

In addition, there is a “–variant” option along with “--dist” and “--release” to specify which variant to install – default, cloud, desktop or other. There is a variant column in the table on the images’ page mentioned above.
Keep on reading!

Run LXC CentOS Stream 9 container with bridged network under CentOS Stream 9

In continue of the previous article with CentOS 8 – Run LXC CentOS 8 container with bridged network under CentOS 8, here is an updated version with CentOS Stream 9 running LXC container. In this case, the LXC container is CentOS Stream 9, too.
Under CentOS 8, the LXC software is from branch 3.x, but in CentOS Stream 9 the LXC is 4.x and there are some differences in the LXC configuration file.
It’s worth mentioning the differences between docker/podman containers and LXC from the previous article:

  • Multiprocesses.
  • Easy configuration modification. Even hot-plugin supported.
  • Unprivileged Linux containers.
  • Complex network setups. Multiple network interfaces connected to different networks, for example.
  • Live systemd, i.e. systemd or SysV init are booted as usual. Much of the software relies on systemd/udev features and in many cases, it is really hard to run software without a systemd or init process

Here are the steps to boot a CentOS Stream 9 container under CentOS Stream 9 host server:

STEP 1) Install EPEL repository.

EPEL CentOS Stream 9 repository now includes LXC 4.0 software.

dnf install -y epel-release

STEP 2) Install LXC software and start LXC service.

At present, the LXC software version is 4.0.12. The package lxc-templates includes template scripts to create a Linux distribution environment like CentOS, Ubuntu, Debian, Gentoo, ArchLinux, Oracle, Alpine, and many others and it also includes the configuration templates to start these Linux distributions. In fact, lxc-templates now includes a download script to download images from the Internet.

dnf install -y lxc lxc-templates container-selinux
dnf install -y wget tar

The wget and tar are required if LXC templates installation is going to be performed.
There is an additional package for container’s SELinux, which should be installed before starting the LXC service, because some of the SELinux rules may not apply in the system. If the SELinux is disabled the installation of container-selinux package might be skipped.

STEP 3) Create a CentOS Stream 9 container with the help of LXC templates and run it.

Use the lxc-templates to prepare a CentOS Stream 9 container environment. The currently available containers are listed here http://images.linuxcontainers.org/, which now redirects to http://uk.lxd.images.canonical.com/ (an Ubuntu lxd images mirror). Check out the URL and choose the right container. Here the CentOS Stream 9 amd64, i.e. release 9-Stream, is used.

[root@srv ~]# lxc-create --template download -n mycontainer -- --dist centos --release 9-Stream --arch amd64

In addition, there is a “–variant” option along with “--dist” and “--release” to specify which variant to install – default, cloud, desktop or other. There is a variant column in the table on the images’ page mentioned above.
Keep on reading!

lxc and interface lo does not exist in virtualized server

Virtualizing a real server with an LXC container is pretty easy – do a rsync and run it. Sometimes there are some glitches when starting the LXC container for the first time. Such errors like the following – no networking available at the start, but when attached to the started container it seems to have the network interfaces with no IPs. Even, though it is possible to set the IPs manually the init scripts do not work.

[root@srv ~]# lxc-start -F -n n7763.node-int.info
lxc-start: live300.mytv.bg: start.c: proc_pidfd_open: 1607 Function not implemented - Failed to send signal through pidfd
INIT: version 2.88 booting

   OpenRC 0.12.4 is starting up Gentoo Linux (x86_64) [LXC]

 * /proc is already mounted
 * Mounting /run ... * /run/openrc: creating directory
 * /run/lock: creating directory
 * /run/lock: correcting owner
 * Caching service dependencies ... [ ok ]
 * setting up tmpfiles.d entries for /dev ... [ ok ]
 * Creating user login records ... [ ok ]
 * Wiping /tmp directory ... [ ok ]
 * Bringing up network interface lo ...RTNETLINK answers: File exists
 [ ok ]
 * Updating /etc/mtab ... [ ok ]
 * Bringing up interface lo
 *   ERROR: interface lo does not exist
 *   Ensure that you have loaded the correct kernel module for your hardware
 * ERROR: net.lo failed to start
 * setting up tmpfiles.d entries ... [ ok ]
INIT: Entering runlevel: 3
 * Loading iptables state and starting firewall ... [ ok ]
 * Bringing up interface lo
 *   ERROR: interface lo does not exist
 *   Ensure that you have loaded the correct kernel module for your hardware
 * ERROR: net.lo failed to start
 * Bringing up interface eth0
 *   ERROR: interface eth0 does not exist
 *   Ensure that you have loaded the correct kernel module for your hardware
 * ERROR: net.eth0 failed to start

And it appeared that the old /dev was still in place, which messed up with virtualization and the init scripts.
The solution is simple just

  1. remove the existing /dev
  2. create a new empty one

And the LXC container of the real server will start with a network as usual.

So when virtualizing a real server into LXC container after doing RSYNC of the storage, it is mandatory to create an empty /dev, /proc, and /sys directories!

More on the LXC containers – Run LXC CentOS 8 container with bridged network under CentOS 8.

Debug options for LXC and lxc-start when lxc container could not start

Setup and running LXC container is really easy, but sometimes it is unclear why the LXC container could not start. Most of the time, there is a generic error, which says nothing for the real reason:

root@srv ~ # lxc-start -n test-lxc
lxc-start: test-lxc: lxccontainer.c: wait_on_daemonized_start: 867 Received container state "ABORTING" instead of "RUNNING"
lxc-start: test-lxc: tools/lxc_start.c: main: 306 The container failed to start
lxc-start: test-lxc: tools/lxc_start.c: main: 309 To get more details, run the container in foreground mode
lxc-start: test-lxc: tools/lxc_start.c: main: 311 Additional information can be obtained by setting the --logfile and --logpriority options

No specific reason why the LXC container test-lxc can not be started and the lxc-start command failed. There is just an offer to use the logging options and here is how the administrator of the box may do it by including the following lxc-start options:

-l DEBUG –logfile=test-lxc.log –logpriority=9

Here is a real-world example of an old kernel trying to run LXC 4.0
Keep on reading!