I was playing with LXC to debug a problem a user had with OAI inside an unprivileged LXC container. So I set up LXC on my machine. What follows is a recap of the steps. This was run on an up-to-date Fedora 41. This is compiled from various sources:

Basic installation and running: LXC Getting Started
Fedora-specific tips: Setting up unprivileged containers with LXC on Fedora 38
Some details about Linux kernel primitives: Tutorial: Using Linux Primitives to Build Your Own Containers - Stéphane Graber & Christian Brauner. The interesting effect of an unprivileged container root account having all privileges but not the “right to use them” is discussed starting at minute 39.

Installation

Install LXC and start the relevant services.

sudo dnf install lxc lxc-templates lxc-extra
sudo systemctl start lxc

This also starts lxc-net for networking, and there is now an lxcbr0 bridge that will provide connectivity to the container.

The firewall on this default Fedora system is installed and active, and it blocks IP address assignment to the containers. I did not look into this further and simply disabled the firewall for my tests, but this is a security risk.

sudo systemctl stop firewalld.service

Some information on how this could be configured properly is here.

Easy container start

The easy way, which is not recommended, is to start a privileged container. The following will guide you through OS selection and container creation, then start that container.

sudo lxc-create --name mycontainer --template download
sudo lxc-start --name mycontainer

You can then start a shell inside the container, inspect it, show running or stopped containers like so:

sudo lxc-attach --name mycontainer
sudo lxc-info -n mycontainer
sudo lxc-ls --fancy
sudo lxc-ls --stopped

The lxc-info command will show the IP address. Without suitable firewall rules or a disabled firewall, the container won’t have IP connectivity.

A container can be stopped and then disposed of as follows. Note that it’s not recommended to destroy a container while it’s running.

sudo lxc-stop --name mycontainer
sudo lxc-destroy --name mycontainer

The general LXC configuration is at /etc/lxc/default.conf, and the corresponding container configuration is at /var/lib/lxc/mycontainer/config. In the latter directory, there is also the rootfs for the container.

The problem with that approach is that the root user inside the container is mapped to the root user on the host. For security reasons, it’s better to start containers unprivileged.

Correct container start

Creating unprivileged containers is slightly more complex, as it requires mapping the user IDs inside the container (including root) to a range of otherwise unused user IDs on the host. The LXC documentation describes how to create this mapping, but in my case, it was already there:

$ cat /etc/subuid
richie:524288:65536
$ cat /etc/subgid
richie:524288:65536

This means that user richie may use 65536 subordinate UIDs and GIDs on the host, starting at 524288 (the LXC documentation uses 1000000 instead of the preconfigured 524288 above).
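To make the arithmetic concrete, here is a minimal sketch that turns one subuid/subgid entry into the host ID range it grants. The helper name subid_range is made up for this example and is not part of LXC or shadow-utils.

```shell
# Sketch: print the host ID range granted by one /etc/subuid or /etc/subgid entry.
# subid_range is a hypothetical helper name used only for illustration.
subid_range() {
    # Entry format: name:first_id:count
    entry=$1
    first=$(printf '%s\n' "$entry" | cut -d: -f2)
    count=$(printf '%s\n' "$entry" | cut -d: -f3)
    # The range is inclusive, so the last ID is first + count - 1
    last=$((first + count - 1))
    printf '%s-%s\n' "$first" "$last"
}

subid_range "richie:524288:65536"   # prints 524288-589823
```

So container UID 0 ends up as host UID 524288, and the highest container UID, 65535, as host UID 589823.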

Now, we need to tell LXC about this ID mapping. First, copy the general LXC default config for new containers from /etc/lxc/default.conf into .config/lxc/ in the home directory, then append the UID/GID mapping to that file (the LXC Getting Started guide has a script for that!):

lxc.idmap = u 0 524288 65536
lxc.idmap = g 0 524288 65536
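If the values in /etc/subuid and /etc/subgid differ from mine, the two lines can be derived instead of typed by hand. The following is only a sketch of that idea, not the script from the LXC guide; idmap_lines is a hypothetical helper, and it simply takes the first entry it finds for the given user.

```shell
# Sketch: derive the lxc.idmap lines for a user from subuid/subgid-style files.
# idmap_lines is a hypothetical helper; the file paths are arguments so that it
# can be tried on copies before touching real configuration.
idmap_lines() {
    user=$1
    subuid_file=$2
    subgid_file=$3
    # Each entry has the form name:first_id:count; use the first match only
    awk -F: -v u="$user" '$1 == u { print "lxc.idmap = u 0 " $2 " " $3; exit }' "$subuid_file"
    awk -F: -v u="$user" '$1 == u { print "lxc.idmap = g 0 " $2 " " $3; exit }' "$subgid_file"
}

# Possible use (appends to the per-user default config):
#   idmap_lines "$(id -un)" /etc/subuid /etc/subgid >> ~/.config/lxc/default.conf
```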

Finally, the host user needs permission to create network devices on the bridge, so grant it:

echo "$(id -un) veth lxcbr0 10" | sudo tee -a /etc/lxc/lxc-usernet
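Each line in that file has the form "user veth bridge count", where the count limits how many such devices the user may create. Running the command above twice appends the line twice, so a small guard can help. This is a sketch with a hypothetical helper name, add_usernet_entry; since the real file is owned by root, it would have to run with the matching privileges.

```shell
# Sketch: add a "user veth bridge count" line to an lxc-usernet-style file,
# but only if an identical line is not already present.
# add_usernet_entry is a hypothetical helper; it writes to the file it is given.
add_usernet_entry() {
    file=$1
    user=$2
    bridge=$3
    count=$4
    line="$user veth $bridge $count"
    # -x matches whole lines only, -F treats the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```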

Now we are ready to create and start a container. The lxc-* commands cannot simply be run directly from the shell, because, as the LXC documentation explains:

To run unprivileged containers as an unprivileged user, the user must be allocated an empty delegated cgroup (this is required because of the leaf-node and delegation model of cgroup2, not because of liblxc). […] It is not possible to simply start a container from a shell as a user and automatically delegate a cgroup. Therefore, you need to wrap each call to any of the lxc-* commands in a systemd-run command.

Use this to create and start the container:

systemd-run --unit=my-unit --user --scope -p "Delegate=yes" -- lxc-create --name oai --template download
systemd-run --unit=my-unit --user --scope -p "Delegate=yes" -- lxc-start --name oai
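Since every lxc-* call needs the same prefix, a small shell function saves typing. This is a sketch under a few assumptions: lxc_scoped and the LXC_DRY_RUN switch are names I made up here, and unlike the commands above it passes no --unit, so systemd generates a unit name itself.

```shell
# Sketch: wrap any lxc-* call in a delegated systemd user scope.
# lxc_scoped and LXC_DRY_RUN are hypothetical names introduced for this example.
lxc_scoped() {
    if [ "${LXC_DRY_RUN:-0}" = "1" ]; then
        # Dry run: only print the command that would be executed
        printf '%s\n' "systemd-run --user --scope -p Delegate=yes -- $*"
    else
        systemd-run --user --scope -p Delegate=yes -- "$@"
    fi
}

# Show what would run without touching systemd or LXC
LXC_DRY_RUN=1 lxc_scoped lxc-start --name oai
```

Without LXC_DRY_RUN set, lxc_scoped lxc-start --name oai behaves like the second command above, minus the fixed unit name.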

An additional complication was that the container start was aborted with this error message:

Permission denied - Could not access /home/richie

The container did not have sufficient rights to traverse my home directory (which holds the container configuration and rootfs under .local/share/lxc/oai/). The solution is to add an access control list (ACL) entry that grants the container's (mapped) root UID search permission (x) on the home directory:

setfacl -m u:524288:x /home/richie/
getfacl /home/richie/
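The UID in the setfacl call is the host UID that container root maps to, so it can be read from the idmap line instead of being hard-coded. A minimal sketch follows; mapped_root_uid is a hypothetical helper, and it assumes the line is written with spaces around "=", as shown earlier.

```shell
# Sketch: read the host UID that container root (UID 0) maps to from an LXC config.
# mapped_root_uid is a hypothetical helper; it expects a line of the form
#   lxc.idmap = u 0 <host_uid> <count>
mapped_root_uid() {
    # Fields: $1=lxc.idmap  $2="="  $3=u  $4=container start  $5=host start
    awk '$1 == "lxc.idmap" && $3 == "u" && $4 == "0" { print $5; exit }' "$1"
}

# Possible use:
#   setfacl -m "u:$(mapped_root_uid ~/.local/share/lxc/oai/config):x" "$HOME"
```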

It was then possible to attach to and stop the container as before:

lxc-attach --name oai
lxc-stop --name oai

Mount a directory into the container

It’s relatively straightforward to mount a directory into an existing container. First, create the mount point inside the running container. I wanted to mount the OAI directory to debug the problem, so in the container, I created /oai, then stopped the container. Adding

lxc.mount.entry = /home/richie/oai oai none bind 0 0

to .local/share/lxc/oai/config mounts the (existing) /home/richie/oai from the host into /oai of the container. Note that the target is written without a leading slash, as it is relative to the container's rootfs. After restarting the container, the files should be visible inside it.
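The missing leading slash in the target is easy to get wrong, so here is a small sketch that builds the line from a host directory and a container path. mount_entry is a hypothetical helper name; the mount options are the same "none bind 0 0" used above.

```shell
# Sketch: build an lxc.mount.entry line for a bind mount.
# mount_entry is a hypothetical helper used only for illustration.
mount_entry() {
    host_dir=$1
    # Strip one leading slash: the target is given relative to the container rootfs
    target=${2#/}
    printf 'lxc.mount.entry = %s %s none bind 0 0\n' "$host_dir" "$target"
}

mount_entry /home/richie/oai /oai
# prints: lxc.mount.entry = /home/richie/oai oai none bind 0 0
```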

The OAI problem

The problem with the unprivileged container is that root inside the container holds all capabilities (such as CAP_SYS_NICE), but not necessarily the “right” to use them, because the host user that created the container does not have them.

OAI uses a syscall to check the (effective) capabilities for CAP_SYS_NICE, which should indicate that it can create threads with increased priority or specific capabilities. The container root has that capability, but on my system, the normal user account does not have it. Therefore, OAI first detects the capability, but then fails to actually create the threads.

One solution is to drop the capability before starting OAI. This can be achieved by adding

lxc.cap.drop = sys_nice

to the container configuration file .local/share/lxc/oai/config.

After stopping and restarting the container, OAI should detect the missing capability, print a warning, and then run normally.
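To check from a shell whether the capability is really gone, the effective capability mask of a process can be inspected: it appears as a hexadecimal value on the CapEff line of /proc/self/status, and CAP_SYS_NICE is capability number 23. The sketch below tests a single bit of such a mask; has_cap is a hypothetical helper name, and the masks in the example are made-up inputs rather than output from my container.

```shell
# Sketch: test whether a capability bit is set in a hexadecimal mask,
# such as the value on the CapEff line of /proc/self/status.
# has_cap is a hypothetical helper; CAP_SYS_NICE is bit 23.
has_cap() {
    mask=$1
    bit=$2
    # Shift the wanted bit to position 0 and isolate it
    [ $(( (0x$mask >> $bit) & 1 )) -eq 1 ]
}

# Example masks (illustrative values only)
if has_cap 000001ffffffffff 23; then echo "CAP_SYS_NICE present"; fi
if ! has_cap 000001ffff7fffff 23; then echo "CAP_SYS_NICE dropped"; fi

# Inside the container, the real mask could be read with:
#   awk '/^CapEff:/ { print $2 }' /proc/self/status
```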

This “limitation” is described in the YouTube tutorial linked at the top.

LXC Quickstart

2025/04/13
