Simple Container Security

You can apply the principle of least privilege with a few simple modifications to your Containerfile & by adding a couple of arguments to your container run command. I’ll be referencing a small FastAPI app called netinfo. Here is its Containerfile:

Use an unprivileged user

Don’t run your application as the root user within the container. Instead:

  • Create an unprivileged user.
  • Run the app from within that user’s home directory.

Add the following two lines to your Containerfile:

RUN groupadd -r netinfo && useradd -r -g netinfo netinfo
WORKDIR /home/netinfo

Instruct podman to run the container as the netinfo user:

podman run --rm -u netinfo -p 8000:8000 <IMAGE-ID>

If you were to run a shell inside the container, you would be connected as the netinfo user within the /home/netinfo directory:

netinfo@fc6f4f4f68ca:~$ pwd
/home/netinfo

Disable the root user

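An easy way to do this is to change root’s default shell from /bin/bash to /usr/sbin/nologin. This blocks interactive logins as root; it does not stop a process from running as UID 0, so you still need the unprivileged user. Add the following line to your Containerfile: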

RUN chsh -s /usr/sbin/nologin root

Use a read-only file system

If your app doesn’t require writing to disk, use a read-only file system.

podman run --rm --read-only -u netinfo -p 8000:8000 <IMAGE-ID>

You would then be unable to modify the file system:

netinfo@fc6f4f4f68ca:~$ touch hello
touch: cannot touch 'hello': Read-only file system

Prevent privilege escalation

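Add the argument --security-opt=no-new-privileges to your run command. It stops processes in the container from gaining additional privileges, for example through setuid binaries.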

podman run --rm --read-only --security-opt=no-new-privileges -u netinfo -p 8000:8000 <IMAGE-ID>

Drop all kernel capabilities and add as needed

Add the argument --cap-drop=all to your run command.

podman run --rm --read-only --cap-drop=all --security-opt=no-new-privileges -u netinfo -p 8000:8000 <IMAGE-ID>

You can then use the --cap-add argument to add back only the capabilities your app needs. For example:

  • CAP_NET_ADMIN allows the process to perform network administration tasks, such as configuring interfaces and routing tables,
  • CAP_NET_BIND_SERVICE allows it to bind to ports below 1024,
  • CAP_SYS_TIME allows it to modify the system clock,
  • etc…
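
To keep the growing list of flags manageable, you could wrap the run command in a small shell function. The following is only a sketch: run_hardened and the DRY_RUN variable are names made up for this example (they are not part of podman), and the user and port are the ones used for netinfo above. Capability names are passed straight through to --cap-add.

```shell
# Hypothetical helper; run_hardened and DRY_RUN are illustrative names.
# Usage: run_hardened IMAGE [CAPABILITY...]
run_hardened() {
  if [ "$#" -lt 1 ]; then
    echo "usage: run_hardened IMAGE [CAPABILITY...]" >&2
    return 2
  fi
  image="$1"
  shift

  # Turn each remaining argument into a --cap-add flag.
  cap_flags=""
  for cap in "$@"; do
    cap_flags="$cap_flags --cap-add=$cap"
  done

  # $cap_flags is deliberately unquoted so each flag becomes its own word.
  set -- run --rm --read-only --cap-drop=all $cap_flags \
    --security-opt=no-new-privileges -u netinfo -p 8000:8000 "$image"

  if [ -n "${DRY_RUN:-}" ]; then
    echo "podman $*"   # print the command instead of running it
  else
    podman "$@"
  fi
}
```

Running it with DRY_RUN=1 set, for instance with NET_BIND_SERVICE as the only capability, prints the assembled podman command so you can review it before starting the container.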

Limit resource usage with Control Groups

While Linux Namespaces allow you to separate access to resources, they don’t allow you to limit usage. You need Linux Control Groups for that.

# only half a CPU core
podman run --cpus="0.5" ...

# only 225MB maximum available memory
podman run --memory="225m" ...
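
If you want to check that the limits took effect, you can read the cgroup files from inside the container. The sketch below assumes a host using cgroup v2, where the limits usually appear in /sys/fs/cgroup/cpu.max and /sys/fs/cgroup/memory.max; the helper names cpu_fraction and mem_mib are made up for this example. It also assumes podman interprets the m suffix as MiB, so 225m would show up as 235929600 bytes.

```shell
# cpu_fraction "QUOTA PERIOD": convert a cgroup v2 cpu.max value
# (for example "50000 100000") into a number of CPU cores.
cpu_fraction() {
  echo "$1" | awk '{
    if ($1 == "max") print "unlimited"
    else printf "%.2f\n", $1 / $2
  }'
}

# mem_mib BYTES: convert a cgroup v2 memory.max value (bytes) to MiB.
mem_mib() {
  if [ "$1" = "max" ]; then
    echo "unlimited"
  else
    echo $(( $1 / 1024 / 1024 ))
  fi
}
```

Inside a running container you would call them as cpu_fraction "$(cat /sys/fs/cgroup/cpu.max)" and mem_mib "$(cat /sys/fs/cgroup/memory.max)". On rootless setups the CPU controller may not be delegated, in which case the CPU limit might not be applied at all.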

seccomp, SELinux and AppArmor

These are a bit more advanced and outside the scope of this article. In short, seccomp gives you even finer-grained control over the system calls a process within your container can make.

SELinux is a MAC (mandatory access control) mechanism that labels users, processes, files & system resources. It governs which user or process can access which files & resources. AppArmor is similar, but uses file paths and focuses on processes.