You can apply the principle of least privilege with a few simple modifications to your Containerfile and a couple of extra arguments to your container
run command. I’ll be referencing a small FastAPI app called netinfo. Here is its Containerfile:
Use an unprivileged user
Don’t run your application as the root user within the container. Instead:
- Create an unprivileged user.
- Run the app from within that user’s home directory.
Add the following two lines to your Containerfile:
RUN groupadd -r netinfo && useradd -r -g netinfo netinfo
WORKDIR /home/netinfo
Instruct podman to run the container with the netinfo user:
podman run --rm -u netinfo -p 8000:8000 <IMAGE-ID>
If you were to run a shell inside the container, you would be connected as the netinfo user within the /home/netinfo directory:
netinfo@fc6f4f4f68ca:~$ pwd
/home/netinfo
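To catch a forgotten -u flag, the container's entrypoint script could refuse to start as root. The following is only a minimal sketch and not part of netinfo: the function name require_unprivileged is hypothetical, and it takes the UID as an argument so that it can be tried outside a container.

```shell
#!/bin/sh
# Hypothetical entrypoint guard: succeed only for a non-root UID.
# Possible use in an entrypoint script:
#   require_unprivileged "$(id -u)" || exit 1
require_unprivileged() {
    uid="$1"
    if [ "$uid" -eq 0 ]; then
        echo "refusing to run as root (UID 0)" >&2
        return 1
    fi
    return 0
}
```

Passing the UID in, rather than calling id inside the function, keeps the check easy to test with arbitrary values.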
Disable the root user
An easy way to do this is to change the root user’s default shell to /usr/sbin/nologin. Add the following line to your Containerfile:
RUN chsh -s /usr/sbin/nologin root
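To confirm the change took effect, you could look up root's login shell, which is the seventh colon-separated field of its /etc/passwd entry. This is a small sketch; the helper name login_shell_of and its optional file argument are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the login shell (7th field) of a user
# from a passwd-format file.
#   $1 = user name
#   $2 = passwd file (defaults to /etc/passwd)
login_shell_of() {
    awk -F: -v user="$1" '$1 == user { print $7 }' "${2:-/etc/passwd}"
}
# Possible use inside the container:
#   login_shell_of root
```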
Use a read-only file system
If your app doesn’t require writing to disk, use a read-only file system.
podman run --rm --read-only -u netinfo -p 8000:8000 <IMAGE-ID>
And you’d be unable to make file system modifications:
netinfo@fc6f4f4f68ca:~$ touch hello
touch: cannot touch 'hello': Read-only file system
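If you want to check from a script which directories are still writable, one option is simply to try creating a file there. The sketch below uses a hypothetical helper, can_write, that does this and cleans up after itself.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if a file can be created in directory $1,
# 1 otherwise. The probe file is removed again on success.
can_write() {
    probe="$1/.write-probe-$$"
    # The subshell keeps a failed redirection from aborting the caller.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}
# Possible use inside the container:
#   can_write "$HOME" || echo "$HOME is read-only"
```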
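Prevent privilege escalation
Add the argument --security-opt=no-new-privileges to your run command. It stops processes in the container from gaining additional privileges, for example through setuid binaries.
podman run --rm --read-only --security-opt=no-new-privileges -u netinfo -p 8000:8000 <IMAGE-ID>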
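On reasonably recent kernels (Linux 4.10 and later, as far as I know) the flag is visible as the NoNewPrivs field in /proc/<pid>/status, so you can verify it from inside the container. A minimal sketch follows; the helper name no_new_privs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the NoNewPrivs value (0 or 1) from a
# /proc/<pid>/status-style file.
#   $1 = status file (defaults to /proc/self/status; the flag is
#        inherited, so the awk child reports the caller's value)
no_new_privs() {
    awk '$1 == "NoNewPrivs:" { print $2 }' "${1:-/proc/self/status}"
}
# Possible use inside the container:
#   [ "$(no_new_privs)" = "1" ] && echo "no-new-privileges is active"
```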
Drop all kernel capabilities and add as needed
Add the argument --cap-drop=all to your run command.
podman run --rm --read-only --cap-drop=all --security-opt=no-new-privileges -u netinfo -p 8000:8000 <IMAGE-ID>
You can then use the --cap-add argument to add back any capabilities your app might need, e.g.:
- CAP_NET_ADMIN allows the process to perform network administration tasks, such as configuring interfaces and routing tables.
- CAP_NET_BIND_SERVICE allows it to bind to port numbers less than 1024.
- CAP_SYS_TIME allows it to modify the system clock.
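To see which capabilities a container actually ended up with, you can decode the hexadecimal capability masks in /proc/self/status (for example the CapBnd line, the bounding set). Each capability is one bit in the mask. The sketch below assumes the bit numbers from <linux/capability.h>; the helper name has_cap is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: test one capability bit in a hex mask.
#   $1 = hex mask without 0x prefix, e.g. 0000000000000400
#   $2 = capability number
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Capability numbers as defined in <linux/capability.h>
CAP_NET_BIND_SERVICE=10
CAP_NET_ADMIN=12
CAP_SYS_TIME=25

# Possible use inside the container:
#   mask=$(awk '$1 == "CapBnd:" { print $2 }' /proc/self/status)
#   has_cap "$mask" "$CAP_NET_BIND_SERVICE" && echo "may bind low ports"
```

A mask of all zeros means every capability was dropped; adding --cap-add=NET_BIND_SERVICE to the run command should then set only bit 10.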
Limit resource usage with Control Groups
While Linux Namespaces allow you to separate access to resources, they don’t allow you to limit usage. You need Linux Control Groups for that.
# only half a CPU core
podman run --cpus="0.5" ...
# only 225MB maximum available memory
podman run --memory="225m" ...
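Inside the container you can read the limits back from the cgroup file system. The sketch below assumes a cgroup v2 host, where the CPU limit appears in cpu.max as "<quota> <period>" (or "max <period>" when unlimited) and the memory limit appears in memory.max in bytes. It also assumes the "m" suffix is interpreted as mebibytes. Both helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a cgroup v2 cpu.max line into a
# percentage of one CPU core, or print "unlimited".
#   $1 = contents of cpu.max, e.g. "50000 100000"
cpu_limit_percent() {
    quota=${1%% *}     # text before the first space
    period=${1##* }    # text after the last space
    if [ "$quota" = "max" ]; then
        echo "unlimited"
    else
        echo $(( $quota * 100 / $period ))
    fi
}

# Hypothetical helper: convert a memory.max value in bytes to MiB,
# or print "unlimited".
mem_limit_mib() {
    if [ "$1" = "max" ]; then
        echo "unlimited"
    else
        echo $(( $1 / 1048576 ))
    fi
}

# Possible use inside the container (paths assume cgroup v2):
#   cpu_limit_percent "$(cat /sys/fs/cgroup/cpu.max)"
#   mem_limit_mib "$(cat /sys/fs/cgroup/memory.max)"
```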
seccomp, SELinux and AppArmor
These are a bit more advanced and outside the scope of this article. In short, seccomp gives you even finer-grained control over the system calls a process within your container can make.
SELinux is a MAC (mandatory access control) mechanism that labels users, processes, files and system resources, and governs which user or process can access which files and resources. AppArmor is similar, but it identifies files by path and attaches its profiles to individual programs.