Podman rootless with systemd

Hi,

I'm having trouble setting up a container with rootless Podman and systemd. I'm testing on NixOS unstable.

In root mode, my container works:

virtualisation = {
    podman = {
        enable = true;
        dockerCompat = true;
        defaultNetwork.dnsname.enable = true;
    };
    oci-containers = {
        backend = "podman";
        containers.pgadmin = {
            image = "dpage/pgadmin4";
            autoStart = true;
            ports = [ "8084:80" ];
            environment = {
                PGADMIN_DEFAULT_EMAIL = "user@domain.com";
                PGADMIN_DEFAULT_PASSWORD = "SuperSecret";
            };
        };
    };
};

But when I add the systemd configuration to run the container as a dedicated user, the service fails:

 ...
 users.users.pgadmin = {
     isNormalUser = true;
 };
 systemd.services.podman-pgadmin.serviceConfig.User = "pgadmin";
 ...

See the log:

Nov 28 02:29:16 sdserver systemd[1]: Starting podman-pgadmin.service...
Nov 28 02:29:16 sdserver podman-pgadmin-pre-start[32056]: time="2022-11-28T02:29:16+01:00" level=warning msg="RunRoot is pointing to a path (/run/user/1003/containers) which i>
Nov 28 02:29:16 sdserver podman-pgadmin-pre-start[32056]: Error: creating tmpdir: mkdir /run/user/1003: permission denied
Nov 28 02:29:16 sdserver podman-pgadmin-pre-start[32069]: rm: cannot remove '/run/podman-pgadmin.ctr-id': Permission denied
Nov 28 02:29:16 sdserver systemd[1]: podman-pgadmin.service: Control process exited, code=exited, status=1/FAILURE
Nov 28 02:29:16 sdserver podman-pgadmin-post-stop[32071]: time="2022-11-28T02:29:16+01:00" level=warning msg="RunRoot is pointing to a path (/run/user/1003/containers) which i>
Nov 28 02:29:16 sdserver podman-pgadmin-post-stop[32071]: Error: creating tmpdir: mkdir /run/user/1003: permission denied
Nov 28 02:29:16 sdserver systemd[1]: podman-pgadmin.service: Control process exited, code=exited, status=125/n/a
Nov 28 02:29:16 sdserver systemd[1]: podman-pgadmin.service: Failed with result 'exit-code'.
Nov 28 02:29:16 sdserver systemd[1]: Failed to start podman-pgadmin.service.
Nov 28 02:29:17 sdserver systemd[1]: podman-pgadmin.service: Scheduled restart job, restart counter is at 5.
Nov 28 02:29:17 sdserver systemd[1]: Stopped podman-pgadmin.service.
Nov 28 02:29:17 sdserver systemd[1]: podman-pgadmin.service: Start request repeated too quickly.
Nov 28 02:29:17 sdserver systemd[1]: podman-pgadmin.service: Failed with result 'exit-code'.
Nov 28 02:29:17 sdserver systemd[1]: Failed to start podman-pgadmin.service.

So /run/user/1003/containers does not exist. What did I miss in my config?

Thanks


I face the same problem. The core issue is that Podman services created with virtualisation.oci-containers.containers.<name> are installed as /etc/systemd/user/podman-<name>.service.

When you do that (systemd.services.podman-pgadmin.serviceConfig.User = "pgadmin";), one would expect podman-pgadmin.service to be run/executed by user pgadmin. But, for that to happen, the unit file for podman-pgadmin.service needs to be placed inside /home/pgadmin/.config/systemd/user (not in /etc/systemd/user/).

The current workaround is to write a systemd service by hand for every container you want and manage it with home-manager. For containers to be truly rootless (in the same sense as on RHEL/Fedora), there should be an oci-containers “module” inside home-manager too.
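For illustration, such a hand-written Home Manager user service might look roughly like this. This is only a sketch: the podman flags and names are placeholders, and it assumes Home Manager's `systemd.user.services` format with its capitalized Unit/Service/Install sections.

```nix
# home.nix (Home Manager) -- minimal sketch, values are illustrative
systemd.user.services.podman-pgadmin = {
  Unit = {
    Description = "pgadmin container (rootless)";
    After = [ "network.target" ];
  };
  Service = {
    # runs in the user's systemd instance, hence rootless
    ExecStart = "${pkgs.podman}/bin/podman run --rm --name pgadmin -p 8084:80 docker.io/dpage/pgadmin4";
    ExecStop = "${pkgs.podman}/bin/podman stop pgadmin";
  };
  Install.WantedBy = [ "default.target" ];
};
```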


When you do that (systemd.services.podman-pgadmin.serviceConfig.User = “pgadmin”;), one would expect podman-pgadmin.service to be run/executed by user pgadmin. But, for that to happen, the unit file for podman-pgadmin.service needs to be placed inside /home/pgadmin/.config/systemd/user (not in /etc/systemd/user/).

That’s not correct.

In the first case it will be run by the system systemd instance, but running as user pgadmin, and started with systemctl start podman-pgadmin. So run by root, as pgadmin.

If placed inside /etc/systemd/user or /home/pgadmin/.config/systemd/user, it will be run by the user systemd instance and started with systemctl --user start podman-pgadmin when logged in as pgadmin. So run both by and as pgadmin.
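A minimal sketch of the two variants side by side (illustrative only; the ExecStart is elided):

```nix
# Variant 1: system systemd instance, process runs as user "pgadmin".
# Root starts it with: systemctl start podman-pgadmin
systemd.services.podman-pgadmin.serviceConfig.User = "pgadmin";

# Variant 2: user systemd instance of "pgadmin".
# pgadmin starts it with: systemctl --user start podman-pgadmin
systemd.user.services.podman-pgadmin = {
  wantedBy = [ "default.target" ];
  serviceConfig.ExecStart = "..."; # elided
};
```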


I misspoke. With systemd.services.<name>.serviceConfig.User, the unit file will be in /etc/systemd/system/, but the User field will be populated with the given value. The behaviour I described applies to systemd.user.services. My bad.


Doing a systemctl --user status podman-<name>.service (as the non-root user) gave me the error that the service doesn’t exist for this user (Unit podman-<name>.service could not be found.). I also ran podman ps as this non-root user; that came up empty.

But then I logged in as root (I didn’t just prefix commands with sudo; I was logged in as root, in root’s shell), ran systemctl status podman-<name>.service, and got the details I would expect. I also ran podman ps and found that the containers are being run by the user root and also as the user root.

If you define a service through systemd.services.foo, it will be run by the system systemd instance regardless of what you put into serviceConfig.User. Many system-level services do this, as you absolutely want to run as few things as possible as root.


I found a workaround months ago.

As a reminder, I wanted to:

  • start the Podman container at boot
  • run the container rootless, i.e. by a normal user and NOT root
  • allow the user to start/stop… the container

In configuration.nix:

    # create my user
    users.users."pgadmin" = {
        isNormalUser = true;
        uid = 1003;
    };

    # In order for our user to run containers automatically on boot, 
    # we need to enable systemd linger support.
    # This will ensure that a user manager is run for the user at boot and kept around after logouts.
    system.activationScripts = {
        enableLingering = ''
            # remove all existing lingering users
            rm -rf /var/lib/systemd/linger
            mkdir /var/lib/systemd/linger
            # enable for the subset of declared users
            touch /var/lib/systemd/linger/pgadmin
        '';
    };
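(Side note: on recent NixOS releases the same effect is available directly as a user option, which replaces the activation script above. A sketch, assuming a recent nixpkgs:)

```nix
users.users."pgadmin" = {
    isNormalUser = true;
    uid = 1003;
    # equivalent to `loginctl enable-linger pgadmin`; keeps the user's
    # systemd instance running without an active login session
    linger = true;
};
```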

    virtualisation = {
        ## setup podman
        podman = {
            enable = true;
            dockerCompat = true;
            defaultNetwork.settings.dns_enable = true;
        };
        ## declare containers
        oci-containers = {
            ## use podman as default container engine
            backend = "podman";
        };
    };

    ## pgadmin container
    systemd.user.services.pgadmin = {
        enable = true;
        unitConfig = { ConditionUser = "pgadmin"; };
        wantedBy = [ "default.target" ];
        after = [ "network.target" ];
        description = "pgadmin container";
        path = [ "/run/wrappers" ];
        serviceConfig =
        let
            podmancli = "${config.virtualisation.podman.package}/bin/podman";
            podname = "pgadmin";
            image = "dpage/pgadmin4:6.19";
            cid = "%t/podman/%n.cid";
            pid = "%t/podman/%n.pid";
            startpre = [
                "${pkgs.coreutils-full}/bin/rm -f ${cid} ${pid}"
                "-${podmancli} stop --ignore ${podname}"
                "${podmancli} rm --force --ignore ${podname}"
            ];
            stoppost = [
                "${podmancli} stop --ignore ${podname}"
                "${podmancli} rm --force --ignore ${podname}"
                "${pkgs.coreutils-full}/bin/rm -f ${cid} ${pid}"
            ];
        in
        {
            ExecStartPre = startpre;
            ExecStart = "${podmancli} run " +
                "--rm " +
                "--replace " +
                "--name=${podname} " +
                "--conmon-pidfile=${pid} " +
                "--cidfile=${cid} " +
                "--cgroups=no-conmon " +
                "--sdnotify=conmon " +
                "--log-driver=journald " +
                "-p 127.0.0.1:5050:80 " +
                "-v pgadmin_data:/var/lib/pgadmin " +
                "-e PGADMIN_DEFAULT_EMAIL='admin@localhost.fr' "+
                "-e PGADMIN_DEFAULT_PASSWORD='pgadmin' "+
                "-d " +
                "${image}";
            ExecStop = "${podmancli} stop ${podname}";
            ExecStopPost = stoppost;
            Type = "notify";
            NotifyAccess = "all";
            Restart = "no";
            TimeoutStopSec = 70;
        };
    };

Then, in the pgadmin user's session:

# start user service
systemctl --user start pgadmin
# check status
systemctl --user status pgadmin
# it works so start at boot
systemctl --user enable pgadmin

This way, only my user pgadmin can control the service, and my Podman container is rootless.


I forked oci-containers.nix and was able to run rootless podman containers managed as a normal systemd unit.

serviceUser was created using autoSubUidGidRange = true to add new entries to /etc/subuid.
Linger state was enabled for this user (linger = true).
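In NixOS terms, that user declaration would look roughly like this (the user name is illustrative):

```nix
users.users."serviceuser" = {
    isNormalUser = true;
    # allocate /etc/subuid and /etc/subgid ranges automatically,
    # which rootless podman needs for user namespaces
    autoSubUidGidRange = true;
    # keep the user's systemd instance alive without a login session
    linger = true;
};
```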

I’ve added the following lines to the systemd unit section after NotifyAccess="all";:

User = "${container.serviceUser}";
Group = "${container.serviceGroup}";
Delegate = "yes";

/run/podman-${escapedName}.ctr-id was replaced with /run/user/${serviceUserUID}/podman-${escapedName}.ctr-id.

The Delegate = "yes" option was the key; without it, systemd stopped the service with errors like:

user@1111.service: Got notification message from PID 161000, but reception only permitted for main PID 69181

I use the systemctl stop podman-<container-name> and systemctl start podman-<container-name> commands to manage the container.

I can create an MR/PR if anyone is interested.


hell yeah I’m interested! Please also mention @thefossguy on GitHub if/when you create a PR.

Same here. I’ve had a quick glance at the open PRs, but I’m not sure if you’ve had the time to get yours through. Please mention me, @andresgongora, as well :slight_smile:

Hi @dv1618, did you create a PR, and did it get merged?

I use Home Manager for rootless Podman containers, but Home Manager adds a bit extra to the configuration, and your PR would simplify the configs quite a bit.

UPDATE 2024-12-30 for NixOS 24.11

As a reminder, I want a systemd user-service container:

  • the container MUST be started when the server starts.
  • the container MUST be handled by a specific (non-root) user.

The example below shows 2 containers (postgres/pgadmin), started at server start. They can be managed by the dedicated “postgres” user (non-root).

cat podmanworkaround.nix

#
# PODMAN WORKAROUND
# when default network is pasta
# it requires the host network interfaces to be fully set up
#
{ config, pkgs, ... }:

{

    systemd.user.services.podmanworkaround = {
        enable = true;
        description = "User-level proxy to system-level multi-user.target";
        documentation = [
            "https://github.com/containers/podman/issues/22197#issuecomment-2564715584"
            "man:podman-systemd.unit(5)"
        ];
        wantedBy = [ "default.target" ];
        after = [ "network.target" ];
        path = [ "/run/wrappers" ];
        serviceConfig = {
            Type = "oneshot";
            # Set a timeout as by default oneshot does not have one and in case network-online.target
            # never comes online we do not want to block forever, 90s is the default systemd unit timeout.
            TimeoutStartSec = 90;
            ExecStart = "${pkgs.bash}/bin/bash -c 'until systemctl is-active multi-user.target; do echo wait for multi-user.target; sleep 0.5; done; echo multi-user.target is active'";
            RemainAfterExit = "yes";
        };
    };

    systemd.user.targets.podmanworkaround = {
        enable = true;
        description = "User-level multi-user.target";
        documentation = [
            "https://github.com/containers/podman/issues/22197#issuecomment-2564715584"
            "man:podman-systemd.unit(5)"
        ];
        requires = [ "podmanworkaround.service" ];
        wants = [ "podmanworkaround.service" ];
        after = [ "podmanworkaround.service" ];
    };

}

cat postgres.nix

#
# POSTGRES service
#
{ config, pkgs, ... }:

{
    # create postgres user
    users.users."postgres" = {
        isNormalUser = true;
        linger = true; # lingering is the trick that enables user services with ConditionUser to be started at server start
        autoSubUidGidRange = true;
    };


    ## postgres server
    systemd.user.services.postgres = {
        enable = true;
        unitConfig = { ConditionUser = "postgres"; }; # only a user session for "postgres" will trigger this service; that user can manage the container too
        wantedBy = [ "default.target" ];
        requires = [ "podmanworkaround.target" ]; # https://github.com/containers/podman/issues/22197#issuecomment-2564715584
        after = [ "podmanworkaround.target" ];    # https://github.com/containers/podman/issues/22197#issuecomment-2564715584
        description = "postgres container";
        path = [ "/run/wrappers" ];
        serviceConfig =
        let
            podmancli = "${config.virtualisation.podman.package}/bin/podman";
            cid = "%t/podman/%n.cid";
            podname = "postgres";
            image = "postgres:16.4";
            options = "-p 5432:5432 " +
                      "-v postgres_data:/var/lib/postgresql/data " +
                      "-e POSTGRES_PASSWORD='toto' " +
                      "-e TZ='Europe/Paris' ";
            startpre = [
                "${pkgs.coreutils-full}/bin/rm -f ${cid}"
            ];
            stoppost = [
                "${podmancli} rm --volumes --force --ignore --cidfile=${cid}"
            ];
        in
        {
            ExecStartPre = startpre;
            ExecStart = "${podmancli} run " +
                "--rm " +
                "--replace " +
                "--name=${podname} " +
                "--cidfile=${cid} " +
                "--cgroups=split " +
                "--sdnotify=conmon " +
                "--log-driver=journald " +
                options +
                "-d " +
                "${image}";
            ExecStop = "${podmancli} stop --cidfile=${cid}";
            ExecStopPost = stoppost;
            Delegate="yes";
            Type = "notify";
            NotifyAccess = "all";
            SyslogIdentifier="%N";
            Restart = "no";
            TimeoutStopSec = 70;
            KillMode = "mixed"; # https://unix.stackexchange.com/a/714428
        };
    };

    ## pgadmin server
    systemd.user.services.pgadmin = {
        enable = true;
        unitConfig = { ConditionUser = "postgres"; }; # only a user session for "postgres" will trigger this service; that user can manage the container too
        wantedBy = [ "default.target" ];
        requires = [ "podmanworkaround.target" ]; # https://github.com/containers/podman/issues/22197#issuecomment-2564715584
        after = [ "podmanworkaround.target" ];    # https://github.com/containers/podman/issues/22197#issuecomment-2564715584
        description = "pgadmin container";
        path = [ "/run/wrappers" ];
        serviceConfig =
        let
            podmancli = "${config.virtualisation.podman.package}/bin/podman";
            cid = "%t/podman/%n.cid";
            podname = "pgadmin";
            image = "dpage/pgadmin4:latest";
            options = "--network pasta:--map-guest-addr,169.254.1.2 --add-host host.containers.internal:169.254.1.2 --add-host host.docker.internal:169.254.1.2 " + # allow connecting to the host from inside the container for podman v5.2.3; podman v5.3 does it by default
                      "-p 127.0.0.1:8050:80 " +
                      "-v pgadmin_data:/var/lib/pgadmin " +
                      "-e PGADMIN_DEFAULT_EMAIL='email@host.local' " +
                      "-e PGADMIN_DEFAULT_PASSWORD='toto' ";
            startpre = [
                "${pkgs.coreutils-full}/bin/rm -f ${cid}"
            ];
            stoppost = [
                "${podmancli} rm --volumes --force --ignore --cidfile=${cid}"
            ];
        in
        {
            ExecStartPre = startpre;
            ExecStart = "${podmancli} run " +
                "--rm " +
                "--replace " +
                "--name=${podname} " +
                "--cidfile=${cid} " +
                "--cgroups=split " +
                "--sdnotify=conmon " +
                "--log-driver=journald " +
                options +
                "-d " +
                "${image}";
            ExecStop = "${podmancli} stop ${podname}";
            ExecStopPost = stoppost;
            Delegate="yes";
            Type = "notify";
            NotifyAccess = "all";
            SyslogIdentifier="%N";
            Restart = "no";
            TimeoutStopSec = 70;
            KillMode = "mixed"; # https://unix.stackexchange.com/a/714428
        };
    };

}

Known issue: an issue has existed since the beginning of NixOS/Podman integration (2021).

When the dedicated user manually calls systemctl --user stop postgres, everything works as expected.

But when systemd itself triggers the stop (via a machine reboot/shutdown), there is sometimes a kill which does not let Podman properly stop the container. For images which internally use the chown command (like postgres or pgadmin), the underlying volume is left dirty with incorrect permissions. Subsequent starts then throw errors until root chowns the volume back to the user's UID.

There are some threads (1, 2) about this, but no workaround.

Help appreciated :grinning:

Are you OK to use Home Manager?

Rootless Podman pods/containers work with Home Manager, since those start with the user's systemd session. See My guide to Rootless Podman setup with Home Manager.
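For comparison, a container under Home Manager's services.podman module looks roughly like this (a sketch; option names as I remember them from the module, values illustrative):

```nix
# home.nix (Home Manager)
services.podman = {
    enable = true;
    containers.pgadmin = {
        image = "docker.io/dpage/pgadmin4:latest";
        ports = [ "127.0.0.1:8050:80" ];
        environment = {
            PGADMIN_DEFAULT_EMAIL = "email@host.local";
            PGADMIN_DEFAULT_PASSWORD = "changeme";
        };
    };
};
```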

FWIW, I recently opened nixos/oci-containers: support rootless containers & healthchecks by Ma27 · Pull Request #368565 · NixOS/nixpkgs · GitHub: so far, rootless containers seem to work reasonably well, even with healthchecks.


Thumbs up for your efforts to simplify the usage of Rootless Podman containers! One question: How does this work with rootless containers where different users could run containers with the same name?

  • Move the ctr-id into /run/${containerName} so that podman can actually write to it, since it's now in its RuntimeDirectory.

I don’t think that’s doable at all: the container name is part of the systemd unit name.
So having virtualisation.containers.foo.user = "x"; and virtualisation.containers.foo.user = "y"; wouldn’t evaluate.

This uses the existing virtualisation.oci-containers module, so the containers are part of the system configuration (and thus part of the system instance of systemd, not a user session).

Yeah, my bad :man_facepalming:

Since the rootless containers are created with the same Nix config, this is obviously not an issue. I have been working too long with Home Manager.

I think my problem is between the systemd and Podman configuration.
I feel systemd is killing Podman (SIGTERM or SIGKILL) before the ExecStop=podman stop is called.

Did you experience this issue with Home Manager when using lingering?

My hand-written systemd service for starting/stopping the pod works OK.

podman pod stop

Dec 29 11:55:48 portti systemd[1329]: Stopping Start podman 'unifi' pod...
░░ Subject: A stop job for unit UNIT has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A stop job for unit UNIT has begun execution.
░░
░░ The job identifier is 2364.
Dec 29 11:55:48 portti podman[102298]: 2024-12-29 11:55:48.255275461 +0200 EET m=+0.021262790 pod stop 6e220168d6ddf7ea7dbbecaaa80b316e04ef51d1>
Dec 29 11:55:48 portti podman[102298]: 2024-12-29 11:55:48.27544484 +0200 EET m=+0.041432170 container died 6edb97ce52d42ef96dba895cdff20538780>
Dec 29 11:55:49 portti podman[102298]: 2024-12-29 11:55:49.187286703 +0200 EET m=+0.953274120 container cleanup 6edb97ce52d42ef96dba895cdff2053>
Dec 29 11:55:49 portti podman[102298]: unifi
Dec 29 11:55:49 portti systemd[1329]: Stopped Start podman 'unifi' pod.
░░ Subject: A stop job for unit UNIT has finished
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A stop job for unit UNIT has finished.
░░
░░ The job identifier is 2364 and the job result is done.

podman stop

… but it seems the container using Home Manager’s services.podman.containers is not stopped properly.

Dec 29 01:51:27 portti systemd[1329]: Stopping Start Unifi Network Application (podman)...
░░ Subject: A stop job for unit UNIT has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A stop job for unit UNIT has begun execution.
░░
░░ The job identifier is 2221.
Dec 29 01:51:27 portti unifi-network-application[75586]: [custom-init] No custom files found, skipping...
Dec 29 01:51:37 portti podman-unifi-network-application[75686]: time="2024-12-29T01:51:37+02:00" level=warning msg="StopSignal SIGTERM failed to stop container unifi-netw>
Dec 29 01:51:37 portti podman[75686]: 2024-12-29 01:51:37.905215049 +0200 EET m=+10.095760716 container died 8c402001b17935808ca65b3f9813569aa245be376c6adc5906c1def080320>
Dec 29 01:51:38 portti podman[75686]: 2024-12-29 01:51:38.241519338 +0200 EET m=+10.432064986 container remove 8c402001b17935808ca65b3f9813569aa245be376c6adc5906c1def0803>
Dec 29 01:51:38 portti podman-unifi-network-application[75686]: 8c402001b17935808ca65b3f9813569aa245be376c6adc5906c1def080320e69
Dec 29 01:51:38 portti systemd[1329]: podman-unifi-network-application.service: Main process exited, code=exited, status=137/n/a
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ An ExecStart= process belonging to unit UNIT has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 137.
Dec 29 01:51:38 portti systemd[1329]: podman-unifi-network-application.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ The unit UNIT has entered the 'failed' state with result 'exit-code'.
Dec 29 01:51:38 portti systemd[1329]: Stopped Start Unifi Network Application (podman).
░░ Subject: A stop job for unit UNIT has finished
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A stop job for unit UNIT has finished.
░░
░░ The job identifier is 2224 and the job result is done.
Dec 29 01:51:38 portti systemd[1329]: podman-unifi-network-application.service: Consumed 5.383s CPU time, 344.2M memory peak.
░░ Subject: Resources consumed by unit runtime
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ The unit UNIT completed and consumed the indicated resources.

I don’t know if it’s specific to NixOS, but I didn’t find any other thread related to this issue.

Last but not least: I installed NixOS in a Windows Hyper-V VM using the same configuration; I’ve only changed the static IP address. After 100+ reboots in 3 days, I have never hit the issue in this VM.

So what causes the issue on all the bare-metal servers?

I ran into a super weird bug running chris10k’s Nix configuration on my own server…
Since the issue is related, but a very large tangent, I’m linking it here:

Thanks for posting this, it’s helped me a lot!