Docker userns-remap

Hey everyone! I want a Docker setup for development, so I decided to use userns-remap on NixOS to get a container whose user cannot touch the contents of /nix/store (inside the container, of course). I thought I could achieve this with a solution very similar to the Stack Overflow post "Docker and --userns-remap, how to manage volume permissions to share data between host and container?". The problem is that the UID of the user inside the Docker container does not match the remapped UID on the NixOS host! Here is my sample to reproduce it, starting with the NixOS config that defines the Docker users:

  users = {
    mutableUsers = true;
    users = {
      hadi = {
        isNormalUser = true;
        uid = 1000;
        subUidRanges = [
          { startUid = 100000; count = 65536; }
        ];
        subGidRanges = [
          { startGid = 100000; count = 65536; }
        ];
        group = "hadi";
        ...
      };

      # docker users
      hadi-root = {
        isNormalUser = true;
        createHome = false;
        uid = 100000;
        group = "hadi-root";
        extraGroups = [ ];
      };
      hadi-dev = {
        isNormalUser = true;
        createHome = false;
        uid = 101000;
        group = "hadi-dev";
        extraGroups = [ ];
      };
    };

    groups = {
      hadi = {
        gid = 1000;
      };
      hadi-root = {
        gid = 100000;
      };
      hadi-dev = {
        gid = 101000;
      };
    };
  };

So this is what id reports on the host:

$ id
uid=1000(hadi) gid=1000(hadi) groups=1000(hadi),1(wheel), ... ,100000(hadi-root),101000(hadi-dev)

I also have:

$ cat /etc/subuid
hadi:100000:65536
$ cat /etc/subgid
hadi:100000:65536

To set up userns-remap I have

  virtualisation = {
    docker = {
      enable = true;
      rootless = {
        enable = true;
        setSocketVariable = true;
      };
      daemon.settings = {
        userns-remap = "hadi";
      };
    };
  };

Now take this Dockerfile:

FROM ubuntu:latest
ARG UNAME=dev
ARG UID=1000
ARG GID=1000
RUN groupadd -g $GID -o $UNAME
RUN useradd -m -u $UID -g $GID -o -s /bin/bash $UNAME
USER $UNAME
CMD ["tail", "-f", "/dev/null"]

and build and run it as usual:

docker build -t test . && docker run --rm -it -v $PWD/my_volume_in_host:/home/dev/my_volume test /bin/bash

Then I run sleep inf inside the container. I expect the process to be mapped to the hadi-dev user on the host, but what I see is:

$ ps aux | grep sleep                                             
100999      5192  0.0  0.0   2792  1408 pts/0    S+   10:23   0:00 sleep inf

What?! It's off by one! Shouldn't it be 101000 instead of 100999, based on the remap config above?
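One way to check which mapping is actually in effect is to read /proc/<pid>/uid_map for the container process. Each line there has the layout "<start inside the namespace> <start outside> <length>". The sketch below walks such a table to translate a container UID into a host UID. The name to_host_uid is a hypothetical helper, and the sample map is an assumption (the layout a rootless daemon running as hadi would typically produce), not output captured from the system in this post.

```shell
# Translate a UID seen inside a user namespace into the UID seen outside it.
# stdin: lines in /proc/<pid>/uid_map layout "<inside_start> <outside_start> <length>".
# to_host_uid is a hypothetical helper name, not a Docker or kernel tool.
to_host_uid() (
  inside_uid=$1
  while read -r inside_start outside_start length; do
    if [ "$inside_uid" -ge "$inside_start" ] &&
       [ "$inside_uid" -lt $((inside_start + length)) ]; then
      echo $((outside_start + inside_uid - inside_start))
      exit 0
    fi
  done
  echo "unmapped" >&2
  exit 1
)

# Assumed sample map: UID 0 -> hadi (1000), UIDs 1.. -> hadi's subordinate range.
sample_map='0 1000 1
1 100000 65536'

printf '%s\n' "$sample_map" | to_host_uid 1000   # prints 100999
printf '%s\n' "$sample_map" | to_host_uid 0      # prints 1000
```

On the real system, running to_host_uid 1000 < /proc/5192/uid_map from the host (5192 being the sleep process above) would apply the same lookup to the live map. A map consisting of the single line "0 100000 65536" would give 101000 instead.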

Also notice I mounted my_volume above, so take a look at

dev@3b702e4c382a:~$ ls -al
total 12
drwxr-xr-x 1 1001 1001   74 Mar 31 23:09 my_volume

That is inside the container: the user and group of my_volume are 1001:1001, while on the host we have:

$ ls -al
total 28
drwxr-xr-x  1 hadi-dev hadi-dev    74 Mar 31 19:09 my_volume_in_host

Recall that the UID of hadi-dev is 101000, so it should be remapped to 1000 inside the container. Why 1001?! By the way, I tried to convince myself that this is for security purposes, but that argument has a big flaw: if I run the container as root, root is mapped to hadi without any problem, and that UID collision wasn't considered a security flaw!
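Both observations (1001 for the volume, and root showing up as hadi) are consistent with the mapping rootless Docker sets up, where container UID 0 is the invoking user and container UIDs from 1 upward come from the subordinate range. Under userns-remap, container UID 0 would instead be the first subordinate UID. The sketch below only does that arithmetic, assuming the container really was started by the rootless daemon. The function names are hypothetical and there is no range checking.

```shell
owner_uid=1000     # hadi
sub_start=100000   # hadi:100000:65536 in /etc/subuid

# Host UID -> container UID under the assumed rootless layout
# (container 0 = owner_uid, container n >= 1 = sub_start + n - 1).
rootless_to_container() {
  if [ "$1" -eq "$owner_uid" ]; then
    echo 0
  else
    echo $(($1 - sub_start + 1))
  fi
}

# Host UID -> container UID under userns-remap (container 0 = sub_start).
remap_to_container() {
  echo $(($1 - sub_start))
}

rootless_to_container 101000   # prints 1001, what ls showed for my_volume
remap_to_container 101000      # prints 1000, what the post expected
rootless_to_container 1000     # prints 0, i.e. hadi appears as root
```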

I am confused and have run out of guesses! I would really appreciate any suggestion, or even a guess. Thanks!

Now that I have written this up and fact-checked my assumptions, I came across this quote from Run the Docker daemon as a non-root user (Rootless mode) | Docker Docs:

Rootless mode executes the Docker daemon and containers inside a user namespace. This is very similar to userns-remap mode,

Maybe this means rootless mode and userns-remap do not work together! But /etc/subuid and /etc/subgid are explicit definitions, so I would prefer fail-fast over fail-safe behavior here.
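A quick way to test that guess is to ask the daemon the CLI is actually talking to. docker info lists name=rootless among its security options in rootless mode and name=userns when userns-remap is active. With setSocketVariable = true, DOCKER_HOST probably points at the rootless socket, in which case the userns-remap setting of the system daemon would never come into play. That last part is an assumption about this setup. classify_mode is a hypothetical helper.

```shell
# Classify a daemon from the text printed by
#   docker info --format '{{.SecurityOptions}}'
# classify_mode is a hypothetical helper name.
classify_mode() {
  case "$1" in
    *name=rootless*) echo rootless ;;
    *name=userns*)   echo userns-remap ;;
    *)               echo plain ;;
  esac
}

# Only query Docker when the CLI is installed and a daemon answers.
if command -v docker >/dev/null 2>&1 &&
   opts=$(docker info --format '{{.SecurityOptions}}' 2>/dev/null); then
  echo "DOCKER_HOST=${DOCKER_HOST:-<unset>}"
  echo "daemon mode: $(classify_mode "$opts")"
fi
```

If this reports rootless, the 100999 and 1001 values above would come from the rootless mapping rather than from userns-remap.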

However, if you have any idea how to prevent the container's user from touching the container's directories with root access (/nix/store, inside the container of course) while also mounting a host directory for development, let me know.

Eureka! I just switched to podman, which is rootless by default, and applied --userns keep-id; no extra users (hadi-root and hadi-dev) are needed :slight_smile: Check out podman/troubleshooting.md at 19600fa5e37cefa380929364bc514822f28a5141 · containers/podman · GitHub

For instance, I am running:

podman run -td --rm --volume="${PWD}":/home/dev/src \
  --user "$(id -u):$(id -g)" --userns "keep-id:uid=$(id -u),gid=$(id -g)" dev-machine:latest