Bind-mounting /nix in a container makes remounting fail

Context

I am self-hosting a Forgejo instance on my NixOS homelab.
For CI, I want the containers to share the host’s /nix/store directly, so I am using Docker’s -v options for this.

Here is my runner’s setup:

      settings = {
        log.level = "debug";
        container.options = builtins.concatStringsSep " " [
          "-v /nix:/nix"
          "-v ${storeDeps}/bin:/bin"
          "-e PATH=/bin"
        ];
        container.valid_volumes = [ "/nix" "${storeDeps}/bin" ];
        container.network = lib.mkDefault "host";
        runner.envs.NIX_CONFIG = builtins.concatStringsSep "\n" [
          "experimental-features = nix-command flakes"
          "auto-optimise-store = false"
        ];
        runner.capacity = lib.mkDefault 8;
        runner.timeout = lib.mkDefault "1h";
        cache.enabled = true;
      };

The issue

When a job is picked up by the runner, everything starts well: the binaries in ${storeDeps}/bin are found, so the Nix store is readable and my dependencies are there.

However, when the script does a nix run, it fails with the message:

    error: remounting /nix/store writable: Operation not permitted

I wonder how I could either fix this or work around it, in order to have a writable store inside the container, shared with the host.

My inspirations

I’m coming from the post “Sharing Nix store between containers”.

I also took inspiration from clan-infra/actions-runner.nix in the clan-infra repository on Gitea.

I don’t know if Nix clients in the container can communicate with the host’s Nix daemon, but IMO a better question is: do you need them to? (i.e. can you manage the Nix store purely from the host, to avoid dealing with the problem in the first place?)

Well, I know the socket is located at /nix/var/nix/daemon-socket/socket, and according to the post “Sharing Nix store between containers”, exposing it is apparently enough for this to work.
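
For reference, a client inside the container can address the host daemon explicitly. A minimal sketch, assuming /nix (including the socket) is bind-mounted at its usual path:

    # The default store setting, "auto", only falls back to the daemon when
    # /nix/var/nix is not writable; NIX_REMOTE=daemon forces daemon mode.
    export NIX_REMOTE=daemon
    nix run nixpkgs#hello

    # Or per invocation, with an explicit store URL:
    nix run --store unix:///nix/var/nix/daemon-socket/socket nixpkgs#hello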

The idea is just to be able to run the same nix run in CI without downloading the dependencies from the network each time. Either I share the Nix store, or I serve it from the host; I don’t really see any other solution. I’m open to other possibilities ^-^’ !
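
For the “serve it from the host” variant, one possibility would be nix-serve as a local binary cache. A sketch, assuming host networking as in the runner config (the port is illustrative, and signing is glossed over):

    # On the host: serve the local store as a binary cache on port 5000.
    nix run nixpkgs#nix-serve -- --port 5000

    # In a CI job: pull from it as an extra substituter. In practice the
    # cache's signing key has to be trusted by the client (omitted here).
    nix run --option extra-substituters http://127.0.0.1:5000 nixpkgs#hello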

It sounds like you could want Super Colliding Nix Stores.

I do this with GitLab runners. I mount /nix to /nix in the container but as read-only.
Nix inside the container then communicates with the host via the socket, and the host does the work of downloading the packages.
This way I can share the store with all the containers.

I mount the host’s Nix config inside the containers as well.
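
Stripped of the runner machinery, that amounts to roughly the following docker invocation (a sketch; the image name is hypothetical):

    # Share the host store read-only and reuse the host's Nix config;
    # Nix inside the container reaches the host daemon through the socket.
    docker run --rm \
      -v /nix:/nix:ro \
      -v /etc/nix/nix.conf:/etc/nix/nix.conf:ro \
      my-ci-image nix run nixpkgs#hello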

Yes. I don’t really understand why, but setting the volume read-only prevents the container from trying to remount the /nix folder writable, while still allowing write access to the /nix/store folder. Presumably this is because Nix’s default store setting, auto, uses the local store only when /nix/var/nix is writable and otherwise falls back to the daemon socket, so the host daemon ends up doing the writes.

Can you explain this better? How do you:

  1. Avoid conflicting with the container’s /nix directory?
  2. Handle multiple containers?

I could never get this approach to work without lots of SQLite errors when multiple containers tried to build the same thing. I had to mount the host’s daemon socket and use --store everywhere to talk to it.
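
Concretely, that pattern would look something like this inside a job (a sketch based on the description above, not the poster’s exact commands):

    # With the host's daemon socket bind-mounted at its usual path, every
    # invocation addresses the host daemon instead of the local store:
    nix build --store unix:///nix/var/nix/daemon-socket/socket nixpkgs#hello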

  1. In our case we use an Ubuntu LTS image, so there is no conflict with an existing /nix directory. We just have to extend the PATH variable so that it includes /nix/var/nix/profiles/default/bin (see the snippet after this list).
  2. So far I haven’t seen this problem, but since only the daemon on the host touches the database, I suspect it is fine. The directory is mounted read-only, so the containers can’t write to it anyway.
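
The PATH extension from point 1 is just something along these lines:

    # Make the host's default Nix profile visible inside the Ubuntu image:
    export PATH="$PATH:/nix/var/nix/profiles/default/bin"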

I admit the whole thing feels super hacky, but so far it works, and it saves us from pulling down all the packages on each run.

I’m surprised that you can write to /nix/var/nix/daemon-socket/socket if it’s mounted read-only. I will have to play around.

Good point, but honestly I don’t know how it works.
We don’t even run as root inside the containers, so either way it shouldn’t be able to write to the socket.

    $ ls -l /nix/var/nix/daemon-socket/
    total 0
    srw-rw-rw- 1 root root 0 Sep  4 21:22 socket
    $ ls -l /nix/var/nix
    total 28
    drwxr-xr-x 2 root root 4096 Sep  4 21:22 daemon-socket
    drwxr-xr-x 2 root root 4096 Sep 19 14:46 db
    -rw------- 1 root root    0 Apr 30 14:31 gc.lock
    drwxr-xr-x 4 root root 4096 Apr 30 14:31 gcroots
    drwxr-xr-x 2 root root 4096 Sep 19 14:45 gc-socket
    drwxr-xr-x 3 root root 4096 Apr 30 14:31 profiles
    drwxr-xr-x 2 root root 4096 Oct  1 16:04 temproots
    drwxr-xr-x 2 root root 4096 Jul 17 09:29 userpool
    Cleaning up project directory and file based variables 00:00
    Job succeeded

Edit: Never mind, the socket would be writable by anyone, but in the GitLab Runner config we mount it like this: "/nix:/nix:ro", which seems to work.

    $ touch /nix/foo.txt
    touch: cannot touch '/nix/foo.txt': Read-only file system
    Cleaning up project directory and file based variables 00:00
    ERROR: Job failed: exit code 1
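
For what it’s worth, a read-only mount only blocks filesystem writes, and connecting to a Unix socket is not one, so the daemon stays reachable. A quick check along these lines should confirm it (a sketch, not taken from the job log above):

    # Creating files under /nix fails, but talking to the daemon still works:
    nix store ping --store daemon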

Well, I can’t really explain it, but this is the setup; hopefully it’ll give you more clues:

    services.gitea-actions-runner.instances.baseRunner = {
      enable = true;
      name = "base-runner";
      url = forgejoUrl; # Variable set higher up in the file
      tokenFile = tokenFile.file; # Idem

      # In my CI files, jobs select this runner with "runs-on: nix"
      # It'll use the oci-container defined below
      labels = [ "nix:docker://forgejo-ci-nix" ];

      settings = {
        log.level = "info";
        container.network = lib.mkDefault "host";
        runner.capacity = lib.mkDefault 8;
        runner.timeout = lib.mkDefault "1h";
        cache.enabled = true;

        container.options = "-v /nix:/nix:ro"; # Mount /nix in the container read-only
        container.valid_volumes = [ "/nix" ]; # Allow the /nix volume
        runner.envs.NIX_SSL_CERT_FILE = "/etc/ssl/certs/ca-bundle.crt"; # Prevent SSL errors

        # Sets up the container's nix config while not overwriting the /etc/nix/nix.conf file (as it's mounted read-only)
        runner.envs.NIX_CONFIG = builtins.concatStringsSep "\n" [
          "experimental-features = nix-command flakes"
          "auto-optimise-store = false"
          "build-users-group ="
        ];
      };
    };

    # Create the container holding the CI operations
    virtualisation.oci-containers.containers.forgejo-ci-nix = {
      image = "forgejo-ci-nix:latest";
      hostname = "forgejo-ci";

      imageFile = pkgs.dockerTools.buildImage {
        name = "forgejo-ci-nix";
        tag = "latest";

        copyToRoot = pkgs.buildEnv {
          name = "deps-bin";
          paths = with pkgs; [
            bash
            git nixVersions.latest
            nodejs
            coreutils findutils curlFull wget busybox
          ] ++ containerDependencies;
          pathsToLink = [ "/bin" ];
        };

        # Have the container sleep indefinitely so that it stays alive
        # The CI will still be able to get a shell on it and execute stuff
        config.Cmd = [ "/bin/sleep" "infinity" ];

        runAsRoot = ''
          #!${pkgs.runtimeShell}

          mkdir -p /etc/ssl
          cp -a "${pkgs.cacert}/etc/ssl/certs" /etc/ssl/certs   # Copy the files required for NIX_SSL_CERT_FILE
        '';
      };
    };

Like I said, I have no idea why I’m able to write to the Nix store through a read-only bind mount.
However, this setup works well: the CI jobs use the host Nix store, have the expected dependencies, etc.