Runtime composition of flake-based system configurations?

Question

Is there a way to compose multiple flake-based system configurations where the base configuration is in a fixed location (e.g. /etc/nixos/flake.nix or some predefined flake registry alias) and the overlay configuration is copied to an arbitrary disk location at runtime?

Background

When using CI/CD systems managed by Nix (e.g. NixOS, non-NixOS Linux distributions + system-manager, macOS + nix-darwin), a user might want to layer their own system modules/configurations on top of the provided ones describing the base CI/CD environment.

For example, the base CI/CD system might have a minimal set of system services up and running but a specific repository’s CI/CD jobs may want extra virtualization system services (e.g. Docker daemon, libvirtd), additional kernel modules, or different driver versions (e.g. PCIe device driver testing).

Some of the out-of-the-box services, however, can’t be restarted when applying the user-defined system configuration (with nixos-rebuild/system-manager/darwin-rebuild switch). For example, GitLab Runner’s autoscaling executors work by having a runner manager SSH into a (usually ephemeral) runner. Restarting sshd on the runner will break the connection and unintentionally fail the CI/CD job.
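
For concreteness, a base module for such an environment might look roughly like this (the option names are real NixOS options, but the specific choices are illustrative assumptions, not the actual base configuration):

{ config, lib, pkgs, ... }: {
  # The runner manager drives the job over SSH, so restarting this unit
  # mid-job breaks the connection.
  services.openssh.enable = true;

  # Deliberately minimal: no Docker daemon, libvirtd, or extra kernel modules.
  # Repositories that need them have to layer them on top somehow.
  virtualisation.docker.enable = lib.mkDefault false;
  virtualisation.libvirtd.enable = lib.mkDefault false;
}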

GitLab Runner

In the context of GitLab Runner specifically, the repository a CI/CD job is for is cloned to a dynamic path (docs). For example, a setup using the instance auto-scaling executor on AWS EC2 might have a runner that looks like this:

/
│   # GitLab Runner build directory.
└── builds/
    └── {runner token key}/
        └── {concurrent project ID}/
            └── {namespace}/
                │   # 📍 CI/CD job script working directory.
                └── {project name}/
                    │   # Clojure project.
                    ├── inputs/
                    │   └── main/
                    │       └── clojure/
                    │           └── hello.cljc
                    ├── bb.edn
                    ├── deps.edn
                    │
                    │   # Nix configuration.
                    ├── flake.nix
                    └── flake.lock

Ideally, there’s some way to refer to the base system configuration flake (e.g. /etc/nixos/flake.nix or some predefined flake registry alias) in the project’s flake such that:

  • The base system configuration and repository’s system configuration inputs aren’t mixed (i.e. they can use their own nixpkgs revisions).
    • This prevents a configuration switch from restarting sshd and other system services that maintain the job connection.
  • Users don’t have to write files/values to magical places.
    • Essentially, users only need to run nixos-rebuild/system-manager/darwin-rebuild switch --flake . in their job scripts.

Maybe there’s some magic we can do with nixPath or extraArgs/specialArgs?

This is very much a hack and not an intended way to do anything. A cleaner way would be to ensure that all CI runners can deploy VMs and then deploy VMs for testing; nixpkgs has great support for this, and the NixOS test infrastructure could help.

If you must, you could use --impure and load flakes from a well-known host location with builtins.getFlake, and then do some ugly introspection to get the modules defined in nixosConfigurations.default (or I guess figure out what the hostname is at runtime).
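
Something in that direction, as a rough, untested sketch (the /etc/nixos path, the default attribute name, and the overlay contents are all assumptions; this uses extendModules on the base configuration rather than introspecting its module list):

{
  outputs = { self, ... }: {
    nixosConfigurations.default =
      let
        # Requires `--impure`, e.g. `nixos-rebuild switch --flake .#default --impure`.
        baseFlake = builtins.getFlake "/etc/nixos";
      in
      # `extendModules` re-evaluates the base configuration with extra modules,
      # reusing the base flake's own nixpkgs revision for the base services.
      baseFlake.nixosConfigurations.default.extendModules {
        modules = [
          ({ config, lib, pkgs, ... }: {
            # Repository-specific overlay, e.g. extra virtualization services.
            virtualisation.docker.enable = true;
          })
        ];
      };
  };
}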

This is all very impure, not fool-proof since there’s no guarantee the config matches, and not at all how nix is intended to be used. It’ll be a nasty bit of code, good luck.


Do also note that builtins.getFlake has its own bugs and may not operate identically to a flake reference used in other contexts (e.g. Flakes using paths do not infer `git+file:` despite documentation to that effect · Issue #5836 · NixOS/nix · GitHub).


Unfortunately, this is from an awkward constraint imposed by a mix of CI/CD provider + network connectivity + cloud service provider limitations.

Ideally, GitLab Runner would have an official executor that supports spinning up systems from arbitrary disk images. That way, someone could build a disk image in a prior job and use it as the runner system image in a subsequent job. Today, the closest thing is the image keyword in GitLab CI/CD YAML pipeline definitions, which only covers OCI container images (so only userspace customizations) and only works with the Docker and Kubernetes executors.

The next option would be nested virtualization (likely with NixOS’s VM test infrastructure). This one is tricky for one technical and one social reason:

  • Technical
    • This specific setup has to use AWS since it’s the only one that has connectivity to our company’s private network.
    • EC2 virtualized instances don’t officially support nested virtualization.
      • Only EC2 metal instances are guaranteed to since they expose the hardware directly.
      • Metal instances, however, usually have far more hardware resources (CPU, memory) than simple CI/CD jobs need, so they create a cost issue.
  • Social
    • We can probably get people to adopt Nix for build/development shells, but using the NixOS VM test infrastructure might be a taller order.

We can still run QEMU VMs on EC2 virtualized instances, but they might be a bit slow without KVM.
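
For reference, a minimal NixOS VM test of the kind that infrastructure runs (a sketch; the check name and test body are placeholders):

{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }: {
    checks.x86_64-linux.docker-smoke =
      nixpkgs.legacyPackages.x86_64-linux.nixosTest {
        name = "docker-smoke";
        # Each node boots as a full NixOS VM under QEMU (KVM-accelerated when available).
        nodes.machine = { pkgs, ... }: {
          virtualisation.docker.enable = true;
        };
        # Python test driver script.
        testScript = ''
          machine.wait_for_unit("docker.service")
          machine.succeed("docker info")
        '';
      };
  };
}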

Nested virtualization also makes PCIe device passthrough a bit tricky since there are UX tradeoffs for PCIe device binding. The device starts either unbound or bound to the host. If it’s bound to the host, CI/CD job scripts have to unbind it and then pass it through to the QEMU VM. If it’s unbound, CI/CD job scripts have to bind it to the host if they don’t need a QEMU VM.

So that leads to abandoning system configuration isolation and letting users do runtime overlays instead. The impurity is unpleasant in many ways, but feels like a fair UX tradeoff compared to the gymnastics that may be needed for proper system configuration isolation.

Maybe another option is to have users commit a separate flake-ci.nix file which uses a predefined flake registry alias as an input?

flake-ci.nix

{
  inputs = {
    nix-gitlab-ci-cd = {
      type = "indirect";
      # System flake registry lookup.
      id = "nix-gitlab-ci-cd";
    };
    project = {
      type = "indirect";
      # Need to register the project flake in the system flake registry with `nix registry add . project`.
      #
      # Using `type = "path"` with a relative path doesn't work.
      id = "project";
    };
    nixpkgs = {
      follows = "project/nixpkgs";
    };
  };

  outputs = inputs: {
    nixosConfigurations = {
      default = inputs.nixpkgs.lib.nixosSystem {
        modules = [
          (
            {
              config,
              lib,
              pkgs,
              ...
            }: {
              # Import the base module.
              imports = [ inputs.nix-gitlab-ci-cd.nixosModules.default ];

              # Overlay.
              # ...
            }
          )
        ];
      };
    };
  };
}

That way, the CI flake isn’t evaluated when creating local build/development shells.

Not sure, though, whether this will accidentally clobber the flake.lock in the repository clone when running nixos-rebuild/system-manager/darwin-rebuild switch --flake ./flake-ci.nix.

Flakes are not flake.nix files, they are directories/repos that contain a flake.nix file.
You can’t name the file something arbitrary.

Hmm, maybe this for the repo layout then?

.
│   # CI system configuration.
│   #
│   # `{nixos-rebuild/system-manager/darwin-rebuild} switch --flake ./ci`
├── ci/
│   └── flake.nix
│
│   # Clojure project.
├── inputs/
│   └── main/
│       └── clojure/
│           └── hello.cljc
├── bb.edn
├── deps.edn
│
│   # Nix configuration.
├── flake.nix
└── flake.lock

That also avoids concerns around flake.lock overwrites since it’s in a dedicated folder.

Unless you really need full host access - which sounds unlikely given that you can’t control the hosts - maybe https://devenv.sh/ is what you actually need? I.e., treat the services that need to be spun up as services, not components of the host system.
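
For example, a minimal devenv.nix along those lines (a sketch; the service choices are assumptions about what a job might need, using devenv’s services options):

{ pkgs, ... }: {
  # Started with `devenv up` from the job script; these run as user-level
  # processes rather than host systemd units.
  services.postgres.enable = true;
  services.minio.enable = true;
}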

That works for services packaged in Nixpkgs (e.g. MinIO, PostgreSQL, Kafka) but there are services which are only distributed as OCI containers.

For most consumers, this is fine since they can use Podman instead of Docker Engine for rootless containers (though this won’t work for some containers).

We, however, have teams vending OCI containers which need to test with multiple container runtimes (Podman, Docker Engine + runc/kata, CRI-O + runc/kata, containerd + runc/kata) which may have an associated daemon that’s a systemd root unit. Some of these aren’t that convenient to run in a NixOS VM. In particular, any container runtime using GPU passthrough.
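
For illustration, the kind of host-level overlay module those teams end up needing (these are real NixOS option names, but the selection is illustrative; Kata and GPU passthrough wiring omitted):

{ config, lib, pkgs, ... }: {
  # Root daemons/units, which is why they have to live in the host system
  # configuration rather than in a per-project devenv/devshell.
  virtualisation.podman.enable = true;
  virtualisation.docker.enable = true;
  virtualisation.containerd.enable = true;
  virtualisation.cri-o.enable = true;
}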


I’ve accidentally painted a narrow picture given the emphasis on GitLab Runner and AWS in the prior posts.

The general goal is a company-wide CI/CD fleet that needs to support a wide range of hardware platforms and build tooling. Conceptually, there are two layers:

  • Platform
    • Hardware
      • Generic x86-64 (initial target)
      • Generic AArch64 (initial target)
      • Apple Silicon Mac
      • Some specific server SKU (e.g. AWS EC2 p5.48xlarge)
      • Some internal embedded systems platform
    • Supervisor
      • Operating system (initial target)
      • OCI container runtime (initial target)
      • Hypervisor
  • CI/CD Interface
    • GitLab Runner (initial target)
    • GitHub Actions
    • CircleCI
    • …

Nix is the tool of choice for managing the supervisor. Here are some example supervisors and how Nix slots in:

  • Operating System (for the GitLab Runner instance executor)
    • NixOS
      • Scripts run in a Bash shell on NixOS.
      • Managed with NixOS configurations.
    • Ubuntu
      • Scripts run in a Bash shell on Ubuntu.
      • Managed with system-manager configurations.
    • macOS
      • Scripts run in a Bash shell on macOS.
      • Managed with nix-darwin configurations.
  • OCI container runtime (for the GitLab Runner Docker autoscaler executor)
    • Linux + runc + Docker Engine
      • Scripts run in a Linux container.
      • NixOS under the hood.
    • Linux + kata + Docker Engine
      • Ditto.
    • Linux + Podman
      • Ditto.

CI/CD job scripts run on the supervisor may use a variety of build tooling, such as:

  • Purely OCI containers (don’t use Nix at all).
    • May use Docker Compose or local Kubernetes clusters, so the Docker daemon or CRI-O systemd root units are needed.
  • Nix for only build/development shells.
    • CI/CD jobs use nix develop --command bash -c "{build command}".
  • Nix as the build system.
    • CI/CD jobs use nix build .#{package}.
  • Nix wrappers (e.g. devbox, devenv, flox).
  • Nested virtualization (e.g. QEMU, NixOS VM test infrastructure).

OCI container workflows and PCIe device testing are driving most of the complexity here. These are best run with OS supervisors rather than some kind of (nested) virtualization scheme.

On the container side, this is to avoid things like Docker-in-Docker (DinD) which is an unpleasant experience (e.g. local dev talks to localhost:{port} for containers, but CI/CD scripts have to talk to docker:{port} since containers are nested in a sidecar Docker container).

On the PCIe device testing side, this is to avoid setting up PCIe device passthrough (e.g. containerized GPU workloads).

Since OS supervisors are the least painful to use and bring-your-own-disk-image isn’t an option, we’re likely stuck with host configuration overlays.

I’d personally take a step back and check if centralizing CI for such wide use cases makes sense, and even investigate if GitLab CI itself makes sense in this case.

That said, this is very much veering towards the kind of support consultancies will ask incredible amounts of money for, so without investigating the XY problem too much further, yeah, something like the wonky approach I described is your only option to achieve what you’re asking for. You can probably vary this somewhat, but the experience is unlikely to be nice either way.
