Managing Nix build resources with cgroups?

I’m wondering if it’s possible to limit the system resources of a Nix build via cgroups or something similar. My use case is building lots of things in parallel in continuous integration. I’d like all the parallel builds to have a maximum amount of CPU time and/or memory.

Is this possible, considering the user running nix-build isn’t the one that actually performs the build? The Nix daemon assigns the build to one of the nixbld* users, which is chosen at random, so how can I constrain a Nix build’s resources?

I’m really trying to avoid using containers or virtual machines in this scenario, since a lot of the builds involve NixOS tests and I don’t want nested virtualization.


There are a few options you can tweak on a NixOS system:

nix.buildCores # number of cores per build, think `make -j N`
nix.maxJobs # number of builds that can be run in parallel

If you have a pool of compute resources, you can generally configure buildCores and maxJobs for each machine, and then assign tasks to a specific category of machine. More info can be found at Distributed build - NixOS Wiki.
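For example, a minimal sketch of these two knobs in a NixOS configuration (the values here are purely illustrative):

```nix
# configuration.nix — illustrative values, tune for your hardware
{
  nix.buildCores = 4; # exported to builders as NIX_BUILD_CORES
  nix.maxJobs = 2;    # at most 2 derivations built concurrently
}
```

With these settings a fully loaded machine would run at most 2 builds × 4 cores each, though as noted below, not every build respects NIX_BUILD_CORES.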

I think those are the only knobs I can tweak right now.

Unfortunately not all builds respect NIX_BUILD_CORES, and that doesn’t cover memory usage, so I’ll probably keep searching for ways to use cgroups.

I’m not aware of a way to limit memory usage for builds. You can set the initial GC heap size for the Nix utilities themselves, but builds generally take much more memory than that.

NIX_BUILD_CORES is up to the maintainer to enable. For some packages, it can be detrimental. For example, bash-completion will fail with NIX_BUILD_CORES=128.

I vaguely remember that @Mic92 presented a demo of Nix builds with cgroups at NixCon two or three years back. Having cgroup-constrained builds in mainline Nix would be a killer feature. In the current version, any checks (or other programs executed during the build) that are parallelized, for example with OpenMP, will just use all available cores on the builder. On a machine with a large core count that may actually be slower than running on a single core.

If you want to constrain the nix-daemon (i.e. all builds, not individual ones) using cgroups (via systemd) you could use something like this:

systemd.services.nix-daemon.serviceConfig = {
  MemoryHigh = "5G";
  MemoryMax = "6G";
};

There are a lot of options (see e.g. systemd.resource-control and systemd.exec).
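As a sketch of how those additional options combine, here is the same serviceConfig extended with a CPU cap (the limits are illustrative; CPUQuota is documented in systemd.resource-control):

```nix
{
  systemd.services.nix-daemon.serviceConfig = {
    MemoryHigh = "5G";  # throttle allocations above this
    MemoryMax = "6G";   # hard limit; the OOM killer steps in above this
    CPUQuota = "400%";  # at most 4 CPUs' worth of time across all builds
  };
}
```

Note this still constrains the daemon (and thus all builds collectively), not any individual build.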

If you instead want to constrain individual builds, it’ll be more difficult and will probably require patching Nix (but I haven’t looked into this).


I did a talk about the Nix sandbox. The Nix sandbox does not make use of cgroups. A first step would be to create one cpu cgroup per builder process to ensure some fairness between build jobs. However, if you have a CI job using many build jobs, that won’t help much. Right now I don’t see any option other than using one container per user, each with a dedicated nix-daemon.

I’ve seen this recently: https://github.com/NixOS/nix/pull/3600. I have little knowledge of systemd-cgroup, but to me it sounds like this would be a step in the right direction.

A first step would be to create one cpu cgroup per builder process to ensure some fairness between build jobs.

You could also create a systemd slice for each nixbld user (user-NIXBLD1_ID.slice), and put the resource limits there, no?
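An untested sketch of what that might look like, assuming the first build user nixbld1 has uid 30001 (the default range on NixOS). Whether processes forked by the nix-daemon are ever actually placed into these per-user slices is exactly the uncertain part, since they are not logind sessions:

```nix
{
  # Hypothetical: a resource-limited slice for the uid of nixbld1.
  # It is unclear whether nix-daemon's builder processes would
  # ever be moved into this slice in practice.
  systemd.slices."user-30001" = {
    sliceConfig = {
      MemoryMax = "2G";
      CPUQuota = "100%";
    };
  };
}
```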

Systemd applies slices via logind as far as I know. I don’t think this would work for the nix-daemon service.

I wonder if it would be possible to use libcgroup’s cgrulesengd, with each nixbld user assigned to its own cgroup via cgrules.conf (as described here).
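A rough sketch of what such a cgrules.conf might contain; the cgroup names here are made up, and the destination cgroups (with their cpu/memory limits) would have to be created separately, e.g. via cgconfig:

```
# /etc/cgrules.conf — hypothetical example
# <user>    <controllers>    <destination cgroup>
nixbld1     cpu,memory       nixbuilds/build1
nixbld2     cpu,memory       nixbuilds/build2
```

cgrulesengd would then move each builder’s processes into the matching cgroup as they are spawned under that uid.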