Only single core in nix-shell

Running nix-shell on a certain shell.nix file on a 16-core Linux system only gives me access to a single CPU. However, when I disable NIX_AFFINITY_HACK, I get access to all cores:

$ nix-shell shell.nix --run nproc
1

$ NIX_AFFINITY_HACK=0 nix-shell shell.nix --run nproc
16

This doesn’t happen in other shell.nix files. Unfortunately I can’t share the shell.nix at the moment since it’s proprietary. What I can say is that I don’t set NIX_AFFINITY_HACK anywhere.

How can a shell.nix set the CPU affinity?

tl;dr: Try using taskset. See the final part of this post for an explanation.

I am a bit hesitant to answer this, as it may shadow this thread from users experienced on this topic. But at the same time it can act as a bump, so here it is.

The best entry point I found is issue #2359:

nix run sets affinity to only one CPU, and the shell & further children inherit it.

  1. With that, you get most of the information you need. The affinity is set by the Nix client so that it resides on the same CPU as the Nix daemon and shares caches. Running as root should hence avoid the affinity hack.
  2. If this affinity setting “leaks” into child processes, this is considered a bug. The aforementioned issue has been fixed promptly by Eelco [1].
  3. A recommended technique to undo the affinity hack is unlikely to exist, as it is supposed to never leak.

To me, you have most probably found a bug in nix-shell. What version of Nix are you using? What does nproc return outside of nix-shell?

If you want a quick and dirty solution, you could investigate how to use sched_setaffinity [2] from your own program. In bash this means using taskset.

$ nproc
4
$ nix-shell "<nixpkgs>" -A hello --run nproc
4
$ taskset -c 1 nproc
1
$ taskset -c 1 nix-shell "<nixpkgs>" -A hello --run nproc
1
$ taskset -c 1 nix-shell "<nixpkgs>" -A hello --run "taskset -c 0,1,2,3 nproc"
4

You can use taskset -c 0-200 for a bit of portability :-). This does not fix the nix bug, but should work around your issue.
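
As a rough sketch of the same workaround applied to an already running shell: instead of wrapping each command, you can re-pin the shell itself so that everything started from it inherits the wider mask. This assumes util-linux taskset and coreutils nproc are installed; the variable name last_cpu is only illustrative.

```shell
# Highest CPU index on this machine; `nproc --all` ignores the current affinity mask.
last_cpu=$(( $(nproc --all) - 1 ))

# Re-pin the current shell (PID $$) to every CPU, if taskset is available.
if command -v taskset >/dev/null 2>&1; then
    taskset -cp "0-${last_cpu}" "$$" >/dev/null
fi

# Print how many CPUs this shell may use now.
nproc
```

Like the taskset calls above, this only widens the mask from inside the shell; it does nothing about whatever narrowed it in the first place.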

[1] https://github.com/NixOS/nix/commit/cc7b4386b16885a22ccabb019381539fecb00230
[2] sched_setaffinity(2) - Linux man page

My colleague debugged this issue further and found the following issue in Nix:
https://github.com/NixOS/nix/issues/3345

It turns out that the issue ultimately comes from non-re-entrant code: the backup of the original affinity gets overwritten by the already-pinned value on the second entry. See https://github.com/NixOS/nix/issues/3345
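
To make the failure mode concrete, here is a simplified model of a save/restore pair with a single backup slot. It is not Nix's actual code: the function names, the variables, and the "0-15" mask are made up for illustration, and no real affinity is changed.

```shell
# Simplified model of a non-re-entrant save/restore pair (hypothetical names).
current="0-15"   # pretend affinity mask of the process: all 16 cores
saved=""         # single global backup slot

set_affinity() {
    saved="$current"   # overwrites any earlier backup
    current="$1"
}

restore_affinity() {
    current="$saved"
}

set_affinity 0      # first entry: the backup holds "0-15"
set_affinity 0      # second entry: the backup is overwritten with "0"
restore_affinity
echo "$current"     # prints 0: the original "0-15" mask is lost
```

In this model, saving the backup only when none exists yet would keep the original mask across the second entry.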

Now, I wonder why this specific case triggers the issue, but it is not extremely important.