Running nix-shell on a certain shell.nix file on a 16-core Linux system only gives me access to a single CPU. However, when I disable NIX_AFFINITY_HACK I get access to all cores:
$ nix-shell shell.nix --run nproc
$ NIX_AFFINITY_HACK=0 nix-shell shell.nix --run nproc
This doesn’t happen in other shell.nix files. Unfortunately I can’t share the shell.nix at the moment since it’s proprietary. What I can say is that I don’t set
How can a shell.nix set the CPU affinity?
tl;dr: Try using taskset. See the final part of this post for an explanation.
I am a bit hesitant to answer this, as my reply may hide this thread from users with more experience on this topic. But at the same time it can act as a bump, so here it is.
The best entry point I found is issue #2359, "nix run sets affinity to only one CPU, and the shell & further children inherit it".
- With that, you get most of the information you need. The affinity is set by the nix client so that it resides on the same CPU as the nix daemon and shares its caches. Running as root should hence avoid the affinity hack.
- If this affinity setting “leaks” into child processes, this is considered a bug. The aforementioned issue was fixed promptly by Eelco.
- A recommended technique to undo the affinity hack is unlikely to exist, as it is supposed to never leak.
To me, it looks like you have most probably found a bug in nix-shell. What version of nix are you using? What does nproc return outside of nix-shell?
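To compare the affinity inside and outside of nix-shell, something like the following might help. It is only a sketch: allowed_cpus is a name I made up, and it assumes a Linux system where /proc/self/status exposes the Cpus_allowed_list field.

```shell
# Hypothetical helper: print the CPUs the current process may run on.
# Assumes Linux; Cpus_allowed_list is a field of /proc/<pid>/status.
allowed_cpus() {
    # awk inherits the affinity of the calling shell, so reading its own
    # /proc/self/status reports the mask of the shell that called us.
    awk '/^Cpus_allowed_list:/ { print $2 }' /proc/self/status
}

allowed_cpus
```

Run it once in your normal shell and once inside the nix-shell session. If the second run prints a single CPU where the first prints a range, the affinity has leaked into the shell.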
If you want a quick and dirty solution, you could investigate how to use sched_setaffinity from your own program. In bash this means using taskset.
$ nix-shell "<nixpkgs>" -A hello --run nproc
$ taskset -c 1 nproc
$ taskset -c 1 nix-shell "<nixpkgs>" -A hello --run nproc
$ taskset -c 1 nix-shell "<nixpkgs>" -A hello --run "taskset -c 0,1,2,3 nproc"
You can use taskset -c 0-200 for a bit of portability :-). This does not fix the nix bug, but should work around your issue.
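Instead of guessing an upper bound, the CPU list could also be read from the kernel. This is a sketch under assumptions: run_on_all_cpus is a made-up helper name, and it relies on /sys/devices/system/cpu/online being readable and on taskset being installed in the shell where you run it.

```shell
# Hypothetical helper: run a command with its affinity widened to every
# online CPU. Assumes Linux sysfs and the taskset tool from util-linux.
run_on_all_cpus() {
    # The sysfs file holds a list such as "0-15", which is the same
    # list syntax that `taskset -c` accepts.
    taskset -c "$(cat /sys/devices/system/cpu/online)" "$@"
}
```

Calling run_on_all_cpus nproc inside the affected shell should then report the full core count. Like the plain taskset call, this only works around the issue and does not fix it.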
 sched_setaffinity(2) - Linux man page
My colleague debugged this issue further and found the following issue in Nix:
It turns out that the issue ultimately comes from non-reentrant code: the saved affinity backup gets overwritten by the newly set value on the second entry. See https://github.com/NixOS/nix/issues/3345
Now, I wonder why this specific case triggers the issue, but it is not extremely important.
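For anyone wondering what "the backup gets overwritten on second enter" looks like, here is a small simulation. It is not Nix's actual code: the variable and function names are invented, and plain shell variables stand in for the real CPU masks, so nothing here touches the scheduler.

```shell
# Simulation only: strings stand in for the real CPU affinity masks.
current_mask="0-15"   # affinity the process starts with
saved_mask=""         # one single backup slot (the non-reentrant part)

set_affinity_hack() {
    saved_mask="$current_mask"   # back up the mask, overwriting any older backup
    current_mask="0"             # pin to a single CPU
}

restore_affinity() {
    current_mask="$saved_mask"   # restore from the single backup slot
}

set_affinity_hack    # first enter: the backup holds "0-15"
set_affinity_hack    # second enter: the backup now holds "0", the original is lost
restore_affinity
echo "$current_mask" # prints 0, not 0-15
```

After the second call, the restore can only bring back the already-pinned mask, which matches the symptom of a shell stuck on one CPU.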