I/O & CPU scheduling, jobs & cores... and performance, baby

hello everyone!!! :sweat_smile:

im back at yall again with yet another (noob) question! :crazy_face:

so, before i get a crazy computer for doing all the server and (remote) building stuff, i was thinking… how do i maximise the I/O scheduling performance of the nix builders right now?

so, ive seen(1) some(2) posts(3) about these fairly new little options(4):

nix.daemonIOSchedClass
nix.daemonCPUSchedPolicy
nix.daemonIOSchedPriority
nix.settings.cores
nix.settings.max-jobs

…and they’ve been preset automatically, meaning… there’s more juice left! :smiling_face_with_horns: but i have no idea how to tweak it correctly! i know, i know, you’re gonna tell me not to, but i wanna! :zany_face:

for example, i have a laptop. i game on it. but! i dont care about battery life (or have it disconnected and it is always on AC), as in, im always on that "performance" (CPU) scheduling. what must one do to make the build process go fast?

i may be wrong, but here’s what i calculated to be the best setup for a 4 cores, 8 threads computer:

nix.settings.cores = 4;
nix.settings.max-jobs = 1;

wait. that cant be right, can it? did i miss something, or…? …oh. i mean it kinda does make sense, but at the same time it doesnt? so, since most jobs (in nix terms its called a derivation, apparently) cant use a lot of cores simultaneously (to build stuff) anyway, i guess its fine, especially if you have like.. 8, 12 at best, of them? do they also count towards threads or only logical cores? so… i should be having just 1 simultaneous build task at ONCE? but… right now, with the default settings, i have 4? the buildPhase thing, right? i have 4 of these at the same time, which is pretty slow for me… one buildPhase at once would fit me better, so, once again, i am very confused: what values should those settings be (not just cores and max-jobs, but also the scheduling thingies)?

EDIT: oh btw speaking of concurrent tasks. is it possible somehow to change the amount of SIMULTANEOUS DOWNLOADS per build/update? because with my network, the fastest way to download things is to NOT “parallel” it, but rather to download them one by one, and that’d sort it out, i think! thanks in advance!!!

thank you guys for any help in advance :sob: :sob: and again sorry for this silly question that is clearly described in the manual…

To save everyone else the trouble of looking: there is a page at Nix Evaluation Performance - NixOS Wiki but it has nothing relevant!

finally, a page that describes what it is like to have ADHD :crazy_face:

anyway, lets get serious a bit. LOTS OF QUESTIONS. and rambling. you love it.

in the manual,

there is… nothing new. um, i thought it would be different than what’s already described in the nixos search engine:

which isnt a great start (for me) to, uh, understand whats going on. well…

btw, the “declared in” links are 404? (in the manual)

snip. sorry, i was yapping/rambling a bit about how the manual is a bit confusing. but im gonna do it anyway, so imagine a situation: you are a total newbie and your little laptop is struggling with nix derivation thingies and/or build processes. time to google, haha! you find this:

and then the manuals. you find the nix daemon options. then, you must figure out the difference between "idle", "other", "batch" and "best-effort". but only "idle" appears twice (both for I/O and CPU schedulers) and those other strings are for the scheduling policy of the CPU. so, it says that "idle" is for interactively used computers, UGH! this should’ve been just phrased something like “for actively used (mobile) computers during updates or configurations with system.autoUpgrade.* enabled” or something. sorry. okay, you then switch to "idle" for responsiveness, just for the laptop to not overheat and die…

nix.daemonCPUSchedPolicy = "idle";
nix.daemonIOSchedClass = "idle";
nix.daemonIOSchedPriority = 7; # doesnt work with "idle" anyway
^ this worsens building speed, improves system performance (while building)

and the opposite would be…

nix.daemonCPUSchedPolicy = "other"; # default
nix.daemonIOSchedClass = "best-effort"; # default
nix.daemonIOSchedPriority = 0;
^ this improves building speed, worsens system performance (while building)

and finally! what about these?

nix.settings.cores
nix.settings.max-jobs

well, once again, according to this page:

edit: what a weird url. surely nothing bad will happen with this dynamically changing link…

…i think those options are just a matter of preference, is it not?

by default, nix.settings.cores is 0, which is to use all available LOGICAL (not threads) cores, but… then it says this:

Some builds may become non-deterministic with this option; use with care!

what does THAT mean? …has anyone had this problem with non-determinism in a declarative system? that is a CRAZY sentence, goes hard. okay, maybe someone has, oh well, anyway…

Packages will only be affected if enableParallelBuilding is set for them.

no idea what this is. do they mean that this option will work ONLY IF the package, that is NOT in the binary cache, will use the specified amount of cores, which has enableParallelBuilding inside of it? uhm, why would it not have that? im just curious, please do tell! :sob:

okay, so you do all that and… oh. you have to… rebuild the configuration… while still using the previous schedulers… yeah.

oh, but this is interesting:

so, cores and max-jobs are multiplied? so, i will get 4 parallel processes in the end, even if max-jobs is 1? thats cool and all, but… WHAT IS A “PARALLEL PROCESS” AND A “NON-INTERACTIVE TASK”?! sorry…

is it the buildPhase thing, the ACTUAL building process, or is it the downloading from binary cache, unpacking, whatever else is happening when you rebuild, all of the above? which is which, how and when, I/O or CPU, or all of the above???

im just gonna hit “reply” and hope for the best :crazy_face:

EDIT: i shortened the yapping parts :3

You got this one backwards. A hyper-threaded processor, advertised as, say, having 8 cores and 16 threads (2 threads per core) has 8 physical cores, but 16 logical cores, i.e. “hardware threads”. A Linux system will mostly report this as “16 cores”, without distinction. htop will show individual load for 16 cores, /proc/cpuinfo will list 16 cores. See also this question.

My understanding is that this refers to the fact that certain compilers or compiling certain codebases may not produce bit-for-bit identical output, or compilation may outright fail when you allow them to compile several parts of the code in parallel, but will produce a consistent result if you compile stuff sequentially.

enableParallelBuilding seems to be the default for any package that uses stdenv (which is the majority of packages in nixpkgs). Notably, this issue lists a bunch of issues that were previously present when building certain packages in parallel. You can also use GitHub code search to find packages where parallel building is still explicitly disabled due to issues.

I think the documentation you linked is pretty clear. max-jobs defines how many derivations will be built in parallel (at the same time) - the Nix daemon itself takes care of this. cores defines how many “cores” may be utilized while building each one of those derivations, and it is up to the derivation to respect this (e.g. by running make -j$NIX_BUILD_CORES - the cores setting is exposed to the derivation as the NIX_BUILD_CORES environment variable).
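To see this from inside a build, here is a minimal sketch. The file name and the derivation name `cores-demo` are made up for illustration, and it assumes `<nixpkgs>` is available on your NIX_PATH. It does nothing except print the value the builder receives:

```nix
# cores-demo.nix (hypothetical file name)
{ pkgs ? import <nixpkgs> { } }:

pkgs.runCommand "cores-demo" { } ''
  # Nix exports NIX_BUILD_CORES into the build environment.
  # Nothing enforces it: a build tool only honours the limit if the
  # builder passes it on, e.g. make -j$NIX_BUILD_CORES.
  echo "this build may use $NIX_BUILD_CORES cores"
  touch $out
''
```

Building it with something like `nix-build cores-demo.nix --cores 2` should show the chosen value in the build log, which is a quick way to check what a given cores setting actually hands to a derivation.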

The funny thing I realized is that NixOS sets both max-jobs and cores to auto or 0, meaning they will both default to the number of logical cores. So for a CPU with 16 threads you will potentially have 16 * 16 = 256 build processes. There’s been discussion about improving this, but there’s no consensus yet, it seems.
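One way to keep that product bounded is to pick both values so that max-jobs * cores stays at the number of hardware threads. This is only a sketch under my own assumptions: the numbers 16 and 4 are placeholders, not a recommendation, and `threads` and `jobs` are just local names.

```nix
{ ... }:

let
  threads = 16; # assumption: logical cores reported by the system
  jobs = 4;     # assumption: how many derivations to build at once
in
{
  nix.settings.max-jobs = jobs;
  # Integer division, so 16 / 4 gives 4 cores per derivation
  # and at most 4 * 4 = 16 build processes in the worst case.
  nix.settings.cores = threads / jobs;
}
```

Whether fewer jobs with more cores each, or the other way round, works better depends on what you build, since many derivations cannot use more than one core anyway.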

Looks like max-substitution-jobs is what you need. I don’t blame you for not immediately realizing that substitute roughly means “download from a cache” in Nix land.
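As a NixOS configuration fragment that could look like the sketch below. It assumes your Nix version is recent enough to know the setting, and the value 1 is simply what the question asked for (one download at a time):

```nix
{ ... }:

{
  # Limit how many substitutions (downloads from a binary cache)
  # run at the same time. 1 means strictly one after another.
  nix.settings.max-substitution-jobs = 1;
}
```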

It’s not. But as https://nixos.org/manual/nixpkgs/unstable/#var-stdenv-enableParallelBuilding mentions:

Unless set to false, some build systems with good support for parallel building including cmake, meson, and qmake will set it to true.

So it’s usually parallelised anyway.
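For a plain Makefile project you would opt in yourself. A minimal sketch, assuming a hypothetical package called `my-tool` whose source directory contains a Makefile with an install target:

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.stdenv.mkDerivation {
  pname = "my-tool"; # hypothetical package name
  version = "0.1";
  src = ./.;         # assumption: a Makefile lives next to this file

  # With this set, the generic build phase runs make with
  # -j$NIX_BUILD_CORES instead of building sequentially.
  enableParallelBuilding = true;

  # Assumption: the Makefile installs relative to PREFIX.
  makeFlags = [ "PREFIX=$(out)" ];
}
```

Leaving the attribute out (or setting it to false) is how packages with known parallel-build problems stay sequential regardless of the cores setting.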

wowie!!! :exploding_head: :exploding_head: thanks thanks!! really, a lot. but uhm, what about the other 3 options i mentioned?

nix.daemonIOSchedClass
nix.daemonCPUSchedPolicy
nix.daemonIOSchedPriority

did i get that right, when i said

or am i confusing actual compilers like glibc, gcc, etc. and nix “derivation” builders? are those two different things?

:frog:

OH! ALSO. wait wait wait, one more thing

what about nix.daemonCPUSchedPolicy’s

This policy propagates to build processes. other is the default scheduling policy for regular tasks. The batch policy is similar to other, but optimised for non-interactive tasks.

what exactly are “regular tasks” and “non-interactive tasks”? like a [Y/n] or..?

THANKS IN ADVANCE!!!

:snowflake:

EDIT: but wait! there’s more!

services.system76-scheduler.enable = true;
services.system76-scheduler.assignments = {
  nix-builds = {
    nice = 10; # from -20 (high) to 19 (low)
    class = "batch"; # "idle", "batch", "other", "rr", "fifo"
    ioClass = "idle"; # "idle", "best-effort", "realtime"
    matchers = [
      "nix-daemon"
    ];
  };
};

p.s. sorry, cant stop yapping :stuck_out_tongue:

edit2: btw nobody mentioned this?

nixpkgs.config.enableParallelBuildingByDefault

this is literally enableParallelBuilding, but for the entire system lol

edit3 (2026 feb): OOOH check this out

https://wiki.nixos.org/wiki/Distributed_build

i have NO idea why this never appeared in my search results, ffs. this is exactly what i was looking for LOL! this is still not the solution, though, cos i still have a few more questions (sorry): are there any… drawbacks (?) from compiling for/on not-original-hardware? like, for example, you download some binary for the correct architecture/platform, but then it doesnt work or its performance is unusually slow, just because it wasnt built on specifically your machine? you know what i mean? not sure if this kind of situation has a name? i guess, what im trying to say is, how to make sure that what i am about to compile will work hardware-insensitive(ly)?

should i even worry about any of this stuff orrr… i dunno, theres just so many build flags and arguments, im just curious if nixpkgs themselves are built utilising some of this… oh, wait, nvm

edit4 (a few months later): oh. so thats what that is! yeah, i knew this, pft, obviously

That’s because building is very different from evaluation.

I’m currently playing around with cores and max-jobs a bit myself.
During massive recompiles (like enabling cuda or setting the host architecture to get better gcc optimization) a high job count is great, but running on a processor with 32 hyperthreads carries a gigantic implied requirement for RAM. I usually saw 16 build jobs running with 32 threads (ninja -j 32) each. Even 60GB RAM + 100GB swap couldn’t fit everything and my build crashed.

After trying a few different settings I landed on 8 max-jobs with 16 cores each, which seems to be mostly great. One observation in the failed attempts was that as soon as the swap starts to get involved, performance and cpu utilization plummets.
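Written out as configuration, that setup is roughly the following. The numbers are the ones from this post and fit a 32-thread machine with about 60GB of RAM; they are a starting point to tune, not a general answer:

```nix
{ ... }:

{
  nix.settings = {
    max-jobs = 8; # derivations built at the same time
    cores = 16;   # NIX_BUILD_CORES handed to each of them
  };
}
```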

Basically we want to maximize CPU usage in the build process, while having at least a certain amount of concurrency to make use of other resources (network, disk). Setting a fixed number seems to be suboptimal: sometimes my CPU gets bored because none of the build processes uses it, and sometimes even with 8x16 my RAM gets overloaded and swapping starts.

That leads me to think that it might be better to make these values dynamic, and write a control heuristic that looks at the recent resource usage and throttles new build tasks if the RAM gets too full. As a first step, it could stop new jobs from spawning if it detects overloaded RAM. That way we could set max-jobs to a higher-than-good value, have good utilization and fast building while not overloading the system too much.

Is there prior work on something like this? Else I might try my hand at it at some point, should probably become a new function in nixd.

Thinking this further would lead to a scheduler that intelligently queues jobs that make best use of the resources that are currently free, but that isn’t really realistic.

We used to have a poor man’s limiting until a few years ago:

I think in future the best shot will build on

(as you just discovered; lately I haven’t been following these topics really, though)
