I/O & CPU scheduling, jobs & cores... and performance, baby

I'm currently playing around with cores and max-jobs a bit myself.
During massive recompiles (like enabling CUDA or setting the host architecture to get better GCC optimization) a high job count is great, but on a processor with 32 hyperthreads it comes with a huge implied RAM requirement. I usually saw 16 build jobs running with 32 threads (ninja -j 32) each. Even 60 GB RAM + 100 GB swap couldn't fit everything and my build crashed.
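For scale: 16 jobs × 32 threads means up to 512 compiler processes running at once, and at a rough guess of a few hundred MB to a GB per C++/CUDA compile that can easily exceed the ~160 GB of RAM plus swap combined.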

After trying a few different settings I landed on 8 max-jobs with 16 cores each, which seems to work well most of the time. One observation from the failed attempts: as soon as swap gets involved, performance and CPU utilization plummet.
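For reference, these are the two knobs I'm talking about, set in nix.conf (the values are just what worked for me on this machine):

```
# /etc/nix/nix.conf
max-jobs = 8   # build at most 8 derivations in parallel
cores = 16     # each build may use up to 16 cores (exposed to builders as NIX_BUILD_CORES)
```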

Basically we want to maximize CPU usage during the build, while keeping at least a certain amount of concurrency to make use of other resources (network, disk). Setting a fixed number seems suboptimal: sometimes my CPU sits idle because none of the build processes is using it, and sometimes even with 8×16 my RAM gets overloaded and swapping starts.

That leads me to think it might be better to make these values dynamic and write a control heuristic that looks at recent resource usage and throttles new build tasks when RAM gets too full. As a first step it could simply stop new jobs from spawning when it detects memory pressure. That way we could set max-jobs to a deliberately generous value and get good utilization and fast builds without overloading the system too much.
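To make the idea a bit more concrete, here is a minimal standalone sketch of such a throttle (Python, not Nix code; the 8 GiB threshold, the polling interval and the /proc/meminfo check are just assumptions for illustration):

```python
#!/usr/bin/env python3
"""Hypothetical sketch of a RAM-aware admission check for new build jobs.

Not part of Nix; it only illustrates the heuristic: let the job count grow
towards a generous max-jobs, but refuse to start another job while available
memory is below a safety threshold.
"""
import time


def mem_available_gib() -> float:
    """Read MemAvailable from /proc/meminfo (Linux only) and return it in GiB."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) / (1024 * 1024)  # value is in kB
    raise RuntimeError("MemAvailable not found")


def may_start_new_job(min_free_gib: float = 8.0) -> bool:
    """Admit a new build job only if at least min_free_gib is still available.

    min_free_gib is an arbitrary example threshold; a real implementation
    would probably derive it from total RAM and recent per-job usage.
    """
    return mem_available_gib() >= min_free_gib


def wait_for_headroom(poll_seconds: float = 5.0) -> None:
    """Block (instead of spawning a job) until the system has headroom again."""
    while not may_start_new_job():
        time.sleep(poll_seconds)


if __name__ == "__main__":
    wait_for_headroom()
    print("enough free RAM, a new build job could be started now")
```

A real version would of course live inside the scheduler itself rather than poll from the outside, but the core decision would look roughly like this.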

Is there prior work on something like this? Otherwise I might try my hand at it at some point; it should probably become a new function in nixd.

Taking this further would lead to a scheduler that intelligently queues the jobs that make the best use of whatever resources are currently free, but that isn't really realistic.
