Nix-build: profiling memory and cpu usage

why is nix-build using too many cpu cores?
why is nix-build failing with random errors?
is nix-build running out of memory?

lets try to answer these questions

stress-test.nix

a build file to waste some cpu and memory

# stress-test.nix
# nix-build ./stress-test.nix

{ pkgs ? import <nixpkgs> { } }:

let
  nix-build-profiler = (pkgs.callPackage (pkgs.fetchFromGitHub {
    owner = "milahu";
    repo = "nix-build-profiler";
    rev = "60efb7314532fc21b5b67518f50fabfb71d10740";
    sha256 = "sha256-zeSasab1jA34To9bd995HPBf+BKLqfL98xVVt9AAXMQ=";
  }) { });
in

pkgs.stdenv.mkDerivation {
  name = "stress-test";
  dontUnpack = true; # no source

  nativeBuildInputs = with pkgs; [
    stress-ng
    nix-build-profiler
  ];

  # https://www.mankier.com/1/stress-ng#Examples
  buildPhase = ''
    (
      set -x
      echo stressing 90% of $NIX_BUILD_CORES of $(nproc) cores
      stress-ng --cpu-method all --cpu-load 90 --timeout 10s --cpu $NIX_BUILD_CORES
      echo stressing 90% of $(nproc) of $(nproc) cores
      stress-ng --cpu-method all --cpu-load 90 --timeout 10s --cpu 0
      echo stressing 80% of available memory
      stress-ng --vm 8 --vm-bytes 80% --timeout 10s
    )
    exit 1 # force rebuild
  '';
}

save it as stress-test.nix and run nix-build ./stress-test.nix
now you should see the output of nix-build-profiler

nix-build-profiler

GitHub - milahu/nix-build-profiler: profiling cpu and memory usage of nix-build

this is a python script based on psutil, running inside the nix-build sandbox.
every second, it dumps the process tree to the build log

the output of nix-build-profiler is interleaved with the build log.
advantage: we see performance side-by-side with the build status

example output

example build:

nix-build -A qt6.qtwebengine

problems:
htop shows a load of 130…150 but should be 32.
random build errors with export NINJAFLAGS="-j32 -l32"

hypothesis: nix-build is running out of memory

load  mem rss  proc @ 2022-06-24 15:13:15 +0000
127.9   1   4G bash info={'child_procs': 2, 'cmdline': ['bash', '-e', '/nix/store/9krlzvny65gdc8s7kpb6lkx8cd02c25b-default-builder.sh'], 'cwd': '/build/qtwe>
127.9   1   4G   ninja info={'child_procs': 15, 'cmdline': ['ninja', '-j32', '-l32']}
  1.0   0  36M     g++ src/webenginequick/ui/CMakeFiles/WebEngineQuickDelegatesQml.dir/webenginequickdelegatesqml_qmltyperegistrations.cpp.o -o src/webengin>
  0.9   0  36M     g++ src/webenginequick/ui/CMakeFiles/WebEngineQuickDelegatesQml.dir/.rcc/qmlcache/WebEngineQuickDelegatesQml_qmlcache_loader.cpp.o -o src>
  0.9   0  36M     g++ src/webenginequick/ui/CMakeFiles/WebEngineQuickDelegatesQml.dir/.rcc/qmlcache/WebEngineQuickDelegatesQml_AlertDialog_qml.cpp.o -o src>
  0.9   0  36M     g++ src/webenginequick/ui/CMakeFiles/WebEngineQuickDelegatesQml.dir/.rcc/qmlcache/WebEngineQuickDelegatesQml_AuthenticationDialog_qml.cpp>
115.5   1   3G     sh info={'child_procs': 1, 'cmdline': ['sh', '-c', 'cd /build/qtwebengine-everywhere-src-6.3.1/build/src/gn && /nix/store/vrsv842wx54vknd>
115.5   1   3G       cmake info={'child_procs': 1, 'cmdline': ['cmake', '--build', '.']}
115.5   1   3G         ninja info={'child_procs': 1, 'cmdline': ['ninja']}
115.5   1   3G           ninja info={'child_procs': 130, 'cmdline': ['ninja', '-C', '/build/qtwebengine-everywhere-src-6.3.1/build/src/gn/Release', 'gn'], '>
  0.9   0  25M             g++ -c ../../../../src/3rdparty/gn/src/base/command_line.cc -o src/base/command_line.o
  0.9   0  25M             g++ -c ../../../../src/3rdparty/gn/src/base/environment.cc -o src/base/environment.o
  1.0   0  24M             g++ -c ../../../../src/3rdparty/gn/src/base/files/file.cc -o src/base/files/file.o
  1.0   0  25M             g++ -c ../../../../src/3rdparty/gn/src/base/files/file_enumerator.cc -o src/base/files/file_enumerator.o
  0.9   0  24M             g++ -c ../../../../src/3rdparty/gn/src/base/files/file_path.cc -o src/base/files/file_path.o
  0.8   0  23M             g++ -c ../../../../src/3rdparty/gn/src/base/files/file_path_constants.cc

for gcc and g++, most arguments are hidden, to make the process tree more compact

the whole build process uses only about 1% memory
(rss 4G / total 256G = 1.56%)
so the “out of memory” hypothesis seems false

ninjas are overloading

the total load is around 128 on a 128 core machine
but $NIX_BUILD_CORES is 32 …
so the build is using too many cpu cores

we see 2 ninja procs. the first ninja has 15 child processes …

ninja info={‘child_procs’: 15, ‘cmdline’: [‘ninja’, ‘-j32’, ‘-l32’]}

… and the second ninja has 130 child processes

ninja info={‘child_procs’: 130, ‘cmdline’: [‘ninja’, ‘-C’, ‘/build/qtwebengine-everywhere-src-6.3.1/build/src/gn/Release’, ‘gn’],

… because the flags -j32 -l32 were not passed to the second ninja

when ninja calls ninja, they dont negotiate cpu limits
because the default ninja version does not implement the gnumake jobserver protocol

solution:
use ninja-tokenpool with jobclient and jobserver support (via)
and start the main build process with ninja --tokenpool-master

builders are overloading

when a builder wants to run multiple processes or threads in parallel (2 or more)
then it should use the gnumake-tokenpool protocol
to negotiate the number of parallel workers with the main build process (jobserver)

before and after

before: build was using up to 128 cores

(todo: graph)

after: max load is 32 == NIX_BUILD_CORES

see also

2 Likes

oops, wrong category. this belongs to Guides - NixOS Discourse