My remote building experience has massively degraded since around October-November 2021, so much so that the number of pull requests I review has shrunk by a large margin.
And I always wonder if I’m the only one experiencing these issues, because I don’t see many other people complaining about this situation. Consequently, I don’t see much attention given to the following issues.
I’ve tried multiple combinations of nix versions and build machines for a while now and all of them yield frustrating results.
- Nix 2.3.16 (my current setup on all machines, my workstations use nixos-unstable, the remote builders 21.11)
  - Builds get randomly stuck in nix-store --serve --write

        # ps auxf
        [...]
        root     18595  0.0  0.0  12484  6608 ?        Ss   01:28   0:00  \_ sshd: root@notty
        root     18597  0.0  0.0 617664 17312 ?        Ssl  01:28   0:00      \_ nix-store --serve --write
        # strace -p 18597
        strace: Process 18597 attached
        read(0,

  - Builds quickly error out with error: unexpected end-of-file
- Nix 2.4/2.5
  - Builds quickly error out with SIGABRT (signal 6)
- Mixed setups (eg. 2.3 vs 2.4) often result in segfaults on the local end (IIRC)
- Also tried out both ssh:// and ssh-ng://, both of which yield different but ultimately unusable results
Anything but trivial builds basically needs a helper function like this, so builds are retried until they work.
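succeed () {
    # Retry the given command until it exits successfully.
    while true
    do
        # Quote "$@" so arguments containing spaces survive intact.
        "$@" && break
    done
}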
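A bounded variant of the loop above avoids spinning forever when a builder is permanently broken. This is only a sketch: the name `retry` and its two leading arguments (maximum number of attempts, delay in seconds between attempts) are made up for illustration and are not part of Nix.

```shell
# Hypothetical helper: retry a command a limited number of times.
# Usage: retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
retry () {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while true
    do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        if [ "$attempt" -ge "$max_tries" ]
        then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

Something like `retry 5 10 nix-build '<nixpkgs>' -A hello` would then try the build at most five times, pausing ten seconds between attempts, and return a non-zero status if none of them succeeds.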
The situation has become unworkable for me. Last night I queued builds for my server closures only to find out it was stuck for over 8 hours doing nothing, having finished no server closure at all.
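A retry loop does not help with a build that hangs instead of failing, which is the case described above. One possible stopgap is to put a deadline on each attempt. The sketch below assumes `timeout` from GNU coreutils is available; the function name `build_with_deadline` is hypothetical.

```shell
# Hypothetical wrapper: abort a command that runs longer than LIMIT.
# Usage: build_with_deadline LIMIT COMMAND [ARGS...]
# LIMIT is passed to timeout(1) unchanged, e.g. 90 (seconds), 30m or 2h.
build_with_deadline () {
    limit=$1
    shift
    status=0
    # Send SIGTERM at the deadline, and SIGKILL 30 seconds later
    # if the process is still alive.
    timeout -s TERM -k 30 "$limit" "$@" || status=$?
    # GNU timeout exits with 124 when the deadline was hit.
    if [ "$status" -eq 124 ]
    then
        echo "timed out after $limit: $*" >&2
    fi
    return "$status"
}
```

Because the wrapper returns a non-zero status on timeout, it composes with the retry helper, e.g. `succeed build_with_deadline 2h nix-build '<nixpkgs>' -A hello`. Note that `timeout` can only run external commands, so the deadline has to go inside the retry loop rather than around the shell function.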
Building things locally works just fine. On all of these machines.
I have so much capacity to build things that I can’t put to good use; it’s so very frustrating. This is very much a high-level rant, because I’m trying to gauge how many others are affected by these kinds of problems.
My remote builder config looks like this:
{
  nix.buildMachines = [ {
    hostName = "remoteserver";
    sshUser = "ssh://hexa";
    sshKey = "/home/hexa/.ssh/id_remotebuild";
    systems = [
      "x86_64-linux"
      "i686-linux"
    ];
    maxJobs = 32;
    speedFactor = 4;
    supportedFeatures = [ "big-parallel" "kvm" "nixos-test" "benchmark" ];
    mandatoryFeatures = [ ];
  } {
    hostName = "homeserver";
    sshUser = "ssh://root";
    systems = [
      "x86_64-linux"
      "i686-linux"
    ];
    maxJobs = 4;
    speedFactor = 4;
    supportedFeatures = [ "kvm" "nixos-test" ];
    mandatoryFeatures = [ ];
  } {
    hostName = "aarch64-builder";
    sshUser = "ssh://root";
    sshKey = "/home/hexa/.ssh/id_ed25519";
    system = "aarch64-linux";
    maxJobs = 1;
    speedFactor = 3;
    supportedFeatures = [ "big-parallel" ];
    mandatoryFeatures = [ ];
  } ];
  nix.distributedBuilds = true;
  nix.extraOptions = ''
    builders-use-substitutes = true
  '';
}
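For anyone comparing setups, a quick sanity check is to confirm that each builder from this config answers over plain SSH and has `nix-store` on its non-interactive PATH. The sketch below assumes the `ssh://` prefix in `sshUser` only selects the protocol, so the actual logins are `hexa@remoteserver`, `root@homeserver` and `root@aarch64-builder`. The function name `check_builder` is hypothetical. It does not reproduce the hang; it only rules out basic connectivity problems.

```shell
# Hypothetical helper: verify that a builder is reachable without a
# password prompt and report the Nix version it runs.
# Usage: check_builder USER@HOST [PATH_TO_PRIVATE_KEY]
check_builder () {
    target=$1
    key=${2:-}
    if [ -n "$key" ]
    then
        # BatchMode makes ssh fail instead of prompting; ConnectTimeout
        # bounds the wait for an unreachable host.
        ssh -o BatchMode=yes -o ConnectTimeout=10 -i "$key" "$target" nix-store --version
    else
        ssh -o BatchMode=yes -o ConnectTimeout=10 "$target" nix-store --version
    fi
}
```

For example, `check_builder hexa@remoteserver /home/hexa/.ssh/id_remotebuild` or `check_builder root@homeserver`. When builds go through the Nix daemon, the connection is made by root, so the check is most representative when run as root as well.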
My local nix.conf looks like this; the remote builders don’t have any special config.
# WARNING: this file is generated from the nix.* options in
# your NixOS configuration, typically
# /etc/nixos/configuration.nix. Do not edit it!
build-users-group = nixbld
max-jobs = 4
cores = 0
sandbox = true
extra-sandbox-paths =
substituters = https://cache.nixos.org/
trusted-substituters =
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY=
auto-optimise-store = true
require-sigs = true
trusted-users = root hexa
allowed-users = *
system-features = nixos-test benchmark big-parallel kvm
sandbox-fallback = false
keep-outputs = true
keep-derivations = true
builders-use-substitutes = true