Unexpected 11h build after auto update

Greetings,

My laptop, a production machine I rely on heavily, is configured to update automatically every day (at 17:00). Yesterday, for the first time, the update triggered an unexpected local build which took 11h (and several dozen GiB of local disk space). Luckily, it was Friday afternoon, so my work wasn’t affected.

Now I have two questions:

  1. Is there a way to prevent local builds during auto updates and use only packages that are available in the binary cache (something like --force-cached-only)? I can’t afford updates to affect my daily work. (For the time being, I changed my config to
    system.autoUpgrade.dates = "Saturday 09:00";
    to make sure that doesn’t happen in the future. But I would prefer to get updates more regularly. Besides, I would also like to avoid the machine compiling the whole Saturday.)
  2. I don’t even know which package(s) caused the local build. Are there any log files I could inspect?

For the record, the inputs in my system flake.nix look like this:

inputs = {
  nixpkgs.url = "github:NixOS/nixpkgs/nixos-23.11";
  nixpkgs-unstable.url = "github:NixOS/nixpkgs/nixos-unstable";
  home-manager = {
    url = "github:nix-community/home-manager/release-23.11";
    inputs.nixpkgs.follows = "nixpkgs";
  };
  nixos-hardware.url = "github:NixOS/nixos-hardware/master";
};

The unstable branch is only used to pull the latest versions of 1Password and Spotify.

Thanks in advance for your help!

The good thing about today being Saturday is that I’ve got spare time to dig into what exactly happened last night.

(Yesterday, at the time the auto update was scheduled, the System Monitor listed several processes called “clang++”, run by the user “nixbld1”, which caused 100% system load for several hours. At some point I decided to go to bed. When I checked systemctl status nixos-upgrade.service this morning, it reported systemd[1]: nixos-upgrade.service: Deactivated successfully as well as systemd[1]: Finished NixOS Upgrade and something along the lines of Consumed 4d CPU time. The whole process took 10 hours and 55 minutes.)

So, I just compared the latest two generations by doing:

$ nix shell nixpkgs#nix-diff
$ cd /nix/var/nix/profiles/
$ nix-diff $(nix-store -qd system-23-link system-24-link)

which yielded a massive wall of text. Now, I don’t know anything about software development (nor the intricacies of how Nix works) but this part looks suspiciously like it’s referring to what caused the spike in system load:

# ... omitting a wall of text ...
              • The input derivation named `drawio-22.0.3` differs
                - /nix/store/krhjidwiyipr3nvyxmcx224j772yazwz-drawio-22.0.3.drv:{out}
                + /nix/store/jk48m8jshl9a7ng02kzfff4hyvvpwv8x-drawio-22.0.3.drv:{out}
                • The set of input derivation names do not match:
                    - electron-27.2.3
                    + electron-27.3.2
                • The environments do not match:
                    buildPhase=''
                    runHook preBuild
                    
                    yarn --offline run electron-builder --dir \
                      --config electron-builder-linux-mac.json \
                      [--c.electronDist=/nix/store/whnnjxsc9vw18266qis3inak8qv32xc1-electron-27.2.3/libexec/electron-]{+-c.electronDist=/nix/store/q58hmfnssd5jgnjbgmhbbh9l0148ggrx-electron-27.3.2/libexec/electron+} \
                      [--c.electronVersion=27.2.3-]{+-c.electronVersion=27.3.2+}
                    
                    runHook postBuild
                    
                ''
                    installPhase=''
                    runHook preInstall
                    
                    mkdir -p "$out/share/lib/drawio"
                    cp -r dist/*-unpacked/{locales,resources{,.pak}} "$out/share/lib/drawio"
                    
                    install -Dm644 build/icon.svg "$out/share/icons/hicolor/scalable/apps/drawio.svg"
                    
                    makeWrapper [-'/nix/store/whnnjxsc9vw18266qis3inak8qv32xc1-electron-27.2.3/bin/electron'-]{+'/nix/store/q58hmfnssd5jgnjbgmhbbh9l0148ggrx-electron-27.3.2/bin/electron'+} "$out/bin/drawio" \
                      --add-flags "$out/share/lib/drawio/resources/app.asar" \
                      --add-flags "\${NIXOS_OZONE_WL:+\${WAYLAND_DISPLAY:+--ozone-platform-hint=auto --enable-features=WaylandWindowDecorations}}" \
                      --inherit-argv0
                    
                    runHook postInstall
                    
                ''
              • The input derivation named `element-desktop-1.11.57` differs
                - /nix/store/4zsr4b58q68fs28z3g4a39x5phf786ba-element-desktop-1.11.57.drv:{out}
                + /nix/store/zhidy6fvafx89067w894gzbhxj9px14h-element-desktop-1.11.57.drv:{out}
                • The set of input derivation names do not match:
                    - electron-28.1.4
                    + electron-28.2.2
                • The environments do not match:
                    installPhase=''
                    runHook preInstall
                    
                    # resources
                    mkdir -p "$out/share/element"
                    ln -s '/nix/store/k2vlmwl317dhsbg7a33cijvcn4mqfqv8-element-web-1.11.57' "$out/share/element/webapp"
                    cp -r '.' "$out/share/element/electron"
                    cp -r './res/img' "$out/share/element"
                    rm -rf "$out/share/element/electron/node_modules"
                    cp -r './node_modules' "$out/share/element/electron"
                    cp $out/share/element/electron/lib/i18n/strings/en_EN.json $out/share/element/electron/lib/i18n/strings/en-us.json
                    ln -s $out/share/element/electron/lib/i18n/strings/en{-us,}.json
                    
                    # icons
                    for icon in $out/share/element/electron/build/icons/*.png; do
                      mkdir -p "$out/share/icons/hicolor/$(basename $icon .png)/apps"
                      ln -s "$icon" "$out/share/icons/hicolor/$(basename $icon .png)/apps/element.png"
                    done
                    
                    # desktop item
                    mkdir -p "$out/share"
                    ln -s "/nix/store/mhm99jlbwf9ky6akj9rpavl7kw0p15d7-element-desktop.desktop/share/applications" "$out/share/applications"
                    
                    # executable wrapper
                    # LD_PRELOAD workaround for sqlcipher not found: https://github.com/matrix-org/seshat/issues/102
                    makeWrapper [-'/nix/store/ky283mhcs2g0rg0ardi84g8nsb3cgkzm-electron-28.1.4/bin/electron'-]{+'/nix/store/h1fall2vff9pdh8i32xqn5bg7yni6vvx-electron-28.2.2/bin/electron'+} "$out/bin/element-desktop" \
                      --set LD_PRELOAD /nix/store/bv4iyj4n6n882spb97k3cgmfw444hyhp-sqlcipher-4.5.5/lib/libsqlcipher.so \
                      --add-flags "$out/share/element/electron" \
                      --add-flags "\${NIXOS_OZONE_WL:+\${WAYLAND_DISPLAY:+--ozone-platform-hint=auto --enable-features=WaylandWindowDecorations}}"
                    
                    runHook postInstall
                    
                ''
              • The input derivation named `geogebra-6-0-794-0` differs
                - /nix/store/pzd26j8ad16z3c5lz4h7dnzs58ji6b30-geogebra-6-0-794-0.drv:{out}
                + /nix/store/mhn0df1xxr32v91ringwvnlxkpf1bifi-geogebra-6-0-794-0.drv:{out}
                • The set of input derivation names do not match:
                    - electron-27.2.3
                    + electron-27.3.2
                • The environments do not match:
                    installPhase=''
                    mkdir -p $out/libexec/geogebra/ $out/bin
                    cp -r GeoGebra-linux-x64/{resources,locales} "$out/"
                    makeWrapper [-/nix/store/whnnjxsc9vw18266qis3inak8qv32xc1-electron-27.2.3/bin/electron-]{+/nix/store/q58hmfnssd5jgnjbgmhbbh9l0148ggrx-electron-27.3.2/bin/electron+} $out/bin/geogebra --add-flags "$out/resources/app"
                    install -Dm644 "/nix/store/wphn8arphwxp89mdqsddkcn5pbzaf48y-geogebra.desktop/share/applications/"* \
                      -t $out/share/applications/
                    
                    install -Dm644 "/nix/store/v5jiy93vy2nmzccynclqrhhi72qqca8r-geogebra-logo.svg" \
                      "$out/share/icons/hicolor/scalable/apps/geogebra.svg"
                    
                ''
# ... omitting the rest that seems unrelated ...

So, to me it seems like the “culprits” are Electron, Drawio, Element and GeoGebra? Is that correct, or am I misinterpreting the output here?

First off, there’s nothing wrong with using systemctl stop nixos-upgrade to stop the build from hogging your CPU if this happens again in the future. Nix builds and NixOS updates are atomic (assuming you use boot rather than switch, which I’d recommend configuring for unattended updates anyway), so stopping midway can’t break the update process or anything.

Not necessarily. Given the build times, yes, you probably built chromium+electron like 4 times, but the mere fact that the derivations differ doesn’t tell us much; that’s what should happen if there was an update. Normally, Nix should have downloaded these from the binary cache, which should have been filled with binaries built by Hydra.

This is unless the derivations you’re building somehow differ from the upstream ones. This can happen for a variety of reasons:

  1. You explicitly override/change something in the package in your configuration.
    • You’d probably know if you did this
  2. You explicitly override/change one of the dependencies of the package in your configuration.
    • You’re more likely to have overlooked something like that
  3. You’re using an overlay to import unstable packages and weren’t careful about how the packages are evaluated.
  4. Hydra had not yet managed to complete the build when you updated
    • You’re using the correct branches as your flake inputs, so this is unlikely, unless:
      • You’re also importing a nixpkgs from a fetchurl somewhere (I’ve seen that often enough)
      • You’re not actually using flakes but just building a configuration.nix and your channels are wrong
    • It may also happen if a Hydra build fails, but this is also unlikely given you have multiple different versions of electron in there

And probably others that don’t immediately come to mind, so it’s hard to know what exactly is going wrong without seeing your full config.

As for restricting updates to the binary cache: not realistically with current Nix features, at least to my knowledge. NixOS contains lots of little mini-packages, including the activation script that actually installs the new generation. These fundamentally need to be built locally, because they are specific to your configuration. Disabling all local builds would make it impossible to build your system.

Nix doesn’t really have a way to distinguish between builds by size yet; it’s impossible for it to know whether a specific package is going to take 2 seconds or 11 hours. So… just catch it when this happens and kill the build with systemctl stop. That said, when your configuration is correct this should never actually happen.

I’m very much a noobix who recently switched distributions and experienced the same problem. I assumed the nixpkgs channels would only roll forward on a successful Hydra build, and that any updates would then use those cached binaries unless overridden, effectively making the cache equivalent to repositories on other distributions.

But when I looked on Hydra, the electron package had failed to build, and the channel presumably still got updated. So the autoUpgrade pulled the channel updates with references to that electron derivation, for which there was no cached binary, and that triggered a local build. The electron package now appears to have built successfully, so updates are now pulling the cached binary.

I’m sure I must be wrong, but it seems that the channel update (I’m on nixos-23.11) isn’t transactional with a full successful build, and the channel can advance before dependent packages have built successfully. That potentially makes the autoUpgrade option a big local overhead.

Ah, yep, that’s probably what happened in this case. Updates will not be stalled indefinitely if individual builds fail (we don’t want to wait 11+ hours to push security updates because chromium is failing to build), though in general the channel/branch will only advance once Hydra has finished. This was probably an unlucky one-off.

Thank you very much for your thorough reply and your suggestion, @TLATER! Because this never happened before, I didn’t think about systemctl stop nixos-upgrade. I’ll do that in the future, if necessary.

My config for automatic updates looks like this:

  /* ---- SYSTEM MAINTENANCE ---- */
  # Automatic Upgrades
  system.autoUpgrade = {
    enable = true;
    flake = "/<path>/<to>/flake.nix";
    flags = [
      "--update-input"
      "nixpkgs"
      "--recreate-lock-file"
      "--commit-lock-file"
    ];
    allowReboot = false;
    dates = "Saturday 09:00";
    randomizedDelaySec = "15min";
  };

It should be fine, right?

That’s why it completely caught me by surprise: so far I haven’t used overrides or overlays of any kind, nor functions like fetchurl or fetchFromGitHub.
All of the programs are installed by just adding the package name either to the environment.systemPackages or the home.packages list or via the option programs.<app_name>.enable = true. And everything is pulled from the 23.11 stable branch.
The only two exceptions are 1Password and Spotify:

a) 1Password gets installed by

programs._1password-gui = {
  enable = true;
  polkitPolicyOwners = [ "<my_username>" ];
  package = pkgs-unstable._1password-gui;
};

where pkgs-unstable is defined in my system flake (referring to nixpkgs-unstable from the inputs section mentioned above):

let
  /* ---- PKGS ---- */
  pkgs = import nixpkgs {
    inherit system;
    config = { allowUnfree = true; };
  };
  pkgs-unstable = import nixpkgs-unstable {
    inherit system;
    config = { allowUnfree = true; };
  };
in {

b) Spotify is installed via Home Manager with

home.packages = with pkgs; [ ... ] ++ (with pkgs-unstable; [ spotify ]);

With this setup, I didn’t expect to have to build anything locally. At least now I know what to do if it happens again at an inconvenient time…

On a side note (and somewhat related to the original question), I now wonder whether, for simplicity’s sake, I should also get 1Password and Spotify from the stable branch like everything else and get rid of the nixpkgs-unstable input completely. Are there any advantages in changing this, like reduced overall disk usage, reduced chances of having to compile something again or reduced risk of breakage?

Good point, considering Electron seems to be the cause for my build as well.

Thanks for this clarification!

Yes. Nix packages all have their own dependency trees; only if two different packages happen to use the exact same version of a dependency will it end up in the same place and thereby be de-duplicated.

If you use unstable and stable, any unstable packages will have a full second copy of everything from glibc up. This can be a lot of data, especially for GUI packages like that.

You also have less breakage on stable, by definition, since only non-breaking changes should be merged into stable (with the occasional very specific exception). Fewer updates also mean less chance of build failures causing Hydra to skip a package.

That said, those things aren’t really that severe. Disk space is relatively cheap, breakage actually having user-visible effects is rare, and it’s not particularly common for builds to fail (and even if they do, chromium is kind of a worst case; most builds you would barely even notice).

The downside of stable is that you won’t always have the latest and greatest; sometimes your packages will end up 6 months “out of date” (though security updates are backported).

As a heads-up, defining pkgs with import nixpkgs { ... } is a bit of an anti-pattern. Prefer nixpkgs.legacyPackages.${system} where possible.
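
A minimal sketch of what that could look like, assuming a flake with a single nixpkgs input. The system value and the example output are placeholders for illustration, not taken from the config above:

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-23.11";

  outputs = { self, nixpkgs, ... }:
    let
      # Assumption: adjust to the platform of the actual machine.
      system = "x86_64-linux";

      # Reuse the package set the nixpkgs flake already exposes instead of
      # evaluating a fresh copy with `import nixpkgs { ... }`.
      pkgs = nixpkgs.legacyPackages.${system};
    in {
      # Illustrative output only; a NixOS flake would define
      # nixosConfigurations here instead.
      packages.${system}.default = pkgs.hello;
    };
}
```

Note that legacyPackages is evaluated with the default nixpkgs config, so unfree packages are not available through it.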

To allow unfree packages in your main nixpkgs, use the NixOS option nixpkgs.config.allowUnfree instead. Or better yet, use allowUnfreePredicate so you know when you’re about to install something unfree.
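
As a rough sketch, a NixOS module using the predicate might look like this. The names in the list are assumptions and have to match what lib.getName returns for the packages you actually want to allow:

```nix
{ lib, ... }:
{
  # Allow only the listed unfree packages in the package set that NixOS
  # instantiates itself; any other unfree package still fails evaluation.
  nixpkgs.config.allowUnfreePredicate = pkg:
    builtins.elem (lib.getName pkg) [
      # Assumption: example names only, replace with your own.
      "1password"
      "spotify"
    ];
}
```

This only affects the main package set. A separately imported nixpkgs, such as a pkgs-unstable defined in the flake, keeps whatever config it was imported with.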

For any additional nixpkgs instances, or if you use unfree packages outside your NixOS config, you can use this crutch until flakes are extended in some way to allow unfree packages properly: numtide/nixpkgs-unfree on GitHub (nixpkgs with the unfree bits enabled).

I forget the details, but iirc using legacyPackages will cut evaluation times a bit with the eval cache. Not terribly important for a leaf flake like yours probably is, but it’s good to stick to best practices so you’re already doing it right where they are important.

Yep, your autoUpgrade config looks good, though I’d still suggest changing operation to boot. That way your running system won’t suddenly end up with different graphics libraries that make your GUI applications stop working.

If you fully reboot at least daily, you otherwise won’t notice a difference between switch and boot.
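
A minimal sketch of that change, assuming the rest of the system.autoUpgrade block stays as it is:

```nix
{
  system.autoUpgrade = {
    enable = true;
    # Build the new generation and make it the boot default, but do not
    # activate it in the running system (the module default is "switch").
    operation = "boot";
  };
}
```

With this setting the upgrade only takes effect on the next reboot, so nothing changes underneath the running session.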

From what you’ve shown, your config is probably fine, but I still can’t judge without seeing the exact trace from your inputs to your applications that use chromium and whatnot. But given Hydra hiccuped, it’ll probably be that and not your config.

Thanks again for your in-depth explanation, @TLATER! Much appreciated! I will implement your suggestions in my config.