Prevent (automatic) garbage collection for failed automated (re)builds

I have a number of machines that are allowed to auto-upgrade at night, with automatic gc scheduled for later in the night. That gives the system time to complete its rebuild before previous generations and the like are cleaned up.

On the happy path this works great, but since I’m using unstable nixpkgs, sometimes I get a large update followed by a build failure, and then everything that’s been built and downloaded so far gets garbage collected before I’ve had a chance to intervene.

It would be nice if, in those cases, either the scheduled gc were skipped until a successful build has happened, or gc roots were kept for whatever the rebuild has produced so far.
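A minimal sketch of the first option, assuming the upgrades are driven by the stock system.autoUpgrade module (which provides nixos-upgrade.service) and that nix-gc.service is still defined when nix.gc.automatic is off, which is how I read nixos/modules/services/misc/nix-gc.nix. The --delete-older-than value is only illustrative. The idea is to drop the gc timer and let systemd start the gc only after a successful upgrade:

```nix
{ ... }:
{
  # Assumption: nightly upgrades come from the stock autoUpgrade module.
  system.autoUpgrade.enable = true;

  # Keep the gc options, but do not schedule gc on its own timer.
  nix.gc = {
    automatic = false;
    options = "--delete-older-than 14d"; # illustrative value
  };

  # Start nix-gc.service only when nixos-upgrade.service finishes successfully.
  # OnSuccess= is a systemd [Unit] setting, passed through via unitConfig.
  systemd.services.nixos-upgrade.unitConfig.OnSuccess = "nix-gc.service";
}
```

With this, a failed upgrade never triggers gc, so the partial build results stay in the store until the next successful run. The trade-off is that gc no longer runs at all on nights when the upgrade fails or is skipped.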

I was thinking along the lines of changing the systemd service for garbage collection to first perform a check, comparing the output of nix build .#nixosConfigurations.<redacted>.config.system.build.toplevel --dry-run --json 2>/dev/null | jq -r '.[0].outputs.out' to /run/current-system, and only collecting garbage when the two match. If someone can suggest a better way, that would be appreciated.

This seems to work, but it feels a bit inelegant:

  systemd.services.nix-gc = lib.mkIf config.nix.enable {
    script = lib.mkForce ''
      # Store path this host's configuration would build to (dry run, nothing is built).
      dry_run="$(${config.nix.package.out}/bin/nix build /etc/nixos#nixosConfigurations.${config.networking.hostName}.config.system.build.toplevel --dry-run --json 2>/dev/null | ${lib.getExe pkgs.jq} -r '.[0].outputs.out')"
      # Store path of the system that is currently running.
      current_system="$(${lib.getExe' pkgs.coreutils "readlink"} -f /run/current-system)"
      if [ "''${dry_run}" = "''${current_system}" ]; then
        # from nixos/modules/services/misc/nix-gc.nix
        exec ${config.nix.package.out}/bin/nix-collect-garbage ${config.nix.gc.options}
      else
        echo "Automatic garbage collection delayed because ''${dry_run} != ''${current_system}"
      fi
    '';
  };

Running the gc during the upgrade would work, as an ongoing build creates temporary gc roots for the things it needs. The timing would be hard to get right, though.
