Patched openssh pkg is constantly being rebuilt

A while ago I noticed that my OpenSSH often has to be rebuilt when I make completely unrelated changes to my NixOS configurations.

TL;DR

  • openssh with a single patch added
    • not overridden globally, just programs.ssh.package
  • patch is stored in flake’s git repo
    • patch store path somehow depends on flake’s store path, but also not
  • constantly have to rebuild this package
    • but it should already be on system or at least on binary cache

My setup
I don’t use nixos-rebuild and instead use:

nix build --eval-store auto --store ssh-ng://"<user>@<hostname>" \
.#nixosConfigurations.<name>.config.system.build.toplevel

The remote store is either the target server or a build server when compiling for a low-power server. All servers have access to the binary cache (Minio S3 bucket with anonymous RO access) that my GitLab CI pushes to.
All store paths for the main branch of my repo are guaranteed to be on that binary cache because everything gets built by the CI before merging. Most of the servers are also automatically deployed by the main branch pipeline.

The code
no public repo, sorry
In my flake, I have an overlay that adds an openssh-mptcp package, which is just openssh with a patch for MPTCP. I then set this as programs.ssh.package.
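For illustration, wiring the overlay package into the system could look roughly like this. This is a minimal sketch, assuming the overlay is already applied to the system's pkgs so that pkgs.openssh-mptcp exists; the module layout is hypothetical and not copied from my repo.

```nix
# Sketch only: assumes the overlay providing `openssh-mptcp` is already
# part of this system's nixpkgs instance.
{ pkgs, ... }:
{
  # Only the SSH program package is swapped out; `pkgs.openssh` itself
  # stays untouched, so other packages depending on it are not rebuilt.
  programs.ssh.package = pkgs.openssh-mptcp;
}
```

As far as I know, services.openssh.package defaults to programs.ssh.package, so the server side would pick up the patched package as well.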

The overlay generally uses infuse.nix, but I’m not fully making use of it for this particular package yet, as per the TODO:

# TODO: figure out how to do this with infuse
openssh-mptcp.__init =
  if final.stdenv.hostPlatform.isLinux
  then
    prev.openssh.overrideAttrs (self: super: {
      patches =
        super.patches
        or []
        ++ (cleanPatches [
          ./openssh-${self.version}-mptcp-support.patch
        ]);
    })
  else prev.openssh;

(I maintain the patch for multiple versions as the overlay is applied to multiple NixOS releases).

What I already tried
Because the patch files are stored in the flake’s repo, I considered that making changes to the flake may also change the store path of the patch, thereby changing the openssh-mptcp package’s store path even though the content of the patch hasn’t changed.

When I looked into it a while ago, I could see via nix repl that the store path of the MPTCP patch in the derivation’s patches attribute did change when making unrelated changes in the flake. That’s because the patch path is relative to the flake’s store path, which changes whenever any Git-tracked file is modified, added, or deleted.

So I added a cleanPatches function that ensures the patches get the same store path regardless of the flake’s store path:

cleanPatches = patches: map (patchPath: builtins.path {path = patchPath;}) patches;
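To make the difference concrete, here is a small sketch of the two ways of referring to the patch, meant to be evaluated from within the flake (for example in nix repl). The attribute names are made up for illustration, and the file name assumes the 10.2p1 version of the patch; the comments describe the shape of the result I would expect, not actual output.

```nix
# Sketch only: `viaFlakeSource` and `standalone` are illustrative names.
{
  # Plain path interpolation: within a flake this should resolve to a file
  # inside the flake's source store path, so the result changes whenever
  # any Git-tracked file in the flake changes.
  viaFlakeSource = "${./openssh-10.2p1-mptcp-support.patch}";

  # builtins.path copies the file into its own store path, which depends
  # only on the file's name and contents.
  standalone = "${builtins.path { path = ./openssh-10.2p1-mptcp-support.patch; }}";
}
```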

What I didn’t notice before is that after I remove the cleanPatches function again, even when the patch’s store path changes, the package’s drvPath and outPath remain the same (really!).

I suppose this might be some kind of flake-related magic. I have also never heard of flakes having to decouple source files or patches from their flake’s store path.

The hash changes if one of the build inputs changes (mainly version/their hash), so you likely are rebuilding this on each flake inputs update. To stop that behavior, you’ll want to pin the openssh package specifically that is used.

I don’t think that’s it.

Nixpkgs is pinned via flake.lock, and I would expect my patched OpenSSH package to be rebuilt whenever the upstream nixpkgs package is updated (even if only via its dependencies).

Also, updating flake.lock is a task I rarely do manually anymore since I configured Renovate to do this automatically against my GitLab repo every night.
Its MRs are only merged after CI has passed, which also means all NixOS system derivations - which should contain the package - are on the binary cache already when the updated flake.lock hits the main branch.

My confusion stems from the fact that OpenSSH gets rebuilt when I make local changes to the configuration that are unrelated to the package (for example, changing the config file of a particular service), especially when I build such a configuration on a server that already has the latest config (minus the unrelated changes) deployed.

Just had another rebuild of OpenSSH.

This time, I ran sudo nixos-rebuild switch --flake -L .#<hostname> on a machine that has a few uncommitted, yet unrelated changes to the flake, but is otherwise in sync with the main branch.

I looked up the package’s store path from nixosConfigurations.<hostname>.config.programs.ssh.package.outPath.
I was able to confirm that the path is:

  • on the binary cache:
    $ nix path-info --store https://minio.<domain>/nix-cache/ --json /nix/store/nl4ly35z9gixdz35lclw9fvjr2fvkasc-openssh-10.2p1
    {
      "/nix/store/nl4ly35z9gixdz35lclw9fvjr2fvkasc-openssh-10.2p1": {
        "ca": null,
        "compression": "zstd",
        "deriver": "/nix/store/v46wmfinwhsvi89gcc27pkm1xwxaaz1b-openssh-10.2p1.drv",
        "downloadHash": "sha256-eCCPBDb0ECmVI6g3SC+ocOsnbTx/ZMIFJGm0w2JEKKs=",
        "downloadSize": 2307850,
        "narHash": "sha256-GcGzxmGdtTuktQJ/OKqkl827qluaa147cxk4BbmEbMM=",
        "narSize": 9323744,
        "references": [
          "/nix/store/1kf1awzg5ag8cjd16dy346apr3jlf12v-ldns-1.8.4",
          "/nix/store/3n15dmd4y0dhplwbx655w4nj1imgr5z7-libfido2-1.16.0",
          "/nix/store/iwa7i46bbw0mnq7k7bfsrq5zcc781ab9-libedit-20251016-3.1",
          "/nix/store/k0wfscy8mjfzkhrzm4r6yy8bxs3v4s5w-openssl-3.6.0",
          "/nix/store/l6i35y2hlmdz0hvz690h3k4ilq9ahhzy-zlib-1.3.1",
          "/nix/store/nl4ly35z9gixdz35lclw9fvjr2fvkasc-openssh-10.2p1",
          "/nix/store/p711vqjvd78jz7q6ryzdp5jnnnb48s6j-linux-pam-1.7.1",
          "/nix/store/wjxpaix0cdpww0bldvzsq2d1bjc6g62b-glibc-2.40-66"
        ],
        "registrationTime": null,
        "signatures": [
          "<domain>:<something>"
        ],
        "ultimate": false,
        "url": "nar/1ar88iic7d394h2w4r3z7injgsvhm0plhdx84fajj47l6q28y83q.nar.zst"
      }
    }
    
  • not somehow marked as unavailable because of a previous fetch error (cleared ~/.cache/nix)
  • on the system (!!!), via nix build --no-link --print-build-logs <store path>, which immediately returned

My theories so far have been:

  • my binary cache is to blame
    • no, because the rebuild is happening even when the path is already on the system
    • sometimes, Minio stops responding to requests entirely and I have to restart
    • sometimes, when Nix is querying and pulling many paths for larger updates, it misses a few individual, random store paths. But the problem now is that it’s consistently rebuilding one particular package
  • nixos-rebuild somehow builds the system’s ssh package
    • no, because it’s also happening with a plain nix build
    • there is such a mechanism for the nix package: it is built first from the system configuration and then used to build the rest of the system
  • there are some dependencies on other outputs of the package
    • but these would also be in the system closure and therefore on the system or the binary cache
  • the output from nix path-info above shows the package has a reference to itself
    • the unpatched nixpkgs package on cache.nixos.org has the same. I think this is quite normal

If the store path of openssh is staying the same, then you probably need to set nix.settings.keep-outputs = true. This will stop the dev and man outputs from being garbage collected.

The dev output not being available (on the build system, the target host, or the CI cache) does seem to be the culprit.

The services.openssh module adds a config check derivation to system.checks that has the configured package in its nativeBuildInputs: nixpkgs/nixos/modules/services/networking/ssh/sshd.nix at d03088749a110d52a4739348f39a63f84bb0be14 · NixOS/nixpkgs · GitHub

This has two distinct problems:

  1. The derivations in system.checks are only build-time dependencies of the system derivation. They’re not added to the system closure, so they are not pushed to the binary cache and they get GC’ed from the systems they’re on (and I’ve got automatic GC enabled).
  2. The check derivation depends on the dev output of the package even though it only uses the sshd binary.

I could fix 1. either by setting nix.settings.keep-outputs = true in nix.conf (but that would disable GC’ing of build-time deps in general), or by adding system.checks to system.extraDependencies, which are added to the system closure.
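As a rough sketch, the two workarounds for problem 1 would look something like this in a NixOS module. I have not tested this; it assumes system.checks can be passed to system.extraDependencies as-is, and only one of the two options would be needed.

```nix
# Untested sketch of the two alternatives; pick one.
{ config, ... }:
{
  # Alternative A: keep the outputs of derivations that are still alive.
  # This affects garbage collection of build-time dependencies in general.
  # nix.settings.keep-outputs = true;

  # Alternative B: pull the outputs of the check derivations into the
  # system closure, so they are pushed to the binary cache and survive GC.
  system.extraDependencies = config.system.checks;
}
```

With alternative B, the check derivation's output would already be present, so Nix should not need to realise its build inputs (including the dev output) as long as the check derivation itself is unchanged.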

If 1. is fixed, the situation where I have to rebuild the dev output of the openssh package even though the regular out output is available would still arise if I make changes to the SSH configuration that would modify the check derivation.

So ideally 2. would be addressed (only feasible if done upstream). What would be the correct way to do this? Is it possible to only add a particular output to the nativeBuildInputs? Would injecting the full binary path using string interpolation work?
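To illustrate what I mean by string interpolation, here is a hypothetical sketch of such a check. It is not the actual nixpkgs code: sshdPackage and sshdConfigFile are made-up parameter names, and I have not verified that sshd runs like this in the build sandbox.

```nix
# Hypothetical sketch, not the code from sshd.nix.
{ pkgs, lib, sshdPackage ? pkgs.openssh, sshdConfigFile }:
pkgs.runCommand "check-sshd-config" { } ''
  # The interpolated path only references the output that contains
  # bin/sshd, instead of going through nativeBuildInputs.
  ${lib.getExe' sshdPackage "sshd"} -G -f ${sshdConfigFile} > /dev/null
  touch $out
''
```

The idea is that the derivation then only depends on the output that ships the binary, so a missing dev output would no longer trigger a rebuild of the package.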

(this should probably be a nixpkgs issue/PR at this point)

Use nix-diff on the two .drv files to see the differences between the derivations. That often helps to find out why things get rebuilt. Use nix-store --query --deriver to get the .drv for a built store path.
