Managing Nixpkgs fetcher pins without forgot-to-change-the-hash hazards

The question

Is there a good way of managing pins passed to Nixpkgs fetchers (fetchurl, fetchFromGitLab, etc)?

Context

Nix has several built-in fetchers (e.g., fetchTarball). These perform the fetch at evaluation time, which is often undesirable compared to fetching at build time. Supplying a hash to these fetchers is optional (but I always do, on reproducibility principles).

Nixpkgs has several more fetchers (e.g., fetchurl, fetchFromGitLab). Happily, these do the fetches at build time, making them preferable to use where possible. These fetchers always require a hash.

There are some other differences between the two types of fetcher, but evaluation- versus build-time is the only one I really care about.

One feature shared by both, however, is that the built-in fetchers’ cache and the derivations produced by Nixpkgs fetchers are invalidated only if the hash changes, not if the fetched URL changes. This creates a footgun: you can change the URL (e.g., to fetch a new version of a dependency) but forget to change the hash, and then tear your hair out wondering why things aren’t working. It is also a reproducibility hazard, because things work differently depending on the state of the cache/store. I have already been bitten by this while manually managing pins; it was a terrible experience, and I have no intention of repeating it if I can at all avoid it.
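To make the hazard concrete, here is a minimal sketch of what the mistake looks like. The owner, repo and tags are hypothetical, and lib.fakeHash is only a placeholder standing in for the real hash of the old tag.

```nix
# Sketch only: owner, repo and rev are hypothetical.
{ pkgs ? import <nixpkgs> { } }:

let
  # Placeholder standing in for the real hash of tag v1.0.0.
  hashOfV1 = pkgs.lib.fakeHash;
in
pkgs.fetchFromGitLab {
  owner = "example-owner";
  repo = "example-repo";
  rev = "v2.0.0";   # bumped from "v1.0.0" ...
  hash = hashOfV1;  # ... but the hash was left untouched
}
```

With a real hash in place of the placeholder, the output store path depends only on the derivation name and the hash, so a machine that already has the v1.0.0 source in its store would reuse it without fetching, while a machine with a cold store would fetch v2.0.0 and fail with a hash mismatch.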

For built-in fetchers, using npins seems to be the state of the art in avoiding this footgun while managing pinned dependencies: the tooling makes sure you never change the URL without also changing the hash. However, npins does not (presently) support Nixpkgs fetchers.
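For reference, consuming npins-managed pins typically looks something like the sketch below. It assumes `npins init` has been run in the same directory and that a pin named nixpkgs exists; adjust the names to whatever pins you actually have.

```nix
# Sketch only: assumes an ./npins directory generated by `npins init`.
let
  sources = import ./npins;            # attribute set of pinned sources
  pkgs = import sources.nixpkgs { };   # the pin is fetched at evaluation time
in
pkgs.hello
```

The URL and hash live together in the generated npins files and are only ever rewritten by the tool, which is what removes the opportunity to update one without the other.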

Other context and related issues:

The question, again

So, is there a good way of managing pins passed to Nixpkgs fetchers (fetchurl, fetchFromGitLab, etc)?

By ‘good’, I mean not subject to the update-the-URL-only footgun above, and which ideally doesn’t require building new tooling.

I can immediately see three options:

  1. Tough it out and just don’t make mistakes while doing things manually (yeah, right).

  2. Build tooling for managing pins passed to Nixpkgs fetchers (which might well be a patch to npins) and always use that instead of open-coded fetcher invocations.

  3. Tolerate unnecessary evaluation-time fetching and do everything with built-in fetchers and npins (which, once caches warm up, may be fine, but that cold-cache behaviour will be recurrently annoying), again always avoiding open-coded fetches.

None of these are particularly great. I would love to be told that I am wrong in this analysis and there is some fourth option.

nvfetcher does a similar job to npins, but uses nixpkgs fetchers.


AFAIK there are two options:

  • put (part of) the URL in the pname of the fetchurl call, e.g. using the version parameter
  • output the URL as part of the fetch derivation, then check it in a separate derivation that you use as the actual source
    • the big drawback is that the hash does not match the file anymore…
    • a smaller drawback is that you get two derivations instead of one
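The first option can be sketched roughly as follows. The helper name fetchurlNamedByUrl is hypothetical, and it uses the name argument (rather than pname) with a short digest of the URL, so that any change to the URL changes the store path and forces a real fetch. lib.fakeHash is a placeholder for the real hash.

```nix
# Sketch only: fetchurlNamedByUrl is a hypothetical helper, not part of Nixpkgs.
{ pkgs ? import <nixpkgs> { } }:

let
  fetchurlNamedByUrl = args:
    pkgs.fetchurl (args // {
      # Prefix the usual file name with 8 hex digits of the URL's SHA-256,
      # so the derivation name (and hence the store path) tracks the URL.
      name = "${builtins.substring 0 8 (builtins.hashString "sha256" args.url)}-${baseNameOf args.url}";
    });
in
fetchurlNamedByUrl {
  url = "mirror://gnu/hello/hello-2.12.tar.gz";
  hash = pkgs.lib.fakeHash; # placeholder: replace with the real hash
}
```

Because the store path of a fixed-output derivation is derived from its name and hash, a changed URL with a stale hash then misses the store, triggers a download, and fails loudly with a hash mismatch instead of silently reusing the old file.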

A very quick and dirty way to do the second option:

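let
  # Wraps fetchurl so that the URL is stored next to the download and re-checked on use.
  checkedFetchurl = args:
    let
      # Fixed-output fetch whose output is a directory: the file plus the URL it came from.
      # Because the output is a directory, `hash` is the recursive hash, not the file's hash.
      src = fetchurl (args // {
        downloadToTemp = true;
        recursiveHash = true;
        postFetch = ''
          mkdir -p $out
          echo '${args.url}' > $out/.url
          cp $downloadedFile $out/source
        '';
      });
    in
    # Ordinary derivation that embeds the URL, so changing the URL forces this check to rerun.
    pkgs.runCommand (builtins.baseNameOf args.url) {} ''
      if [ "$(cat ${src}/.url)" != "${args.url}" ]; then
        echo "Error: URL mismatch" >&2
        exit 1
      fi
      ln -s ${src}/source $out
    '';
in
stdenv.mkDerivation rec {
  pname = "hello";
  version = "2.12";

  src = checkedFetchurl {
    url = "mirror://gnu/${pname}/${pname}-${version}.tar.gz";
    hash = "sha256-oyuj33EoLjkbfx2vyAz6XVudSGuk2fLcRdDm5JbY470=";
  };
}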

The hacky hack at the bottom of “Nixpkgs fetchers · Issue #124 · andir/npins · GitHub” might be interesting to you.