Detecting outdated URLs in Nixpkgs?

Dead links in Nixpkgs come up every now and then. Two examples:

Although, as you can see, those were fixed pretty quickly, this still raises a question: how do we deal with these dead links?

https://cache.nixos.org has these fixed-output derivations cached, so casual users who have only overridden some configuration options or patches do not notice when these links go dead. It seems to me that it only becomes a noticeable problem for those using Nix with a store directory other than /nix/store, which is definitely a minority.

But even though this problem might not affect the majority of the users, it does mean the quality of Nixpkgs rots as links become outdated.

Is there anything we could do about these links beyond manual checking? For example, can we periodically check fixed-output derivations to see if they still build fine? Or maybe we can track calls to fetchurl and log all URLs and hashes, and re-download and check them?
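To make the "re-download and check" idea concrete, here is a minimal Python sketch. It assumes we already have a list of (URL, expected SHA-256) pairs; how that list is extracted from Nixpkgs is left open. It also assumes the hashes are hex-encoded and cover the plain file contents, as with flat fetchurl outputs; Nixpkgs often stores hashes in Nix's base32 encoding, so a real tool would need a conversion step. The names `check_sources` and `fetch_over_http` are made up for this illustration.

```python
import hashlib
import urllib.request
from typing import Callable, Dict, Iterable, Tuple


def fetch_over_http(url: str, timeout: float = 30.0) -> bytes:
    """Download a URL and return its body; network failures raise OSError."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def check_sources(
    sources: Iterable[Tuple[str, str]],
    fetch: Callable[[str], bytes] = fetch_over_http,
) -> Dict[str, str]:
    """Map each URL to "ok", "dead", or "hash mismatch"."""
    report = {}
    for url, expected_sha256_hex in sources:
        try:
            body = fetch(url)
        except OSError:
            # urllib's URLError and HTTPError are both OSError subclasses.
            report[url] = "dead"
            continue
        actual = hashlib.sha256(body).hexdigest()
        if actual == expected_sha256_hex.lower():
            report[url] = "ok"
        else:
            report[url] = "hash mismatch"
    return report
```

Passing the fetcher in as a parameter keeps the check testable without network access, and a periodic job could simply file a report for every entry that is not "ok".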


There is tarballs.nixos.org, which is a content-addressable mirror used automatically by fetchurl. Unlike cache.nixos.org it also works if you use a different store prefix.
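As a rough illustration of what "content-addressable" means here: as far as I understand fetchurl's builder script, it looks a file up on a hashed mirror under the hash algorithm and the expected hash rather than under the upstream file name. The Python sketch below only builds such a lookup URL. The exact path layout and the accepted hash encoding are assumptions on my part, and `hashed_mirror_url` is a hypothetical helper, not part of Nixpkgs.

```python
def hashed_mirror_url(
    hash_algo: str,
    output_hash: str,
    mirror: str = "https://tarballs.nixos.org",
) -> str:
    """Build the lookup URL for a file identified only by its hash.

    Assumed layout: <mirror>/<hash algorithm>/<hash>.
    """
    return f"{mirror.rstrip('/')}/{hash_algo}/{output_hash}"
```

Because the lookup key is the hash alone, the mirror keeps working after the upstream URL disappears or the file is renamed.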

Note: the script that adds files to the mirror from the Nixpkgs stable branch was broken for the last few months, but it is running again now. (See "Tarball mirroring broken", issue #76 in NixOS/infra.)


I was just bitten by this boehm-gc/ncurses issue too.

For example, can we periodically check fixed-output derivations to see if they still build fine?

I definitely want this! It’s not hard to determine which derivations are fixed-output (look for the hash), but I’m not sure how to make a list of all the derivations inside Nixpkgs. We can get a list of all the top-level paths and try looking at them, plus every path.src, to see if they are fixed-output. I think that will catch most derivations.
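The "look for the hash" test could be sketched like this in Python, working on the JSON printed by `nix show-derivation`. I am assuming that a fixed-output derivation shows a `hash` field on its output entry; the exact JSON layout differs between Nix versions, so treat the shape used here as an assumption. `fixed_output_drvs` is a name invented for this sketch.

```python
import json
from typing import Any, Dict, List


def fixed_output_drvs(show_derivation_json: str) -> List[str]:
    """Return the .drv paths that have an output with a fixed hash."""
    derivations: Dict[str, Any] = json.loads(show_derivation_json)
    return sorted(
        drv_path
        for drv_path, drv in derivations.items()
        # Assumed shape: {"outputs": {"out": {"path": ..., "hash": ...}}}
        if any("hash" in output for output in drv.get("outputs", {}).values())
    )
```

Feeding it the derivations of the top-level attributes and of their src attributes would give a first list of candidates to re-check.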

There is tarballs.nixos.org, which is a content-addressable mirror used automatically by fetchurl.

Good to know that exists and I hope we can add some monitoring to it!

tarballs.nixos.org indeed helps with non-/nix/store builds. Does the mirroring script also monitor for dead links? If not, maybe we could implement some periodic checking of tarball URLs?

Approaching this problem the other way around, we need to ensure that every fixed hash refers to valid content. Hashes are easy to find, since there are only a few ways to encode them. We cannot check the validity of URLs per se; that’s also why we have several mirrors for some of them. But we should remove old, irretrievable hashes. That would also help find non-reproducible fixed-output derivations, like the ones sprouting up around Gradle, for example.
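A rough Python sketch of the "hashes are easy to find" part: scan Nix source text for `sha256 = "..."` strings. It assumes only two encodings, 64 hex digits or 52 characters of Nix's base32 alphabet (which omits e, o, t and u); SRI-style hashes and other attribute names are not covered. `find_sha256_hashes` is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# Nix base32 uses the digits plus the lowercase letters except e, o, t, u.
NIX_BASE32_CLASS = "0-9a-df-np-sv-z"
HASH_RE = re.compile(
    r'sha256\s*=\s*"([' + NIX_BASE32_CLASS + r']{52}|[0-9a-f]{64})"'
)


def find_sha256_hashes(nix_source: str) -> List[Tuple[int, str]]:
    """Return (line number, hash) for each sha256 string in the text."""
    hits = []
    for match in HASH_RE.finditer(nix_source):
        # Line numbers are 1-based and refer to where `sha256` starts.
        line_number = nix_source.count("\n", 0, match.start()) + 1
        hits.append((line_number, match.group(1)))
    return hits
```

Matching on the whole text rather than line by line also catches a hash whose string literal was wrapped onto the line after `sha256 =`.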

Of course, this is less automatable than finding src = attributes, but could lead to some good practices, like adding some comments before hashes. Example of what such annotations could look like in factorio/default.nix:

   # NB `experimental` directs us to take the latest build, regardless of its branch;
   # hence the (stable, experimental) pairs may sometimes refer to the same distributable.
   binDists = {
     x86_64-linux = let bdist = bdistForArch { inUrl = "linux64"; inTar = "x64"; }; in {
       alpha = {
         # build with `pkgsx86Linux.factorio-stable.src`
         stable        = bdist { sha256 = "0b4hbpdcrh5hgip9q5dkmw22p66lcdhnr0kmb0w5dw6yi7fnxxh0"; version = "0.16.51"; withAuth = true; };
         # build with `pkgsx86Linux.factorio.src`
         experimental  = bdist { sha256 = "0b4hbpdcrh5hgip9q5dkmw22p66lcdhnr0kmb0w5dw6yi7fnxxh0"; version = "0.16.51"; withAuth = true; };
       };
     ...
     };
     i686-linux = let bdist = bdistForArch { inUrl = "linux32"; inTar = "i386"; }; in {
       alpha = {
         # build with `pkgsi686Linux.factorio-stable.src`
         stable        = bdist { sha256 = "0nnfkxxqnywx1z05xnndgh71gp4izmwdk026nnjih74m2k5j086l"; version = "0.14.23"; withAuth = true; nameMut = asGz; };
       };
     };
   };