opened 01:57AM - 08 Mar 23 UTC
feature
**Is your feature request related to a problem? Please describe.**
I think just about everyone who writes nix derivations has tripped over this problem at least once:
- You change a source url in a fetcher (perhaps indirectly, by changing another field such as `version`), but forget to change the hash.
- The build silently succeeds using the previous data, and then the problem shows up down the line, either because you find you're not actually using the version you think you are, or because the derivation fails to build on another machine later.
This problem should be conquerable, without any real compromises. It's just going to take some work.
**Describe the solution you'd like**
Nix should accept an extra field in fixed-output derivations (FODs) containing a string, opaque to nix, whose hash is stored in the nix database and associated with the output store path when the FOD builds successfully. If the hash of the extra field in the FOD being evaluated does not match any of the field hashes stored for that path in the nix database, nix will not blindly reuse the store object by name. It will run the FOD instead, adding the new field hash to the database if it succeeds. Nix *can* throw away the result of the FOD and reuse the existing store path, but only after running it to verify that it produces an output with the same hash as what is already present.
If this field is not specified, then no entry is produced in the nix database, nor are the existing ones checked, just like the current behavior. This is important for certain fetchers such as `requireFile`. Similarly, it's important to modify the various prefetching commands to produce the relevant nix database associations, so that the fetchers they are meant to be compatible with will reuse the prefetched objects without complaint.
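To make the proposed decision procedure concrete, here is a minimal sketch in Python. It is a toy model, not nix code: `StoreDb`, `realise`, `field_hash`, the returned labels, and the use of SHA-256 are all hypothetical names and choices made for illustration, and `run_fod` stands in for actually running the fetcher and comparing its output against the declared output hash.

```python
import hashlib


def field_hash(identity: str) -> str:
    """Hash of the opaque identity string (SHA-256 is an arbitrary choice here)."""
    return hashlib.sha256(identity.encode("utf-8")).hexdigest()


class StoreDb:
    """Toy stand-in for the nix database: store path -> set of identity-field hashes."""

    def __init__(self):
        self.present = set()    # store paths that currently exist
        self.field_hashes = {}  # store path -> set of recorded field hashes

    def realise(self, path, identity, run_fod):
        """Decide how to obtain `path` for a FOD.

        identity: the extra field, or None if the FOD does not set it.
        run_fod:  callable returning True if the fetch produced the declared hash.
        Returns "reused", "verified", or "built"; raises RuntimeError on mismatch.
        """
        if path in self.present:
            if identity is None:
                # Field absent: same as today, reuse by name without checks.
                return "reused"
            h = field_hash(identity)
            if h in self.field_hashes.get(path, set()):
                # This identity already produced this path once; safe to reuse.
                return "reused"
            # Unknown identity: run the FOD to verify before trusting the path.
            if not run_fod():
                raise RuntimeError("output hash mismatch for " + path)
            self.field_hashes.setdefault(path, set()).add(h)
            return "verified"
        # Path not in the store: an ordinary build.
        if not run_fod():
            raise RuntimeError("output hash mismatch for " + path)
        self.present.add(path)
        if identity is not None:
            self.field_hashes.setdefault(path, set()).add(field_hash(identity))
        return "built"
```

In this model, changing the url while forgetting to change the hash lands in the "unknown identity" branch, so the stale hash is caught on the machine where the mistake was made rather than later on another one.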
FODs that wish to use this feature must choose a string that identifies the source well enough for this check. For most FODs, this seems quite straightforward. `pkgs.fetchurl`, for example, should use the url itself for this field (probably prefixed with `fetchurl:` or something, just to avoid any accidental collision between different fetchers). For some FODs, it might make more sense to serialize a data structure of some kind into the string. The choice does not have to be 100% rock solid, as it only comes into play when nix expression writers make a mistake. A 99% solution is plenty here.
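As a rough illustration of how a fetcher might derive such a string, the sketch below prefixes the fetcher name and, for structured inputs, serializes them deterministically. The function name `fetcher_identity`, the JSON encoding, and the `fetchgit` field names are assumptions made for the example; a real implementation would live in the nixpkgs fetchers and could use any stable encoding.

```python
import json


def fetcher_identity(fetcher: str, fields) -> str:
    """Build an identity string: '<fetcher>:' followed by the identifying data.

    A plain string (such as a url) is used as-is; a dict is serialized with
    sorted keys so that attribute order does not change the result.
    """
    if isinstance(fields, str):
        return fetcher + ":" + fields
    return fetcher + ":" + json.dumps(fields, sort_keys=True, separators=(",", ":"))


# Simple case: the url itself identifies the source.
url_id = fetcher_identity("fetchurl", "https://example.org/foo-1.0.tar.gz")

# Structured case: several fields together identify the source.
git_id = fetcher_identity("fetchgit", {"url": "https://example.org/foo.git", "rev": "abc123"})
```

The prefix keeps two fetchers that happen to take the same url from sharing an identity, which is the accidental collision mentioned above.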
This design should ensure bi-directional backward compatibility: new expressions on an old version of nix simply set an environment variable which is not used, and a new nix evaluating old expressions acts identically because the field is not present (assuming we name it something that avoids collision well enough). New prefetchers should also work with old expressions, since the extra database entry won't hurt anything. However, old prefetchers may not work correctly for new expressions: the FOD will be re-run, wasting resources, but it will still succeed if it should have done so.
The interaction with substituters needs some more thought. Somehow, the associated field hashes need to be communicated from the cache to the substituting nix instance, before substitution, so that it can determine whether it's acceptable to use the store object from the cache or not. I don't have a specific plan for how to communicate this, but it seems like a clearly solvable problem.
Most nix expression writers would never need to know about this change at all, as they simply use nixpkgs fetchers, and those could implement this internally. The problem would simply be fixed.
**Describe alternatives you've considered**
- Including additional information in the `name` field of FODs instead. This causes the store paths of derivations that depend on the FOD to change when the download source is changed, which is undesirable.
- Using the input hash of the FOD as the identifying information. This is overly specific, however, causing frequent spurious redownloads of, for example, `fetchurl`s when a new version of `curl` comes out (or even just a new version of one of its dependencies).
- Various less comprehensive heuristic solutions might be able to manage to warn the user when a possibly unintended reuse is taking place, but I believe a more comprehensive solution that "just works" is far more appropriate.
**Priorities**
Add :+1: to [issues you find important](https://github.com/NixOS/nix/issues?q=is%3Aissue+is%3Aopen+sort%3Areactions-%2B1-desc).