A friend of mine recently made me aware of https://github.com/NixOS/rfcs/pull/133 which introduced git’s tree hashing scheme as a valid hashing mode in Nix that results in a sha1 hash.
We discussed this in the context of Robotnix where we need to lock thousands of git repositories as FODs.
Pre-fetching thousands of FODs (some of which are GiBs in size) is extremely time-consuming, which massively slows down the update process. My friend’s idea was that we could cheaply fetch the tree hash of the remote heads without needing to run nix-prefetch-git.
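To make the idea concrete: a git commit object names its tree hash on its first line, so once the commit object is available the tree hash can be read without touching any file contents. The following is a minimal Python sketch under that assumption. The function names `tree_of_commit` and `tree_of_local_rev` are made up for illustration, and the sketch does not show how to get the commit object from a remote without downloading the tree, which is the open part of the idea.

```python
import subprocess


def tree_of_commit(raw_commit: bytes) -> str:
    """Extract the tree hash from the raw body of a git commit object.

    The body of a commit object starts with a line 'tree <40 hex chars>'.
    """
    first_line = raw_commit.split(b"\n", 1)[0]
    kind, _, tree_hex = first_line.partition(b" ")
    if kind != b"tree" or len(tree_hex) != 40:
        raise ValueError("not a SHA-1 git commit object")
    return tree_hex.decode("ascii")


def tree_of_local_rev(repo_dir: str, rev: str) -> str:
    """Read the tree hash of `rev` from a repository that already has the commit.

    Hypothetical helper: assumes `git` is on PATH and the commit object
    is present locally in `repo_dir`.
    """
    raw = subprocess.run(
        ["git", "-C", repo_dir, "cat-file", "commit", rev],
        check=True,
        capture_output=True,
    ).stdout
    return tree_of_commit(raw)
```

Note that `git ls-remote` alone only reports commit hashes for the remote heads, so some step that retrieves the commit object itself would still be needed.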
We played around with the git-hashing experimental feature together yesterday and were able to create a FOD that hashed the tree à la git. It was trivial to make fetchgit use that mode too, so even fetching actual stuff from the internet works just fine.
What we weren’t able to do is use a known git tree hash as the FOD hash. I suspect this is due to FODs not actually being content-addressed but being the hash of some special derivation representation that also includes e.g. the derivation name.
Is there a way to make this idea work?
The RFC mentions fetching sources from Software Heritage as a mirror, but those must be FODs too, right? How would that work?
I noticed that it was being weird about $NIX_BUILD_TOP/tmp not existing but wanted to report that separately.
Indeed, it seems to be hashing $NIX_BUILD_TOP/tmp rather than $out. When I change $out, the hash doesn’t change, and when I change $NIX_BUILD_TOP/tmp, it does change. I’ll get to testing whether that corresponds with the git tree hash.
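For that kind of comparison, an independent reference value helps. Below is a minimal Python sketch of git's SHA-1 tree hashing for a directory on disk. The function names are hypothetical, and two things are assumptions rather than anything established in this thread: the sketch hashes empty directories as empty trees (git itself never records them), and it says nothing about how Nix's git hashing mode treats such cases.

```python
import hashlib
import os
import stat


def git_object_id(kind: str, payload: bytes) -> bytes:
    """Raw SHA-1 of a git object: header '<kind> <size>' + NUL + payload."""
    header = f"{kind} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).digest()


def git_tree_hash(path: str) -> bytes:
    """Raw SHA-1 tree hash of the directory at `path`, computed the way git does."""
    entries = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        raw_name = os.fsencode(name)
        st = os.lstat(full)
        if stat.S_ISDIR(st.st_mode):
            mode, oid = b"40000", git_tree_hash(full)
            # git orders directories as if their name ended in '/'
            sort_key = raw_name + b"/"
        elif stat.S_ISLNK(st.st_mode):
            # a symlink is stored as a blob containing its target
            mode = b"120000"
            oid = git_object_id("blob", os.fsencode(os.readlink(full)))
            sort_key = raw_name
        else:
            mode = b"100755" if st.st_mode & stat.S_IXUSR else b"100644"
            with open(full, "rb") as f:
                oid = git_object_id("blob", f.read())
            sort_key = raw_name
        entries.append((sort_key, mode, raw_name, oid))
    entries.sort(key=lambda e: e[0])
    payload = b"".join(
        mode + b" " + raw_name + b"\0" + oid for _, mode, raw_name, oid in entries
    )
    return git_object_id("tree", payload)
```

Calling `git_tree_hash(some_dir).hex()` on a checkout with the `.git` directory removed should give a value that can be compared with `git rev-parse 'HEAD^{tree}'` in the original repository, as long as the checkout contains no submodules or empty directories.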
Funnily, this allows you to produce a weird state where you have a non-deterministic FOD that does not fail before the --check determinism check because the hash “matches”.
I think that «weird states» don’t count when the feature is clearly not finished yet.
I guess your workaround could be called a «preview» or something, given that upgrading to a correctly working git hasher should be transparent (and doing a completely extra bunch of persistent writes is how everything is done with Nix anyway).
Yeah, I haven’t tried to push it for prime time yet (I went and finished a first version because of potential good interactions with the fetchers). But I am glad people are trying it out!
So that’s where the bug lived (I tried to find out where the actual hashing happens, looked at all the cross-references, decided that everything was happening somewhere else, and just black-boxed the diagnosis).
OK, I am going to be a little more optimistic. Please use this, and if we find no bugs, we should be ready to stabilize it.
There are changes/extensions I thought about, like a trick analogous to the “case-insensitive file unpack NARs hack” for unpacking submodules without getting a different tree hash, but they could always be a separate feature rather than a breaking change.