Content-addressed Nix − call for testers

I was just giving ca-derivations a try. After adding experimental-features = ca-derivations, I had to add the outputHashAlgo stuff, and after that I discovered that there is no out path (well, there is one, but it’s “”). Eventually I need to know the out path in order to make a link to it! I noticed that nix-store -r will print it out, and it makes sense that you can no longer predict it in advance, but is there a programmatic way to know what it is, aside from running nix-store -r and looking at its stdout? I foresee another problem with that: nix-store -r with multiple drv inputs doesn’t give any guarantee about the order of outputs (that I know of), which means I can’t tell which output corresponds to which input.
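One way around the ordering concern, as a rough sketch: realise each derivation separately, so every path that nix-store -r prints can be attributed to the .drv that was passed in. The function name realise_each and the REALISE override (there only so the loop can be dry-run without Nix) are made up for illustration; only nix-store -r itself is taken from the question.

```shell
# Sketch: realise derivations one at a time and print "<drv><TAB><output>".
# REALISE is a hypothetical override for dry runs; it defaults to `nix-store -r`.
realise_each() {
  local drv outs out
  for drv in "$@"; do
    # nix-store -r prints the realised output path(s) on stdout, one per line.
    outs=$(${REALISE:-nix-store -r} "$drv") || return 1
    printf '%s\n' "$outs" | while IFS= read -r out; do
      # Skip empty lines so a silent command does not yield a bogus pair.
      if [ -n "$out" ]; then printf '%s\t%s\n' "$drv" "$out"; fi
    done
  done
}
```

Called as realise_each ./a.drv ./b.drv, this prints one tab-separated pair per output path. It trades speed for an unambiguous mapping, since each derivation gets its own nix-store invocation.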

Also, should this stuff be updated on the Ca-derivations page of the NixOS Wiki? I see some references to some bugs and fixes, but I’d be interested to know if I have to upgrade to a particular version of Nix beyond 2.4 to get those, or if there are still outstanding known problems.


Here is my failed attempt to become remote adventurer:

; nix shell \
  --experimental-features 'nix-command' \
  --store /tmp/my-ca-nix \
  --trusted-public-keys '' \
  --substituters https://cache.ngi0.nixos.org/ \
  /nix/store/yvk5yl9fid0zlxqk1xvvzn787d8gbh00-emacs-27.2 \
  -c emacs --version
error: hash mismatch importing path '/nix/store/dlq98171jblxqcigqhldh64mcra9bxld-attr-2.4.48';
         specified: sha256:1wa8sg66qzbj2mx4p3kkv3gci3hvwvarxxc9398pwgc1gcfxvcqq
         got:       sha256:0rj3imwbccrvs4w4nwsmcybfhxgyr4w98az6if6bm1xbxkqd6sas
error: some references of path '/nix/store/yvk5yl9fid0zlxqk1xvvzn787d8gbh00-emacs-27.2' could not be realised
; nix --version
nix (Nix) 2.9.1

Any ideas what I am doing wrong?


Nothing actually, I did.
The work on CA derivations exposed a bug in the way the NAR hashes were computed, and we had to fix it, which changed the hashes of the outputs of CA derivations (so the very old stuff built before the fix, like that one, is not valid anymore). I’ve re-pushed the closure with the correct hashes, so once all the caches are cleared it should be better.

Or (and I’ll update the first post for that) you can try a more freshly built emacs with the correct hashes like /nix/store/ih1ish76pdmzcqbdcdd09z007f6bxjrf-emacs-28.1


Thanks. With updated hashes, it worked for me.


I am getting the following for every jobset on a CA Hydra. Does anyone have any idea?

in job ‘bind’:
error: not an absolute path: 'ARRAY(0x5781bc0)/3f5cjwlb77y0zgk82i20d41spr5368wl-microvm-cloud-hypervisor-bind.drv'

I’m having trouble getting ca-derivations to work with a remote cache. Specifically, they never hit the cache. From looking at the HTTP logs, it looks like they need to do an extra query like /realisations/sha256:d0cba4146da895178856db309b41c3a10a25acbbaa905469960605608887e409!out.doi, presumably to get the drv->out mapping, but those never get uploaded.
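For checking by hand whether a cache serves such a file, the request path can be assembled from the pieces visible in the logs. This is only a sketch: realisation_url and has_realisation are hypothetical helper names, and the URL layout is inferred from the logged query rather than from any documentation.

```shell
# Build the URL a client appears to request for a realisation, following the
# pattern seen in the HTTP logs: <cache>/realisations/<drv hash>!<output>.doi
realisation_url() {
  # $1 = cache base URL
  # $2 = derivation hash, including the "sha256:" prefix
  # $3 = output name (defaults to "out")
  printf '%s/realisations/%s!%s.doi\n' "${1%/}" "$2" "${3:-out}"
}

# Succeeds if the cache answers that request without an HTTP error.
has_realisation() {
  curl --fail --silent --show-error --output /dev/null "$(realisation_url "$@")"
}
```

For example, has_realisation "$CACHE" "sha256:..." && echo present, where CACHE stands for the cache's base URL and the hash is copied from the log line.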

From the nix source, it looks like registerDrvOutput is supposed to upload them. It’s called in many spots, but the most promising is store-api.cc copyPaths, since that is what happens when you upload via nix copy --to .... A bit of debug logging and it looks like it only does the register when the RealisedPath::raw has a Realisation, but they all seem to have OpaquePath instead. nix realisation info --json seems to indicate that both CA and non-CA out paths are opaquePath. The comment implies OpaquePath is non-derived stuff like nix add-to-store but that’s clearly not the case because derived paths have it too.

Before I dig further, is it documented (or can someone explain?) how those /realisations/.. are supposed to be uploaded, so I can figure out why it’s not happening?

Also, lately we’re seeing a lot of segfaults from nix, sometimes accompanied by double free or corruption (out), sometimes not. Also some error: coroutine has finished. I don’t know if it’s related to enabling ca-derivations or just coincidence, maybe someone else has noticed something like this?

thanks!


Yes, that’s something that dreadfully needs to be documented.

Indeed, the realisations are copied by nix copy. Passing a store path to nix copy will always treat it as an opaque path (for the simple reason that you can have several realisations for the same path, and Nix can’t know which one you’re interested in), but you can pass it any installable, and if that installable is something that evaluates to a derivation, then Nix will copy both the outputs of that derivation and the associated realisations. In particular, you can nix copy /nix/store/...-foo.drv to copy both the output paths of that derivation and the realisations that produced them.

If you’re using a post-build-hook to copy stuff to the binary cache, you can access $DRV_PATH in it to do that (as is done in the Nix test suite).
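As a rough illustration of such a hook, assuming the behaviour described above (that nix copy given a .drv uploads its outputs together with their realisations, which may differ between Nix versions): the function name upload_build and the NIX override for dry runs are hypothetical, and the destination is whatever store URL you pass in. DRV_PATH is the variable Nix sets for post-build hooks.

```shell
# Sketch of the body of a post-build hook.
# NIX is a hypothetical override so the command can be dry-run (e.g. NIX=echo).
upload_build() {
  # $1 = destination store URL
  if [ -z "${DRV_PATH:-}" ]; then
    echo "DRV_PATH is not set; is this running as a post-build hook?" >&2
    return 1
  fi
  # Passing the .drv rather than the output paths is what lets nix copy
  # pick up the realisations as well, per the explanation above.
  ${NIX:-nix} copy --to "$1" "$DRV_PATH"
}
```

A hook script would then end with a call such as upload_build "$CACHE_URL", where CACHE_URL is a placeholder for your own cache.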

Mh… There used to be some issues with that a couple of releases ago (the coroutine one in particular was very symptomatic of a specific bug), but IIRC these got fixed, and I don’t recall anything like that recently.

Oh, I think this got it for me! I think there are some more details, which also explains why it didn’t work for me even though I already do upload the drv. Let me know if this is accurate:

Back in the old days, nix copy /nix/store/path.drv copied the drv. Now, it does not copy the drv, but copies its outputs, along with the “doi” link from drv->outs. nix copy --derivation now copies only the .drv, but not the outs and not the link.

If that is the case, then the upload code should now upload only the drv, but both with and without the --derivation flag.

BTW, is there some documentation about what an “installable” is? This seems to be a new term that all the man pages assume I know about.

[ segfaults ]

Hmm sounds like upgrading to 2.8 or 2.9 may resolve this then. Thanks!

Hmm there’s one other detail I didn’t think about. That is, when you do a remote build, even though the remote builder is building using the drv, somehow the drv file itself does not wind up in the builder’s /nix/store. I don’t know why this is and it surprised me when I discovered it, but it is the case. This means that builders can no longer upload their outputs directly, but they have to copy the outputs back to the coordinator, which then has to upload, since only the coordinator has the drv. This is workable, but less efficient, especially since the major upload cost is compression, and we have (as is probably typical) small coordinators and big builders. So, another thing to be aware of when switching to ca-derivations.

Actually the real reason I was uploading from builders is that a remote build bug truncates build logs around 8k and so the complete log is only available on the builder to upload, but that’s a whole other thing…

That’s indeed a bummer. Technically there’s no need to have the drv to copy stuff around, but the cli interface doesn’t allow anything else afaik.

Technically you can probably save most of the bandwidth by copying the output paths from the builders and the realisations from the coordinator. That’s a bit more convoluted than it should be, of course.

The below is taking 387GB of memory (225GB when I first checked) according to macOS, and rising:

> nix store make-content-addressed --all
warning: dumping very large path (> 256 MiB); this may run out of memory

If my machine weren’t a maxed-out MacBook Pro (it’s an M1), this would have crashed long ago. I’m on nix (Nix) 2.10.3. Assuming this will fail, does anyone have any hints for debugging what’s going on? (Or is it simply expected to fail when run on the entire store at once?)

That sounds like virtual memory, which does not really matter. Most Haskell programs have that set to 1TB.


The OOM killer got it eventually, but in mainline operation it fluctuated between 11GiB and 40GiB of real memory.

I’m getting an odd error, “incorrect output”:

$ nix store make-content-addressed --all
error: derivation '/nix/store/hi3j7zj1ig2x1qnva8csxmq4ry88gzym-hscolour-1.24.4.drv' has incorrect output '/nix/store/fmas9wlcibxc59c7dq7z6rhxjz57gfdg-hscolour-1.24.4-data', should be '/nix/store/v87cgbhgbxsmw2dzs0a3ysgd8hbcp4qd-hscolour-1.24.4-data'

nix store verify --all is happy though?

Also,

$ nix --version
nix (Nix) 2.11.0
$ nix show-derivation /nix/store/hi3j7zj1ig2x1qnva8csxmq4ry88gzym-hscolour-1.24.4.drv
error: path '/nix/store/hi3j7zj1ig2x1qnva8csxmq4ry88gzym-hscolour-1.24.4.drv' is not a valid store path

but the file exists, so presumably the store didn’t add it to the db?
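To tell those cases apart, something like the following could help. It is only a sketch: store_path_status and the CHECK override are invented names, and it assumes nix-store --check-validity exits non-zero for a path that is not registered in the store database.

```shell
# Classify a store path as missing on disk, present but not registered in
# the Nix database, or valid.
# CHECK is a hypothetical override for testing; it defaults to
# `nix-store --check-validity`.
store_path_status() {
  if [ ! -e "$1" ]; then
    echo missing
  elif ${CHECK:-nix-store --check-validity} "$1" >/dev/null 2>&1; then
    echo valid
  else
    echo unregistered
  fi
}
```

Running store_path_status on the .drv from the error message would print unregistered if the file is on disk but unknown to the database.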


I am having the exact same problem:

nix store make-content-addressed --all
error: derivation '/nix/store/7z63ndpzk3rvp3hhqp5mb15g7yn75mga-linux-headers-static-5.19.drv' has incorrect output '/nix/store/cz934gbnggvh1hi3gcfqw8gppgh38hrm-linux-headers-static-5.19', should be '/nix/store/viwsyykm4lninzhpi43gc7dv9d656ql4-linux-headers-static-5.19'

> nix show-derivation /nix/store/7z63ndpzk3rvp3hhqp5mb15g7yn75mga-linux-headers-static-5.19.drv
error: path '/nix/store/7z63ndpzk3rvp3hhqp5mb15g7yn75mga-linux-headers-static-5.19.drv' is not a valid store path
> exa -l /nix/store/7z63ndpzk3rvp3hhqp5mb15g7yn75mga-linux-headers-static-5.19.drv
.r--r--r-- 3,0k root  1 Jan  1970 /nix/store/7z63ndpzk3rvp3hhqp5mb15g7yn75mga-linux-headers-static-5.19.drv

I have tested config.contentAddressedByDefault = true; in my flake based setup on three x86_64 machines last week and reverted again today. My experience was mixed:

So all in all I was really happy to see it worked, but in its current form I cannot use it in production.


I get the following error with config.contentAddressedByDefault = true;

error: opening file '/nix/store/naf68a3znxdc8zpwba1prwiyjqz7z09v-bash-5.1-p16.drv': Too many open files
corrupted double-linked list

This went away after a few nixos-rebuilds. Now I am crashing with

error: derivation '/nix/store/zpc7fx6jkycw57jnigvi7b7dgyckcdnl-flac-1.4.1.drv' doesn't have expected output 'bin' (derivation-goal.cc/resolvedFinished,realisation)

I had this hundreds of times. Just retry the build and it should hang at the next output not found.

Now I am failing with

error: creating pipe: Too many open files
corrupted double-linked list