Content-addressed Nix − call for testers

Hmm, I had high hopes for this suggestion, but it doesn’t seem to have changed anything for me. I just tried running my main build with the change, and I still got it conking out with:

$ nix build [...]
error (ignored): error: closing file descriptor 18: Bad file descriptor
error: substitution of 'sha256:20545e6f79997c892085ecb5f7181b9b2b6130f2d96ff89d1c8962ffe82fc261!out': read failed: Bad file descriptor

Any other thoughts for what could be going wrong here?

It’s worth trying again with an increased limit; I can’t imagine there’s much harm in doing so. I thought to increase the limit based on this where ::close() fails with EBADF, which I suspected was caused either by hitting the FD limit or by a race condition. Increasing the limit fixed the issue for me, so I didn’t look into the latter, but it does seem like the more likely scenario.
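
For reference, a minimal sketch of checking and raising the open-file limit for the current shell, assuming the limit in question is the per-process file descriptor limit reported by ulimit -n. The target of 4096 is an arbitrary illustrative value. If the failing process is the Nix daemon rather than your own nix client, the daemon's limit is the one that matters and has to be raised wherever the daemon is started.

```shell
# Show the current soft and hard limits on open file descriptors.
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "open-file limit: soft=$soft hard=$hard"

# Illustrative target only; pick whatever suits your build.
target=4096

# Raise the soft limit for this shell session if it is below the target.
# This fails if the hard limit is lower than the target.
if [ "$soft" != "unlimited" ] && [ "$soft" -lt "$target" ]; then
  ulimit -Sn "$target" || echo "could not raise limit to $target" >&2
fi
```

The change only applies to this shell and the processes started from it.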

Excited to try this out! However I just added experimental-features = ca-derivations to my /etc/nix/nix.conf and rebooted, and I’m not seeing that the feature is actually enabled:

❯ cat /etc/nix/nix.conf
build-users-group = nixbld
trusted-users = root ubuntu
experimental-features = ca-derivations

❯ nix show-config | grep experimental
experimental-features = nix-command

❯ nix-shell -p nix-info --run 'nix-info -m'
 - system: `"x86_64-linux"`
 - host os: `Linux 5.13.0-1019-aws, Ubuntu, 20.04.4 LTS (Focal Fossa)`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.6.1`
 - channels(ubuntu): `"home-manager, nixgl"`
 - channels(root): `"nixpkgs"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixpkgs`

What am I doing wrong here?

Maybe you have experimental-features = nix-command set in your user’s nix.conf (~/.config/nix/nix.conf)? That would override the global setting.
If that’s the case, you can keep both by setting extra-experimental-features instead, which appends to the list rather than overwriting it.
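
A minimal sketch of that layout, assuming the global file holds the feature you want everywhere and the per-user file only adds to it (file locations as mentioned in this thread):

```
# /etc/nix/nix.conf (global)
experimental-features = ca-derivations

# ~/.config/nix/nix.conf (per-user): appends instead of replacing
extra-experimental-features = nix-command
```

With this, nix show-config | grep experimental should list both features.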

Indeed that’s the issue! This seems like a significant footgun to me. There’s no warning that settings are getting overridden or anything.

Anyways, thanks for the pointer!

I am trying to build the flake below, but I get:

$ nix build --extra-experimental-features 'ca-derivations'
warning: Using saved setting for 'store = /home/artturin/ca-flake/ca-store' from ~/.local/share/nix/trusted-settings.json.
error: experimental Nix feature 'ca-derivations' is disabled; use '--extra-experimental-features ca-derivations' to override
(use '--show-trace' to show detailed location information)
$ nix --version
nix (Nix) 2.8.0pre20220322_d5d4d98

{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/master";
  };

  nixConfig = {
    store = "/home/artturin/ca-flake/ca-store";
  };

  outputs = { self, nixpkgs }:
  {

    nixosConfigurations.vm = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        ({ pkgs, lib, ... }: {
          nixpkgs.config.contentAddressedByDefault = true;

          users.mutableUsers = false;
          users.users.root = {
            password = "root";
          };
          users.users.user = {
            password = "user";
            isNormalUser = true;
            extraGroups = [ "wheel" ];
          };
        })
      ];
    };
    # So that we can just run 'nix run' instead of
    # 'nix build ".#nixosConfigurations.vm.config.system.build.vm" && ./result/bin/run-nixos-vm'
    packages.x86_64-linux.default = self.nixosConfigurations.vm.config.system.build.vm;
    apps.x86_64-linux.default = {
      type = "app";
      program = "${self.packages.x86_64-linux.default}/bin/run-nixos-vm";
    };
  };
}

I had to add ca-derivations to the global config for it to work.

Jep, IIRC the daemon needs to have it enabled too, and it won’t automatically enable it when running the CLI. The error message is confusing, unfortunately.

In my case I’m on macOS Monterey with an M1 processor, and have set:

 experimental-features = nix-command
 extra-experimental-features = flakes

I’m nix-building a different project but seeing the bad file descriptor error, with no mention of too many open files though. Could it be related to this setting?

I was just giving ca-derivations a try, and after adding experimental-features = ca-derivations, I had to add the outputHashAlgo stuff, and after that I discovered that there is no out path (well, there is one, but it’s “”). Eventually I need to know the out path in order to make a link to it! I noticed that nix-store -r will print it out, and it makes sense that you can no longer predict it in advance, but is there a programmatic way to know what it is, aside from running nix-store -r and looking at its stdout? I foresee another problem with that, which is that nix-store -r with multiple drv inputs doesn’t give any guarantee about the order of outputs (that I know of), which means I can’t tell which output corresponds to which input.

Also, should this stuff be updated on Ca-derivations - NixOS Wiki? I see some references to some bugs and fixes but I’d be interested to know if I have to upgrade to a particular version of nix beyond 2.4 to get those, or if there are still outstanding known problems.

Here is my failed attempt to become a remote adventurer:

; nix shell \
  --experimental-features 'nix-command' \
  --store /tmp/my-ca-nix \
  --trusted-public-keys '' \
  --substituters https://cache.ngi0.nixos.org/ \
  /nix/store/yvk5yl9fid0zlxqk1xvvzn787d8gbh00-emacs-27.2 \
  -c emacs --version
error: hash mismatch importing path '/nix/store/dlq98171jblxqcigqhldh64mcra9bxld-attr-2.4.48';
         specified: sha256:1wa8sg66qzbj2mx4p3kkv3gci3hvwvarxxc9398pwgc1gcfxvcqq
         got:       sha256:0rj3imwbccrvs4w4nwsmcybfhxgyr4w98az6if6bm1xbxkqd6sas
error: some references of path '/nix/store/yvk5yl9fid0zlxqk1xvvzn787d8gbh00-emacs-27.2' could not be realised
; nix --version
nix (Nix) 2.9.1

Any ideas what I am doing wrong?

Nothing actually, I did.
The work on CA derivations exposed a bug in the way the NAR hashes were computed, and we had to fix it, which changed the hashes of the CA drv outputs (so very old stuff built before the fix, like that one, is not valid anymore). I’ve re-pushed the closure with the correct hashes, so once all the caches are cleared it should be better.

Or (and I’ll update the first post for that) you can try a more freshly built emacs with the correct hashes like /nix/store/ih1ish76pdmzcqbdcdd09z007f6bxjrf-emacs-28.1

Thanks. With updated hashes, it worked for me.

I am getting the following for every jobset on a CA Hydra. Does anyone have any idea?

in job ‘bind’:
error: not an absolute path: 'ARRAY(0x5781bc0)/3f5cjwlb77y0zgk82i20d41spr5368wl-microvm-cloud-hypervisor-bind.drv'

I’m having trouble getting ca-derivations to work with a remote cache. Specifically, they never hit the cache. From looking at the HTTP logs, it looks like they need to do an extra query like /realisations/sha256:d0cba4146da895178856db309b41c3a10a25acbbaa905469960605608887e409!out.doi, presumably to get the drv->out mapping, but those never get uploaded.

From the nix source, it looks like registerDrvOutput is supposed to upload them. It’s called in many spots, but the most promising is store-api.cc copyPaths, since that is what happens when you upload via nix copy --to .... A bit of debug logging and it looks like it only does the register when the RealisedPath::raw has a Realisation, but they all seem to have OpaquePath instead. nix realisation info --json seems to indicate that both CA and non-CA out paths are opaquePath. The comment implies OpaquePath is non-derived stuff like nix add-to-store but that’s clearly not the case because derived paths have it too.

Before I dig further, is it documented (or can someone explain?) how those /realisations/.. are supposed to be uploaded, so I can figure out why it’s not happening?

Also, lately we’re seeing a lot of segfaults from nix, sometimes accompanied by double free or corruption (out), sometimes not. Also some error: coroutine has finished. I don’t know if it’s related to enabling ca-derivations or just coincidence, maybe someone else has noticed something like this?

thanks!

Yes, that’s something that dreadfully needs to be documented.

Indeed, the realisations are copied by nix copy. Passing a store path to nix copy will always treat it as an opaque path (for the simple reason that you can have several realisations for the same path, and Nix can’t know which one you’re interested in), but you can pass it any installable, and if that installable evaluates to a derivation, then Nix will copy both the outputs of that derivation and the associated realisations. In particular, you can nix copy /nix/store/...-foo.drv to copy both the output paths of that derivation and the realisations that produced them.

If you’re using a post-build-hook to copy stuff to the binary-cache, you can access $DRV_PATH in it to do that (like it’s done in the nix test suite).
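
A minimal sketch of such a hook, assuming a Nix version from around the time of this thread, where passing a .drv path to nix copy pushes its outputs along with their realisations. DRV_PATH is set by Nix when it runs the hook; CACHE_URL and the push_build function name are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Sketch of a post-build-hook that pushes a build to a binary cache.

push_build() {
  # CACHE_URL is a hypothetical variable naming the destination store.
  [ -n "${CACHE_URL:-}" ] || { echo "CACHE_URL is not set" >&2; return 1; }
  # DRV_PATH is provided by Nix to post-build hooks.
  [ -n "${DRV_PATH:-}" ] || { echo "DRV_PATH is not set" >&2; return 1; }
  # Pass the derivation rather than the output paths, so the
  # realisations are copied together with the outputs.
  nix copy --to "$CACHE_URL" "$DRV_PATH"
}

# Only push when Nix has invoked us as a hook (DRV_PATH is set).
if [ -n "${DRV_PATH:-}" ]; then
  push_build
fi
```

Copying "$OUT_PATHS" instead would treat each path as opaque and skip the realisations, which matches the cache misses described above.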

Mh… There used to be some issues with that a couple releases ago (the coroutine one in particular was very symptomatic of a specific bug), but IIRC these got fixed, and I don’t recall anything like that recently

Oh, I think this got it for me! I think there are some more details, which also explains why it didn’t work for me even though I already do upload the drv. Let me know if this is accurate:

Back in the old days, nix copy /nix/store/path.drv copied the drv. Now, it does not copy the drv, but copies its outputs, along with the “doi” link from drv->outs. nix copy --derivation now copies only the .drv, but not the outs and not the link.

If that is the case, then my upload code should now pass only the drv to nix copy, but run it both with and without the --derivation flag.

BTW, is there some documentation about what an “installable” is? This seems to be a new term that all the man pages assume I know about.

[ segfaults ]

Hmm sounds like upgrading to 2.8 or 2.9 may resolve this then. Thanks!

Hmm there’s one other detail I didn’t think about. That is, when you do a remote build, even though the remote builder is building using the drv, somehow the drv file itself does not wind up in the builder’s /nix/store. I don’t know why this is and it surprised me when I discovered it, but it is the case. This means that builders can no longer upload their outputs directly, but they have to copy the outputs back to the coordinator, which then has to upload, since only the coordinator has the drv. This is workable, but less efficient, especially since the major upload cost is compression, and we have (as is probably typical) small coordinators and big builders. So, another thing to be aware of when switching to ca-derivations.

Actually the real reason I was uploading from builders is that a remote build bug truncates build logs around 8k and so the complete log is only available on the builder to upload, but that’s a whole other thing…

That’s indeed a bummer. Technically there’s no need to have the drv to copy stuff around, but the cli interface doesn’t allow anything else afaik.

Technically you can probably save most of the bandwidth by copying the output paths from the builders and the realisations from the coordinator. That’s a bit more convoluted than what it should be of course :confused:

The below is taking 387GB of memory (previously 225GB) according to macOS, and rising:

> nix store make-content-addressed --all
warning: dumping very large path (> 256 MiB); this may run out of memory

If my machine weren’t a maxed-out MacBook Pro (it’s an M1), this would have crashed long ago. I’m on nix (Nix) 2.10.3. Assuming this will fail, does anyone have any hints for debugging what’s going on? (Or is it simply expected to fail when run on the entire store at once?)