Hmm, I had high hopes for this suggestion, but it doesn’t seem to have changed anything for me. I just tried running my main build with the change, and I still got it conking out with:
$ nix build [...]
error (ignored): error: closing file descriptor 18: Bad file descriptor
error: substitution of 'sha256:20545e6f79997c892085ecb5f7181b9b2b6130f2d96ff89d1c8962ffe82fc261!out': read failed: Bad file descriptor
Any other thoughts for what could be going wrong here?
It’s worth trying again with an increased limit; I can’t imagine there’s much harm in doing so. I thought to increase the limit based on this, where ::close() fails with EBADF, which I suspected was caused either by hitting the FD limit or by a race condition. Increasing the limit fixed the issue for me, so I didn’t look into the latter, but it does seem like a more likely scenario.
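For anyone who wants to try the same thing, here is a rough sketch of checking and raising the open-file limit using only the standard ulimit builtin. The function names are made up for illustration. It only affects the current shell and its children, so if your builds go through nix-daemon, the daemon's own limit would have to be raised separately (how depends on your service manager, not shown here).

```shell
# Print the soft and hard limits on open file descriptors for this shell.
show_fd_limits() {
  echo "soft: $(ulimit -Sn)"
  echo "hard: $(ulimit -Hn)"
}

# Raise the soft limit up to the hard limit. No privileges are needed,
# because the soft limit may always be raised as far as the hard one.
raise_fd_limit() {
  ulimit -Sn "$(ulimit -Hn)"
}
```

Run show_fd_limits, then raise_fd_limit, then retry the build from the same shell.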
Excited to try this out! However I just added experimental-features = ca-derivations to my /etc/nix/nix.conf and rebooted, and I’m not seeing that the feature is actually enabled:
Maybe you have experimental-features = nix-command set in your user’s nix.conf (~/.config/nix/nix.conf)? That would override the global setting.
If that’s the case, you can keep both by setting extra-experimental-features instead to append to it rather than overwrite it.
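To see quickly which file is setting what, something like the following might help. It is only a sketch: find_feature_settings is a hypothetical helper name, and it just greps the files you hand it rather than asking Nix for the effective value.

```shell
# List every line that sets experimental-features or
# extra-experimental-features in the given nix.conf files.
# Files that do not exist are skipped.
find_feature_settings() {
  for conf in "$@"; do
    [ -f "$conf" ] || continue
    # The extra /dev/null argument makes grep print the file name
    # even though only one real file is searched.
    grep -nE '^[[:space:]]*(extra-)?experimental-features[[:space:]]*=' "$conf" /dev/null
  done
  return 0
}

# Usage:
#   find_feature_settings /etc/nix/nix.conf "$HOME/.config/nix/nix.conf"
```

If the user file shows up with a plain experimental-features line, changing that line to extra-experimental-features is the fix described above.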
$ nix build --extra-experimental-features 'ca-derivations'
warning: Using saved setting for 'store = /home/artturin/ca-flake/ca-store' from ~/.local/share/nix/trusted-settings.json.
error: experimental Nix feature 'ca-derivations' is disabled; use '--extra-experimental-features ca-derivations' to override
(use '--show-trace' to show detailed location information)
$ nix --version
nix (Nix) 2.8.0pre20220322_d5d4d98
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/master";
  };
  nixConfig = {
    store = "/home/artturin/ca-flake/ca-store";
  };
  outputs = { self, nixpkgs }:
    {
      nixosConfigurations.vm = nixpkgs.lib.nixosSystem {
        system = "x86_64-linux";
        modules = [
          ({ pkgs, lib, ... }: {
            nixpkgs.config.contentAddressedByDefault = true;
            users.mutableUsers = false;
            users.users.root = {
              password = "root";
            };
            users.users.user = {
              password = "user";
              isNormalUser = true;
              extraGroups = [ "wheel" ];
            };
          })
        ];
      };
      # So that we can just run 'nix run' instead of
      # 'nix build ".#nixosConfigurations.vm.config.system.build.vm" && ./result/bin/run-nixos-vm'
      packages.x86_64-linux.default = self.nixosConfigurations.vm.config.system.build.vm;
      apps.x86_64-linux.default = {
        type = "app";
        # Refer to the package defined just above (this flake defines no defaultPackage output)
        program = "${self.packages.x86_64-linux.default}/bin/run-nixos-vm";
      };
    };
}
Jep, IIRC the daemon needs to have it enabled too, and it won’t automatically enable it when running the CLI. The error message is confusing, unfortunately.
I was just giving ca-derivations a try, and after adding experimental-features = ca-derivations, I had to add the outputHashAlgo stuff, and after that I discovered that there is no out path (well there is one, but it’s “”). Eventually I need to know the out path in order to make a link to it! I noticed that nix-store -r will print it out, and it makes sense that you can no longer predict it in advance, but is there a programmatic way to know what it is, aside from running nix-store -r and looking at its stdout? I’m foreseeing another problem with that, which is that nix-store -r with multiple drv inputs doesn’t give any guarantee about the order of outputs (that I know of), which means I can’t tell which output corresponds to which input.
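One way to sidestep the ordering problem, sketched below, is to realise each drv in its own nix-store -r call, so every printed path can be attributed to the drv that was passed in. This is a workaround rather than a proper API, and it is presumably slower than a single call. realise_each and the REALISE variable are names invented for this sketch.

```shell
# Command used to realise a derivation; overridable for dry runs.
REALISE=${REALISE:-nix-store -r}

# Realise each .drv on its own and print "<drv><TAB><output path>" lines.
# A derivation with several outputs yields several lines.
# Note: a failing realise command is not detected by this sketch.
realise_each() {
  for drv in "$@"; do
    # $REALISE is intentionally unquoted so "nix-store -r" splits into words.
    $REALISE "$drv" | while IFS= read -r out; do
      printf '%s\t%s\n' "$drv" "$out"
    done
  done
}

# Usage:
#   realise_each /nix/store/...-foo.drv /nix/store/...-bar.drv
```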
Also, should this stuff be updated on Ca-derivations - NixOS Wiki? I see some references to some bugs and fixes but I’d be interested to know if I have to upgrade to a particular version of nix beyond 2.4 to get those, or if there are still outstanding known problems.
Nothing actually, I did.
The work on CA derivations exposed a bug in the way the NAR hashes were computed and we had to fix it, which changed the hashes of the CA drvs’ outputs (so the very old stuff built before the fix, like that one, is not valid anymore). I’ve re-pushed the closure with the correct hashes, so once all the caches are cleared it should be better.
Or (and I’ll update the first post for that) you can try a more freshly built emacs with the correct hashes like /nix/store/ih1ish76pdmzcqbdcdd09z007f6bxjrf-emacs-28.1
I’m having trouble getting ca-derivations to work with a remote cache. Specifically, they never hit the cache. From looking at http logs it looks like they need to do an extra query like /realisations/sha256:d0cba4146da895178856db309b41c3a10a25acbbaa905469960605608887e409!out.doi presumably to get the drv->out mapping, but those never get uploaded.
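In case it helps others check their own cache, here is a small sketch for probing whether a realisation file is present. The URL layout (cache root, then /realisations/, then the id with a .doi suffix) is taken purely from the request seen in the logs above, so treat it as an assumption. The function names and the example cache URL are hypothetical.

```shell
# Build the URL a client appears to request for a realisation.
#   $1: cache root URL
#   $2: realisation id, e.g. 'sha256:<hash>!out'
realisation_url() {
  cache=${1%/}   # drop one trailing slash, if any
  id=$2
  printf '%s/realisations/%s.doi\n' "$cache" "$id"
}

# Print the HTTP status code for that URL (needs curl).
check_realisation() {
  curl -s -o /dev/null -w '%{http_code}\n' "$(realisation_url "$1" "$2")"
}

# Usage (single quotes keep an interactive shell from touching the '!'):
#   check_realisation 'https://cache.example.org' 'sha256:<hash>!out'
```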
From the nix source, it looks like registerDrvOutput is supposed to upload them. It’s called in many spots, but the most promising is copyPaths in store-api.cc, since that is what happens when you upload via nix copy --to .... A bit of debug logging and it looks like it only does the register when the RealisedPath::raw has a Realisation, but they all seem to have OpaquePath instead. nix realisation info --json seems to indicate that both CA and non-CA out paths are opaquePath. The comment implies OpaquePath is non-derived stuff like nix add-to-store but that’s clearly not the case because derived paths have it too.
Before I dig further, is it documented (or can someone explain?) how those /realisations/.. are supposed to be uploaded, so I can figure out why it’s not happening?
Also, lately we’re seeing a lot of segfaults from nix, sometimes accompanied by double free or corruption (out), sometimes not. Also some error: coroutine has finished. I don’t know if it’s related to enabling ca-derivations or just coincidence, maybe someone else has noticed something like this?
Yes, that’s something that dreadfully needs to be documented.
Indeed, the realisations are copied by nix copy. Passing in a store path to nix copy will always treat it as an opaque path (for the simple reason that you can have several realisations for the same path, and Nix can’t know which one you’re interested in), but you can pass it any installable, and if that installable is something that evaluates to a derivation, then Nix will copy both the outputs of that derivation and the associated realisations. And in particular you can nix copy /nix/store/...-foo.drv to copy both the output paths of that derivation and the realisations that produced them.
If you’re using a post-build-hook to copy stuff to the binary-cache, you can access $DRV_PATH in it to do that (like it’s done in the nix test suite).
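A minimal sketch of such a hook, assuming the nix copy behaviour described above (passing the .drv copies its outputs together with their realisations). The cache URL is a placeholder and upload_drv_outputs is a made-up name. The script would need to be executable and referenced from the post-build-hook setting in nix.conf.

```shell
# Hypothetical cache URL; replace with your own binary cache.
CACHE_URL=${CACHE_URL:-s3://example-cache}

upload_drv_outputs() {
  # Pass the .drv rather than the output paths, so that the
  # realisations are copied along with the outputs.
  nix copy --to "$CACHE_URL" "$1"
}

# Nix exports DRV_PATH to the post-build hook; do nothing if the
# script is run outside of that context.
if [ -n "${DRV_PATH:-}" ]; then
  upload_drv_outputs "$DRV_PATH"
fi
```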
Mh… There used to be some issues with that a couple releases ago (the coroutine one in particular was very symptomatic of a specific bug), but IIRC these got fixed, and I don’t recall anything like that recently
Oh, I think this got it for me! I think there are some more details, which also explains why it didn’t work for me even though I already do upload the drv. Let me know if this is accurate:
Back in the old days, nix copy /nix/store/path.drv copied the drv. Now, it does not copy the drv, but copies its outputs, along with the “doi” link from drv->outs. nix copy --derivation now copies only the .drv, but not the outs and not the link.
If that is the case, then the upload code should now be given only the drv, but run nix copy on it both with and without the --derivation flag.
BTW, is there some documentation about what an “installable” is? This seems to be a new term that all the man pages assume I know about.
[ segfaults ]
Hmm sounds like upgrading to 2.8 or 2.9 may resolve this then. Thanks!
Hmm there’s one other detail I didn’t think about. That is, when you do a remote build, even though the remote builder is building using the drv, somehow the drv file itself does not wind up in the builder’s /nix/store. I don’t know why this is and it surprised me when I discovered it, but it is the case. This means that builders can no longer upload their outputs directly, but they have to copy the outputs back to the coordinator, which then has to upload, since only the coordinator has the drv. This is workable, but less efficient, especially since the major upload cost is compression, and we have (as is probably typical) small coordinators and big builders. So, another thing to be aware of when switching to ca-derivations.
Actually the real reason I was uploading from builders is that a remote build bug truncates build logs around 8k and so the complete log is only available on the builder to upload, but that’s a whole other thing…
That’s indeed a bummer. Technically there’s no need to have the drv to copy stuff around, but the cli interface doesn’t allow anything else afaik.
Technically you can probably save most of the bandwidth by copying the output paths from the builders and the realisations from the coordinator. That’s a bit more convoluted than what it should be of course
The below is taking 387GB of memory (up from 225GB) according to macOS, and rising:
> nix store make-content-addressed --all
warning: dumping very large path (> 256 MiB); this may run out of memory
If my machine weren’t a maxed out Macbook Pro (it’s an M1), this would have crashed long ago. I’m on nix (Nix) 2.10.3. Assuming this will fail, does anyone have any hints for debugging what’s going on? (or is it simply expected to fail when run on the entire store at once?)