Lazy loading of store paths?

Consider building derivation A on a remote builder, maybe as part of your CI, where A’s result mentions, and thus depends on, B, which is already present in your substituter’s store. Clearly, nix will have to download B onto the builder before building A.

But often, that is silly. If A is a nixos configuration, and B is the apache binary, then very likely the builder can assemble the nixos configuration without having the content of /nix/store/B – it just needs to know the name of that path, and that’s already known from the derivations. And even if the build needs some files of the dependency, it rarely needs all of them.

So it seems silly to have to copy all of B to the builder first. I see these kinds of unnecessary and time-consuming transfers a lot in the logs of, say, builds on garnix.io.

I am wondering about two variants to improve this:

  • Network-mounted /nix/store. Most network file systems (NFS, Glusterfs, etc.) are quite good at only loading store contents on demand.

    So in the case of a central nix store server and multiple builders around it, the builders could mount the full /nix/store of the central store read-only, and have an overlay FS on top of it for local modifications. This way, the complete store always appears to be available without extra work, and the build can start right away. If the build needs files from the dependencies, they are loaded individually and on demand (and then, if the file system is good, cached sensibly).

    Of course, care needs to be taken that the underlying store doesn’t change under the overlay fs. Rough idea: Use btrfs or zfs on the server and only export immutable snapshots to the builders.

  • A custom lazy-substitutor-fs, overlaying /nix/store (not necessarily /nix/var).

    This could be a FUSE-implemented filesystem, with a bit of integration with nix. When nix would otherwise realize a store path from a substituter, it would just tell the filesystem that this path is known to exist, and the filesystem would pretend it’s there. Only when some process actually accesses something within it would the filesystem download the NAR as usual and unpack it into the underlying real /nix/store, and everything proceeds normally from this point on.

    This wouldn’t support partial store paths, but the advantage is that it would work with ordinary nix caches. I am sure your nix GitHub Action building your nixos configuration downloads more from cache.nixos.org than it has to :)
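To make the control flow of the second variant a bit more concrete, here is a minimal Python sketch. It is a toy model, not a FUSE filesystem and not anything nix does today: `LazyStore`, `register`, `open_file` and `fake_fetch` are all made-up names, and the fetch hook merely stands in for "download the NAR and unpack it".

```python
import os
import tempfile
from typing import BinaryIO, Callable, Set


class LazyStore:
    """Toy model of lazy substitution (all names here are hypothetical)."""

    def __init__(self, root: str, fetch: Callable[[str, str], None]) -> None:
        self.root = root                 # stands in for the real /nix/store
        self.fetch = fetch               # assumed hook: download + unpack one NAR
        self.pending: Set[str] = set()   # paths promised but not yet fetched
        self.fetch_count = 0

    def _real(self, name: str) -> str:
        return os.path.join(self.root, name)

    def register(self, name: str) -> None:
        """What nix would call instead of substituting the path eagerly."""
        if not os.path.exists(self._real(name)):
            self.pending.add(name)

    def exists(self, name: str) -> bool:
        """Name-only queries are answered without any download."""
        return name in self.pending or os.path.exists(self._real(name))

    def open_file(self, name: str, relpath: str) -> BinaryIO:
        """First access to the contents triggers the fetch, later ones do not."""
        if name in self.pending:
            self.fetch(name, self._real(name))
            self.pending.discard(name)
            self.fetch_count += 1
        return open(os.path.join(self._real(name), relpath), "rb")


def fake_fetch(name: str, dest: str) -> None:
    """Stand-in for the real download: just writes one dummy file."""
    os.makedirs(os.path.join(dest, "bin"))
    with open(os.path.join(dest, "bin", "httpd"), "wb") as f:
        f.write(b"pretend this is apache\n")


store = LazyStore(tempfile.mkdtemp(), fake_fetch)
store.register("B-apache")
print(store.exists("B-apache"), store.fetch_count)   # known, nothing fetched yet
with store.open_file("B-apache", "bin/httpd") as f:
    f.read()                                         # this access does the fetch
print(store.fetch_count)
```

A build that only ever mentions the path would go through `register` and `exists` and never pay for the download, which is the nixos-configuration-plus-apache case from above.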

Did anyone try something along these lines already?

Here is one entrance to your rabbit hole: NixOS with shared nix store among compute nodes

The idea is that this would be enabled by Tvix and a builder lazily fetching only the (parts of) store paths that it needs while building.

We do have a FUSE filesystem, and partial substitution, and can boot VMs backed by it already, so it’s quite stable.

I spoke a bit about this at NixCon, but the builder parts are still very much in development (so far we only have a PoC).

If you’re interested, reach out!

In nixbuild.net, we do the lazy loading of nar files (or parts of nar files) during builds, by running a FUSE file system that fetches content on demand. We never do any “unpacking” of nar files in nixbuild.net; instead we create an index of the nar files to allow random access. This has been working really well for several years.

We don’t (yet) do lazy substitution, though. All substitutions of build inputs are done before a build is started, so we know we have the nar files available in our storage. However, we are working on supporting lazy substitution too, since it is fairly simple to add that to our current implementation. I will try to report back once we have the lazy substitution in place, to give you an idea on the kind of performance improvements it can bring.

That’s pretty exciting, looking forward to some statistics about this!