Speeding up similar builds with OCI layers

vandenoever · May 12, 2024, 8:16am

Whenever I’m writing a package with Nix, a lot of time is spent waiting for 1) evaluation of the expression and 2) rerunning the full list of commands in the derivation that results from the expression. E.g. a change in the installPhase requires a new unpacking and compilation. Since each of these commands is supposed to give the exact same result each time, time can be saved by taking a snapshot after each command.

Docker and Podman store the result of each command in a Dockerfile in a separate layer. When a Dockerfile is changed, the commands in it are only run from the point where the change happened.
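To make the comparison concrete, here is a minimal sketch of the caching idea in plain shell. It is an illustration only, not how Docker or Podman actually compute their cache: each step's key is the hash of the previous key plus the step's command text, so editing a step changes the key of that step and of everything after it. The function name layer_keys and the example commands are made up.

```shell
# Hypothetical sketch: chained cache keys, one per build step.
# key_n = sha256(key_{n-1} + command_n), shortened to 12 hex digits.
layer_keys() {
    key="base"
    for cmd in "$@"; do
        key=$(printf '%s\n%s\n' "$key" "$cmd" | sha256sum | cut -c1-12)
        printf '%s  %s\n' "$key" "$cmd"
    done
}

# Same first two steps, different last step: only the last key differs.
layer_keys "unpack src.tar.gz" "make" "make install"
layer_keys "unpack src.tar.gz" "make" "make install DESTDIR=/tmp/x"
```

A step whose key is already in the cache could be skipped; a step with a new key, and every step after it, has to run again.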

Nix has no such mechanism that I know of and I wonder why.

nim65s · May 12, 2024, 9:28am

I think that is because any change in the derivation will change its output path, and that output path could be used anywhere. For example, if you change only installPhase, there will be a change in $out, and if $out is used in patchPhase, we must redo this patchPhase, and can’t directly jump to installPhase.

vandenoever · May 12, 2024, 10:51am

If the variables are expanded, the first difference due to $out will usually be only in the installPhase. As a thought (or real) experiment, put the expanded commands seen with nix derivation show nixpkgs#firefox in a Dockerfile, then run that. If $out is used early, the developer will see the full rebuilds and has an incentive to move the use of $out to the end of the build.
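As a rough illustration of that experiment, the following shell sketch reports the first phase whose expanded script mentions the output path. The helper name first_phase_using_out, the name=script argument format and the store path are all hypothetical; real input would have to come from the expanded derivation.

```shell
# Hypothetical helper: print the first phase whose script text contains the
# output path. Usage: first_phase_using_out OUT_PATH name=script...
first_phase_using_out() {
    out_path=$1
    shift
    for pair in "$@"; do
        name=${pair%%=*}     # text before the first '='
        script=${pair#*=}    # text after the first '='
        case $script in
            *"$out_path"*) printf '%s\n' "$name"; return 0 ;;
        esac
    done
    return 1                 # no phase mentions the output path
}

# Placeholder store path, for illustration only.
first_phase_using_out /nix/store/abc-hello \
    "patchPhase=patch -p1 < fix.patch" \
    "buildPhase=make -j4" \
    "installPhase=make install PREFIX=/nix/store/abc-hello"
```

Every phase before the one reported here would be unaffected by a change of $out and could in principle be reused.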

polygon · May 12, 2024, 12:55pm

I’d be very much in favor of a lot more tools that assist the creators of Nix derivations. I know the pain of debugging the installPhase of a derivation and having to do pointless recompiles all the time. Snapshotting Nix builds would be a great feature during development. Some other pain points I encounter regularly:

Inspecting the Nix build folder is needlessly complicated. There is --keep-failed, which sometimes even seems to work. But why is it basically impossible to just say “please keep the build folder”, no matter if there was a failure or success? I even inserted infinite waits into my builds just so that I can inspect that damn folder.
It is very hard to get an interactive shell inside of a Nix builder, for reasons that fully elude me. If a genericBuild inside a devShell works but the actual Nix build does not, you usually have a big issue. I once had an issue where a part of the build would set up a filesystem watcher on some $HOME-subfolder and I had a lot of fun debugging that one. There are some tools that try to do that, but I had very limited success with these.

vandenoever · May 12, 2024, 7:07pm

With podman mount it’s possible to mount an OCI image to the filesystem to inspect it. If nix would (have an option to) create a layer like Docker does at each instruction, the edit-build cycles would speed up a lot and the created image could be mounted, always, to inspect the intermediate state.

Having a quick turnaround when editing .nix files like one has with Dockerfiles would make the writing of these files more bearable and simply a chore instead of an exercise in patience.

Atemu · May 12, 2024, 9:06pm

The conceptual issue with the approach you suggest is that a Nix build is usually entirely unlike a docker build. The only commonality is that they run pre-determined programs inside of a sandbox.

Docker build is snapshot based. You take the entire filesystem state from the root and snapshot/store it with all its metadata and everything after each and every command execution.

In a “standard” Nix build OTOH, builds are typically run in some temporary location that is not included in the output. Outputs are then explicitly installed into the output path (i.e. make install PREFIX=$out). All commands required to produce that output path are run inside the same sandbox instance.

Storing the intermediary state as Nix drvs is theoretically possible but runs into practical limitations:

There is no such concept as basing one step on the output of another as you cannot write to the other store paths. You’d have to copy the state from the previous output path (ro) to the new one (rw) for every command/phase run.
You cannot store arbitrary metadata in a nix store path. Mtimes, permissions etc. would all be stripped. This would already break make as it relies on mtimes to know which targets have and haven’t been reached.
You’d accumulate quite a few in-between derivations. A handful per actual build. They shouldn’t be in the runtime closure (though that is not guaranteed) but it still creates a lot of garbage to be cleaned up.
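The mtime point can be seen without Nix. The sketch below assumes GNU touch and a shell whose test supports -nt. It gives a source and an object file distinct mtimes, then sets both to the same instant, which is similar in effect to what happens to files that land in the store. The file names are made up.

```shell
# Demonstration with plain files (no Nix involved).
demo_dir=$(mktemp -d)
touch -d '2024-01-01 00:00:00' "$demo_dir/main.c"
touch -d '2024-01-01 00:00:05' "$demo_dir/main.o"

# Answer "is $1 newer than $2?" with yes or no.
newer() { if [ "$1" -nt "$2" ]; then echo yes; else echo no; fi; }

before=$(newer "$demo_dir/main.o" "$demo_dir/main.c")

# Normalise both mtimes to the same instant (one second after the epoch).
touch -d '@1' "$demo_dir/main.c" "$demo_dir/main.o"
after=$(newer "$demo_dir/main.o" "$demo_dir/main.c")

echo "object newer than source before normalising: $before"
echo "object newer than source after normalising:  $after"
rm -rf "$demo_dir"
```

Once all timestamps are equal, the ordering that make uses to tell finished targets from stale ones is gone.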

You’re not the first person to desire the ability to introspect failing builds. See e.g. Why is there no way to run `nix-shell` in a chroot and without the user's .bashrc? · Issue #903 · NixOS/nix · GitHub. I also recently read somewhere (I can’t remember where) about plans to migrate the Nix sandbox to bubblewrap, which would ease the implementation of such a feature.

Until then, manually running the phases inside of the package’s nix-shell gets you 90% of the way there. The only thing it can’t do is reproduce sandbox issues like undesired internet or filesystem access.

sorrel · May 12, 2024, 9:32pm

A tangential point: people reading this thread may be interested in breakpointHook, which lets you get a shell inside the sandbox if a build fails (or e.g. if you put a call to false at the point you want to inspect): breakpointHook | nixpkgs

I strongly agree that it would be useful to have some way of resuming from partway through a build, though. installPhase failures are hellish to debug.

vandenoever · May 13, 2024, 6:04am

The idea is to make nix build snapshot builds like docker/podman build does. In nix build the order is:

1. evaluate a Nix expression into a derivation
2. create a temporary sandbox
3. run the commands from the derivation
3.1 unpack
3.2 patch
3.3 build
3.4 install
4. copy the result into /nix/store

When 3.4 fails, the storage of the temporary container is deleted and the whole process has to start again. This can take very long.

To avoid redoing that work, a mechanism like podman/docker can be used for steps 2 and 3.

1. evaluate a Nix expression into a derivation
2. create a temporary sandbox with docker/podman storage
3. run the commands from the derivation and store the results in a layer e.g. in /var/lib/podman.
3.1 unpack into layer 1
3.2 patch into layer 2
3.3 build into layer 3
3.4 install into layer 4
4. copy the result into /nix/store

Now when the installPhase fails and the build needs to rerun, it does:

1. evaluate a Nix expression into a derivation
2. create the temporary sandbox with cached docker/podman storage from /var/lib/podman.
3. run the commands from the derivation and store the results in a layer (not in /nix/store)
3.1 skip to layer 1
3.2 skip to layer 2
3.3 skip to layer 3
3.4 install into a new layer 4
4. copy the result into /nix/store

The logic of step 3 is already implemented by podman and docker and explained here.

These intermediary states would not be stored in /nix/store. They would be stored in a separate build cache. That build cache indeed ages into garbage that needs to be cleaned up regularly. I think that’s a worthwhile trade off for saving a lot of time waiting on Nix running build commands again and again while developing an expression.
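The rerun logic above can be sketched in plain shell, without podman or docker. This is a toy under stated assumptions, not a proposal for the real implementation: the function name run_layered, the name=command argument format and the use of tar archives as "layers" are all made up, and the cache lives in an ordinary directory outside /nix/store.

```shell
# Hypothetical sketch of "skip to layer N".
# Usage: run_layered CACHE_DIR WORK_DIR name=command...
run_layered() {
    cache=$1
    work=$2
    shift 2
    key="base"
    for step in "$@"; do
        name=${step%%=*}
        cmd=${step#*=}
        # Chained key: depends on this command and on all earlier ones.
        key=$(printf '%s\n%s\n' "$key" "$cmd" | sha256sum | cut -c1-16)
        snap="$cache/$key.tar"
        if [ -f "$snap" ]; then
            echo "skip $name"
            # Simple but wasteful: restore the snapshot of every skipped step.
            rm -rf "${work:?}"
            mkdir -p "$work"
            tar -xf "$snap" -C "$work"
        else
            echo "run  $name"
            (cd "$work" && sh -c "$cmd") || return 1
            tar -cf "$snap" -C "$work" .
        fi
    done
}

cache=$(mktemp -d)
work=$(mktemp -d)
# First attempt: the install step fails, so no snapshot is stored for it.
run_layered "$cache" "$work" "unpack=echo src > main.c" "build=cp main.c main.o" "install=false" || echo "install failed"
# Second attempt with a fixed install step: unpack and build are skipped.
run_layered "$cache" "$work" "unpack=echo src > main.c" "build=cp main.c main.o" "install=mkdir out && cp main.o out/"
```

Because a failed step never gets a snapshot, the next run resumes right before it. Cleaning up old archives in the cache directory is left out here.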

lewo · May 13, 2024, 6:37am

The difficulty with your proposal is to know when to invalidate the cache. Each step of a derivation build shares a context, which is defined by your derivation.

For instance, you can define an environment variable in your derivation. This environment variable can be used during each of these phases but there is no explicit dependency between a particular phase and this environment variable. So, each time you change this environment variable, you need to invalidate all layers. This is the same behavior for buildInputs, compile flags, sources… In the end, you will almost always have to invalidate this new cache.

polygon · May 13, 2024, 8:11am

You are definitely not wrong, but this kind of issue is often present when caching build results and even tools like Make don’t always get it right (probably often due to the fault of the Makefile writers). If I start meddling with the build recipe and then resume a previously snapshotted build, I am fully aware that the result I get from that might not be the same as starting a fresh build with that recipe. I am aware that I will need to make another build from scratch at the end to make sure I didn’t screw anything up.

However, for me, the important thing is: I still may have fixed all the installPhase issues and the error accidentally introduced in a fraction of the time because an iteration now takes 1 minute instead of 60 where the project is senselessly recompiled.

This is supposed to be a dev-tool, not something you use in regular builds. It has to be useful and it has to save time, not be perfect. Please give me a hammer, don’t spend time making sure I can’t hit my thumb with it.

Atemu · May 13, 2024, 11:22am

I now understand your proposal better. You don’t want to build this on the existing mechanisms, you want a new mechanism specifically for the purpose of handling the “build state”.

A major problem with the proposal is that the phases (unpack, patch, build etc.) aren’t a real thing as far as Nix is concerned; their concept only exists within the stdenv. From Nix’s perspective, a drv is the execution of one program with a certain set of arguments and environment variables. The stdenv assembles a big script and that script is the program to be executed by Nix.
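A simplified imitation of that structure, for illustration only: this is not the real nixpkgs code, and the phase bodies and the name genericBuildSketch are made up. The phases are just shell functions called one after another in a single process, so state set by one phase is silently visible to the next.

```shell
# Made-up phase bodies; real phases do actual work.
unpackPhase()  { sourceRoot="hello-1.0"; echo "unpacked into $sourceRoot"; }
buildPhase()   { echo "building in $sourceRoot"; built=yes; }
installPhase() { echo "installing (built=$built)"; }

# One script, one process: the only thing Nix itself would get to see.
genericBuildSketch() {
    for phase in unpackPhase buildPhase installPhase; do
        "$phase" || return 1
    done
}

genericBuildSketch
```

A snapshot taken between two phases would therefore have to capture shell variables such as sourceRoot as well as the files on disk.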

You’d have to change this (quite fundamental) definition of what a drv is in Nix in order to achieve what you propose. Not impossible but quite difficult.

Melkor333 · May 13, 2024, 11:39am

The stdenv in its current state is IMO a big issue. I’ve been looking at it to try to replace bash with the mostly backwards compatible oils osh. The stdenv relies on a lot of very ugly bash code.

It’s probably going to be extremely hard to incorporate some kind of sandboxing in-between these stdenv-only stages.
Apart from that being hard, it would mean that docker needs to be in the stdenv, which is definitely not something you’d want.
I guess splitting the stages into multiple derivations - and therefore giving control back to nix - might be the “cleanest” way to solve this, but I can imagine that this requires some crazy overhaul of the stdenv, starting with having to understand all the bash magic happening and being able to change this kind of code without breaking packages…

Atemu · May 13, 2024, 11:51am

While replacing bash with something a bit more modern is something I support, it is unrelated to this proposal.

Their proposal is to not have these coarse phases defined by the stdenv (and therefore inside the same Nix build) but by Nix itself. Nix could very well enforce separation between phases and snapshot build state just like docker/podman do; it’s allowed to do such “impure” things. Whether you use those projects’ code to do so or build it yourself is an implementation detail.

I don’t think it’d require much from the nixpkgs side of things (relatively speaking). All Nixpkgs’ stdenv would roughly have to do would be to supply a list of bash scripts (one for each phase) rather than just one via a new drv API.

Melkor333 · May 13, 2024, 12:06pm

My note on replacing bash was just to explain why I looked into it; it’s rather irrelevant here, you’re right.

What I’m trying to say is that there is a lot of bash code in place to magically figure out what kind of phases exist and how they have to be run. Putting this into Nix code (which can then be properly split into separate derivations) would certainly take time.

kampka · May 13, 2024, 1:26pm

Quoting Atemu: “A major problem with the proposal is that the phases (unpack, patch, build etc.) aren’t a real thing as far as Nix is concerned; their concept only exists within the stdenv.”

While this is true, there is also not a major reason why they couldn’t be a “real thing”. The builder defining these phases is the “default builder”, which, as the name suggests, is a pluggable thing.
I believe what the OP is looking for is not actually that hard to achieve. You just need to shift your perspective a little, away from the idea that “one build” equals “one derivation”; there is no real reason why this equivalence has to exist. If you instead construct one derivation per phase, using the previous one as the input to the next one, you’d produce pretty much what is asked for: a “layered” approach to building packages. You’d just need to put some elbow grease into making parts of the builder bend that way for your case. In essence, all the *-unwrapped packages already work this way.

Atemu · May 13, 2024, 2:17pm

Constructing one derivation per phase is what I alluded to in my initial comment, where I also discussed its downsides.

The *-unwrapped packages do not work this way. They build a regular output path without storing any of the build state.

Wrappers merely add some changes on top of another output path that is usually already in a “finished” state but requires some glue.

vandenoever · May 13, 2024, 8:05pm

If I read the above discussion correctly, the default builder default-builder.sh calls genericBuild which does all the building by calling buildPhase, postInstall, etc. This is different from docker/podman where each line from a Dockerfile is called separately and the environment variables are handled explicitly and selectively. The elaborate variables and function definitions in the derivation are indeed harder to run in blocks. If just the results from the top-level commands were snapshotted, it would be a big win and save a lot of rerunning of commands.

Making a proof of concept to prove the idea would need some elbow grease indeed.

iwanb · May 14, 2024, 7:56pm

@Infinisil covered some existing options to do incremental builds in The Nix Hour: https://www.youtube.com/watch?v=1tEur9Tzv9c&list=PLyzwHTVJlRc8yjlx4VR4LU5A5O44og9in&index=7

As pointed out before, the build phases are opaque to nix, so nix itself couldn’t implement what you propose, unless you’d split the build phases into their own derivations (“package”). @Infinisil tries to do that in the video (not sure if he was successful though).

You could however implement a snapshot and cache mechanism in the builder of mkDerivation.

You need a way to make a snapshot, to restore a snapshot, and to decide when you should restore snapshots (the cache invalidation problem…). For example you could create a snapshot by creating an archive of the build outputs and dumping the state of bash (all variables, e.g. declare -p), restore the snapshot by extracting the archive and loading the variables, and use a snapshot (instead of running the phase) based on the inputs of the derivation + the phase name + the phase code.
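A minimal sketch of the snapshot and restore halves, assuming bash 4.2 or newer (for declare -p and declare -g) and tar. The function names, the directory layout and the idea of listing the variables to save explicitly are my own simplifications; deciding when a snapshot may be reused is left out.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, bash-specific because of `declare -p`.

# snapshot SNAP_DIR WORK_DIR VAR...: archive WORK_DIR and dump the named variables.
snapshot() {
    local dir=$1 work=$2
    shift 2
    mkdir -p "$dir"
    tar -cf "$dir/files.tar" -C "$work" .
    declare -p "$@" > "$dir/vars.sh"
}

# restore SNAP_DIR WORK_DIR: unpack the archive and reload the variables.
# A plain `declare` inside a function would create locals, so -g is added
# to make the restored variables global.
restore() {
    local dir=$1 work=$2
    mkdir -p "$work"
    tar -xf "$dir/files.tar" -C "$work"
    eval "$(sed 's/^declare /declare -g /' "$dir/vars.sh")"
}

# Usage: save the state after a pretend build phase, then restore it elsewhere.
snap=$(mktemp -d); work=$(mktemp -d); fresh=$(mktemp -d)
echo "object code" > "$work/main.o"
sourceRoot="hello-1.0"
snapshot "$snap" "$work" sourceRoot
unset sourceRoot
restore "$snap" "$fresh"
echo "restored sourceRoot=$sourceRoot, files: $(ls "$fresh")"
```

Dumping every variable with a bare declare -p would also pick up read-only shell internals, which is why this sketch only saves variables that are named explicitly.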

The snapshots would have to be stored in a location accessible from the sandbox that nix builds in. That can be done with the sandbox-paths option, which requires building in impure mode: nix.conf - Nix Reference Manual

A similar mechanism is used for CCache support for derivations: CCache - NixOS Wiki

Going further with your idea, you could even replace the builder with some container tooling (buildah?) which really creates layers for each phase and stores them somewhere, so you’d get the same behavior as a Dockerfile, but that couldn’t work for existing packages.