Incremental builds

From web searches I’ve seen a few projects that claim to support builds more granular than Nix’s default package level, but they all seem to be discontinued or inactive. Dealing with large packages that take hours to build is quite annoying. What do people currently use, and what are the standard tools for getting incremental builds with Nix?

There’s also Nix + Bazel = fully reproducible, incremental builds, but Nix and Bazel seem to be at odds with each other, so I’d like to know whether there are pure-Nix solutions.

9 Likes

For development, I usually just use Nix to provide a dev environment and let the normal toolchain deal with incremental compilation.
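
For example, a minimal shell.nix along these lines (the buildInputs are just placeholders for whatever toolchain the project actually uses) keeps Nix in charge of the environment while make/cargo/etc. handle the incremental part:

  # minimal sketch of a dev shell; buildInputs are placeholders
  { pkgs ? import <nixpkgs> { } }:

  pkgs.mkShell {
    # the compiler/build tool does the incremental work,
    # Nix only provides a reproducible environment for it
    buildInputs = [ pkgs.gcc pkgs.gnumake pkgs.pkg-config ];
  }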

For CI/CD it could be nice to leverage Nix as well, but in those environments you almost want to avoid incremental builds, to validate that everything “works” from a clean state.

4 Likes

Well, if the fragments are built as Nix builds from subsets of the build tree, something like nix-make could even reach the reliability of a clean rebuild, but that probably requires magic static analysis, cooperation from the build system, or upstream simply using nix-make.

3 Likes

Would it be possible to make the various phases “checkpoints”? For instance, if I change something in fixupPhase, I don’t want to run buildPhase again.

2 Likes

We will need to develop new incremental builders for each language. This is something I am hoping to tackle in the coming year. PM me if you are interested in participating in that problem space.

If Nix is being used as a build system, it will also have to integrate with the IDE, Language server, debugger, … There are also a few language changes and performance improvements to Nix itself that will be needed in order to compete with Bazel and other build systems out there.

In the meantime, the best option is to create a shell.nix and lean on the native language tools during development as Jon mentioned.

14 Likes

I recall not being able to run patchPhase manually when using a Nix shell; what am I doing wrong?

1 Like

One of the issues is that if you change the derivation such that it evaluates to a different value (as happens when changing the fixupPhase), the (hash in the) output path changes.

This can be resolved by breaking parts of the build into separate Nix derivations. On the other hand, the risk of doing that is that evaluation itself becomes expensive, because there are simply more Nix expressions to evaluate.
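
A rough sketch of that split, with hypothetical names: the expensive build lives in one derivation and the cheap post-processing consumes its output, so editing only the second derivation reuses the cached first one:

  { pkgs ? import <nixpkgs> { } }:

  let
    # expensive part: only rebuilt when the source or this recipe changes
    built = pkgs.stdenv.mkDerivation {
      name = "myapp-build";
      src = ./.;
      # normal configurePhase/buildPhase; keep the whole build tree so
      # the follow-up derivation can post-process it
      installPhase = "cp -r . $out";
    };
  in
  # cheap part: changing this derivation (e.g. its fixup logic) does not
  # re-run the expensive build above, only this copy-and-patch step
  pkgs.runCommand "myapp" { } ''
    cp -r ${built} $out
    chmod -R +w $out
    # post-processing / fixup steps would go here
  ''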

1 Like

You should be able to, if you don’t use mkShell. But I think you’re going down the wrong path if you have to do something like:

  shellHook = ''
    runHook unpackPhase
    cd $sourceRoot
    runHook patchPhase
  '';

to get your “environment” up, since you’re mutating the code base. Ideally the codebase would be able to inherit certain environment variables that nix-shell could provide.

hello neighbors : )

see my working prototype for cmake/ninja builds in
splitBuildInstall: split buildPhase and installPhase for large packages
I’m using this to reduce the installPhase feedback loop from 2 hours to 2 minutes : )
(one minute is spent copying 3 GByte of cached source + build files)

related ycombinator: Bazel – Correct, reproducible, fast builds for everyone

Conceptually, your build results should be a pure function of your source tree.

it’s designed so that code generators should act as pure functions from input files to output files … Writing generators to run this way is kind of a pain, actually, sort of like writing code to run in a sandbox. Also, the generators themselves must be checked in, and often built from source. But we consider the results worth it.

Build artifacts are cached … The problem with maven and gradle is that their build actions/plugins can have unobservable side effects.

This approach is more ‘pure functional’. You have rules which take inputs, run actions, produce outputs and memoize them. If inputs don’t change, then you use memoized outputs and don’t run the action.

As long as your actions produce observable side effects in the outputs (and don’t produce side effects which are not part of the outputs, but produce state which is depended upon in some manner), then you can do a lot of optimizations on this graph.

one challenge with nix:
output paths change between builds, and they are also embedded in binary files (*.so etc.),
so we need to isolate the compilation objects, and patch them back together …
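
a quick way to see those embedded paths for an already-built package (pkgs.hello is just a stand-in here) is a sketch like:

  { pkgs ? import <nixpkgs> { } }:

  # illustrative only: list the /nix/store paths embedded in the binaries
  # of a built package; these are exactly the bytes that would need
  # patching when only the output hash changes
  pkgs.runCommand "embedded-store-paths" { } ''
    grep -haor '/nix/store/[a-z0-9]\{32\}-[^/"]*' ${pkgs.hello}/bin \
      | sort -u > $out
  ''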

as a bonus, this would allow distributed object compilation,
instead of just distributed package compilation

this could be the bottleneck with nix …

naive solution

this requires at least 2 builds, ideally we need only 1 build

  • on every compile round: split the build into objects
  • round1
    • compile every object: src + env + nixenv1 → obj1
  • round2
    • compile every object: src + env + nixenv2 → obj2
    • condition: (src + env) is the same as in round1
    • take obj1 and binary-patch the output paths for nixenv2 (see the sketch after this list)
      • obj1 + diff(nixenv1, nixenv2) → obj1patch2
    • now, if obj1patch2 == obj2, then we have one “anecdote”,
      where we can avoid the expensive recompilation by cheap binary-patching
      and still get a lossless transformation from src to obj
  • round3
    • condition: (src + env) is the same as in round1 and round2
    • now we have one “anecdote” where binary-patching has worked (round2).
      we are feeling lucky, we skip the compilation, and go straight to binary-patching:
      • obj1 + diff(nixenv1, nixenv3) → obj1patch3
      • obj2 + diff(nixenv2, nixenv3) → obj2patch3
    • we generate both obj1patch3 and obj2patch3 to reduce the risk of collisions
    • if obj1patch3 == obj2patch3, then yield the patched object,
      otherwise recompile the object, and yield the compile result
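
a sketch of that binary-patch step (obj1, oldPaths and newPaths are hypothetical stand-ins for the object and the store paths of nixenv1/nixenv2): since all store paths have the same length, the rewrite can be a plain in-place substitution that never shifts offsets inside the object file:

  # hypothetical sketch of "obj1 + diff(nixenv1, nixenv2) → obj1patch2"
  { lib, runCommand, obj1, oldPaths, newPaths }:

  runCommand "obj1-patched" { } ''
    cp ${obj1} $out
    chmod +w $out
    # store paths all have the same length, so replacing them byte for
    # byte keeps every offset in the object file intact
    ${lib.concatStringsSep "\n" (lib.zipListsWith
        (old: new: "sed -i 's|${old}|${new}|g' $out")
        oldPaths newPaths)}
  ''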

moved to Distributed nix build: split large package into many derivations

Dealing with large packages that take hours to build is quite annoying. What do people currently use, and what are the standard tools for getting incremental builds with Nix?

TL;DR: design your app so that its build DAG allows for incremental builds.

From my experience, no tool out there can just be enabled to magically allow for incremental builds. Instead, the only way forward is a good application architecture where independent components can be built separately (many derivations) and then composed into the final result (a link/merge-all derivation).

When you have many derivations that are truly separate, rebuilding is fast because you only rebuild what changed: Nix takes care of computing which derivations need to be rebuilt, Nix takes care of ensuring cache correctness, binary caches then substitute globally so no one builds twice, and so on.

A good example of this is the Azure SDK for Python:

You can install/build the individual packages, and you can build the top-level packages (which ultimately just install/build the required individual packages). In any case, it’s incremental by architecture.
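
A minimal sketch of that shape (core.nix and cli.nix are hypothetical component files):

  { pkgs ? import <nixpkgs> { } }:

  let
    # hypothetical independent components; each one is only rebuilt when
    # its own sources change
    core = pkgs.callPackage ./core.nix { };
    cli  = pkgs.callPackage ./cli.nix { inherit core; };
  in
  # the "link/merge-all" derivation just composes the already-built parts
  pkgs.symlinkJoin {
    name = "my-app";
    paths = [ core cli ];
  }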

2 Likes

An example of a package that I split up into different phases is a Haskell compiler written in C. Instead of using the Makefile, which already has several targets, I rewrote the logic in Nix so that builds could be done more incrementally. It’s not always possible to do something like this for more complex projects, though, since my Nix logic has to be kept in sync with the Makefile.

1 Like

Off-topic, but interesting: the azure meta-package was very quickly hated within the release management team. Trying to encapsulate API changes across 80+ packages was just a nightmare, and if all of the version bumps of the children propagated to the parents, there would be a major version bump almost every week. You can now see that it’s been officially deprecated after adopting native namespaces.

On topic: if the project does allow you to break up the monolith, that’s a great way to go. With RFC#92, we may be able to do something like the following completely in Nix (a rough per-object Nix sketch follows the rule below):

%.o : %.c
    gcc -c $(NIX_CFLAGS_COMPILE) $< -o $@
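
As a rough sketch of that per-object shape (an ordinary derivation per object file, not the RFC 92 mechanism itself; the helper and its names are hypothetical):

  { runCommand, stdenv }:

  # hypothetical helper: one derivation per object file, so only the
  # objects whose sources changed get recompiled
  srcFile:
  runCommand "${baseNameOf (toString srcFile)}.o"
    { nativeBuildInputs = [ stdenv.cc ]; } ''
    cc -c ${srcFile} -o $out
  ''
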
2 Likes

clang makes this easy; we just need to recompile the clang driver, see https://github.com/milahu/nixpkgs/tree/clang-driver-ccache

this allows us to parse C sources for dependency analysis, and avoid the double-parsing of wrapper solutions like GitHub - edolstra/nix-ccache: A flake to remotely build and/or cache C/C++ compilation, using recursive Nix

Interested. I believe RFC92 has a part to play here, as well as lang2nix tooling like dream2nix. Either way, this could be such an impactful capability that it might be worth putting together a working group.