Why does the NixOS infrastructure have to be hosted in a centralized way?

amjoseph · August 6, 2024, 2:07pm

Thanks I did not know about this! Yes, llvm-ifs --output-elf is probably 95% of what is needed. We will probably need to do additional normalization (e.g. sorting symbols, maybe additional filtering) since bit-exact linkable stubs are a high priority for us but probably not as important to LLVM.

Yes; that stubs.so gets built by a Fixed Output Derivation.

We’re accustomed to FODs being used for fetching stuff over the network, but they can be used for other things. You can run llvm-ifs --output-elf=$out/stubs.so on .so files in another derivation’s outpath, from within an FOD.

Of course then you have to manually copy the hash into your .nix expression, which is a drag. If you have floating output CA-derivations (FLOCADs?) and use those instead of an FOD you don’t have to do this. So floating output CA-derivations make this much more convenient.

Yes, every derivation which has .so dependencies turns into two separate derivations, which I’ll call compile and relink (this can be automated inside of stdenv.mkDerivation).

The compile derivation looks exactly like what we have today, except that it has only the stubs.so of its library dependencies as inputs. So it gets rebuilt only when the symbol table changes in a way that doesn’t get normalized away.
The relink derivation takes the compile derivation as an input, and simply uses patchelf to change any references to the stubs derivation so they point to the real dependency.

When you upgrade a library (say, glibc) in a way that doesn’t change the symbol table, the stubs FOD won’t change, so none of the compile derivations will get rebuilt. These derivations are the ones that involve significant build effort. All of the relink derivations will get rebuilt, but those are trivial – they just run patchelf.

These patchelf runs are what you noticed:

To avoid bloating the binary cache, the straightforward approach is to mark all of the relink derivations with allowSubstitutes=false and exclude them from cache.nixos.org. That’s a very crude sledgehammer, but it works today with no changes to the nix interpreter. There are better solutions but they take longer to describe or need new interpreter features.

This two-derivation-step build process would let us use prelink(8) to get faster startup times, like Red Hat and MacOS do.

Also left to future work is dealing with the situation where a new version of a library adds symbols to the symbol table, but doesn’t change or delete any. Ideally we’d like to avoid rebuilding dependencies that don’t need the new symbols, but the mechanism for detecting whether or not those new symbols are needed and making note of that fact in nixpkgs both need to be developed. This is closely connected to the point that you raise:

I have a feeling that the two problems (deciding which versions to take header files from / avoiding rebuilds when unused symbols are added) are related, and will probably both be solved by the same mechanism at some point. But it’s just a hunch. llvm-ifs --output-ifs is probably useful here.