Thanks, I did not know about this! Yes, `llvm-ifs --output-elf` is probably 95% of what is needed. We will probably need to do additional normalization (e.g. sorting symbols, maybe additional filtering), since bit-exact linkable stubs are a high priority for us but probably not as important to LLVM.
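To make the normalization idea concrete, here is a rough Python sketch of what "sorting symbols, maybe additional filtering" could look like. It is only an illustration: the `Symbol` type, the `keep` predicate and the `fingerprint` helper are hypothetical names of mine, and it works on an already-extracted symbol list rather than on what `llvm-ifs` actually emits.

```python
import hashlib
from typing import Callable, Iterable, NamedTuple


class Symbol(NamedTuple):
    name: str  # e.g. "deflate"
    kind: str  # e.g. "Func" or "Object" (labels chosen for illustration)


def normalize(symbols: Iterable[Symbol],
              keep: Callable[[Symbol], bool] = lambda s: True) -> list:
    """Filter, de-duplicate and sort, so the result is independent of input order."""
    return sorted({s for s in symbols if keep(s)})


def fingerprint(symbols: Iterable[Symbol]) -> str:
    """Hash the normalized table; tables with equal content give equal digests."""
    h = hashlib.sha256()
    for s in normalize(symbols):
        h.update(f"{s.kind}\0{s.name}\n".encode())
    return h.hexdigest()
```

The point is just that two builds of a library exporting the same symbols in a different order should end up with the same stub, and therefore the same hash.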
Yes; that `stubs.so` gets built by a Fixed Output Derivation.
We’re accustomed to FODs being used for fetching stuff over the network, but they can be used for other things. You can run `llvm-ifs --output-elf=$out/stubs.so` on `.so` files in another derivation’s outpath, from within an FOD.
Of course then you have to manually copy the hash into your `.nix` expression, which is a drag. If you have floating output CA-derivations (FLOCADs?) and use those instead of an FOD you don’t have to do this. So floating output CA-derivations make this much more convenient.
Yes, every derivation which has `.so` dependencies turns into two separate derivations, which I’ll call `compile` and `relink` (this can be automated inside of `stdenv.mkDerivation`).
- The `compile` derivation looks exactly like what we have today, except that it has only the `stubs.so` of its library dependencies as inputs. So it gets rebuilt only when the symbol table changes in a way that doesn’t get normalized away.
- The `relink` derivation takes the `compile` derivation as an input, and simply uses `patchelf` to change any references to the `stubs` derivation so they point to the real dependency.
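As a rough illustration of what the `relink` step has to compute, here is a Python sketch. It assumes (my assumption, not something settled above) that the references to the stubs live in the binary's RPATH/RUNPATH and that rewriting them with `patchelf --set-rpath` is enough. The store paths and helper names are made up for the example.

```python
def retarget(entry: str, stub_to_real: dict) -> str:
    """Swap a stub store path prefix for the real library's store path."""
    for stub, real in stub_to_real.items():
        # Match the whole path component, not just a string prefix.
        if entry == stub or entry.startswith(stub + "/"):
            return real + entry[len(stub):]
    return entry


def relinked_rpath(rpath: str, stub_to_real: dict) -> str:
    """Rewrite every colon-separated RPATH entry that points at a stub."""
    return ":".join(retarget(e, stub_to_real) for e in rpath.split(":"))


def patchelf_argv(binary: str, rpath: str, stub_to_real: dict) -> list:
    """Build (but do not run) the patchelf command for one binary."""
    return ["patchelf", "--set-rpath", relinked_rpath(rpath, stub_to_real), binary]
```

Entries that do not point into a stub path are left alone, so unrelated RPATH entries survive the relink unchanged.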
When you upgrade a library (say, glibc) in a way that doesn’t change the symbol table, the `stubs` FOD won’t change, so none of the `compile` derivations will get rebuilt. These derivations are the ones that involve significant build effort. All of the `relink` derivations will get rebuilt, but those are trivial; they just run `patchelf`.
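The rebuild behaviour can be seen in a toy model. The Python below is not how Nix computes derivation hashes; it is a deliberately simplified sketch in which each derivation's identity is a hash of its inputs, and all function names and labels are hypothetical.

```python
import hashlib


def digest(*parts: str) -> str:
    """Hash a sequence of strings into a short identifier."""
    h = hashlib.sha256()
    for p in parts:
        h.update(p.encode())
        h.update(b"\0")
    return h.hexdigest()[:12]


def stub_key(symbols) -> str:
    # The stub depends only on the normalized (here: sorted) symbol table.
    return digest("stubs", *sorted(symbols))


def compile_key(source: str, stub_keys) -> str:
    # The expensive build sees only the source and the stubs.
    return digest("compile", source, *sorted(stub_keys))


def relink_key(compile_k: str, real_libs) -> str:
    # The cheap relink sees the compile output and the real libraries.
    return digest("relink", compile_k, *sorted(real_libs))


if __name__ == "__main__":
    symbols = ["malloc", "free", "printf"]
    c = compile_key("app-src", [stub_key(symbols)])
    # Two library versions (labels are placeholders) with the same symbol table:
    print(c, relink_key(c, ["libc-A"]), relink_key(c, ["libc-B"]))
```

With an unchanged symbol table the `compile` key stays the same across the library upgrade, and only the `relink` key moves.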
These `patchelf` runs are what you noticed:
To avoid bloating the binary cache, the straightforward approach is to mark all of the `relink` derivations with `allowSubstitutes=false` and exclude them from cache.nixos.org. That’s a very crude sledgehammer, but it works today with no changes to the `nix` interpreter. There are better solutions but they take longer to describe or need new interpreter features.
This two-derivation-step build process would let us use `prelink(8)` to get faster startup times, like Red Hat and macOS do.
Also left to future work is dealing with the situation where a new version of a library adds symbols to the symbol table, but doesn’t change or delete any. Ideally we’d like to avoid rebuilding dependents that don’t need the new symbols, but both the mechanism for detecting whether those new symbols are needed and the means of recording that fact in nixpkgs still need to be developed. This is closely connected to the point that you raise:
I have a feeling that the two problems (deciding which versions to take header files from / avoiding rebuilds when unused symbols are added) are related, and will probably both be solved by the same mechanism at some point. But it’s just a hunch. `llvm-ifs --output-ifs` is probably useful here.
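For the "symbols added, none changed or deleted" case, the first half of the problem (telling the cases apart) is simple set arithmetic once you have two symbol tables in hand. The sketch below assumes each table has already been reduced to a mapping from symbol name to some descriptor string; that representation and all the names are hypothetical, and it says nothing about the harder half, which is knowing whether a dependent would pick up the new symbols.

```python
from enum import Enum


class Change(Enum):
    UNCHANGED = "unchanged"
    ADDED_ONLY = "added-only"
    INCOMPATIBLE = "incompatible"


def classify(old: dict, new: dict) -> Change:
    """Compare two symbol tables given as {name: descriptor} mappings."""
    removed = old.keys() - new.keys()
    changed = {n for n in old.keys() & new.keys() if old[n] != new[n]}
    if removed or changed:
        return Change.INCOMPATIBLE  # existing symbols were deleted or altered
    if new.keys() - old.keys():
        return Change.ADDED_ONLY    # purely additive update
    return Change.UNCHANGED
```

Only the `ADDED_ONLY` result is a candidate for skipping rebuilds, and even then only for dependents known not to need the additions.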