Summary
Expand the dependency tracking system with 3 new primops: getRunDeps, getBuildDeps and getInstantiationDeps.
Motivation
Currently, the task of obtaining a complete list of the dependencies of a given derivation is (IMO) needlessly difficult. Existing tools to achieve this, like exportReferencesGraph, are both poorly documented and insufficiently granular, leading to persistent difficulties in usefully ‘transmitting’ the results of derivations to resource-constrained machines. In particular, there are several discourse threads (such as this one) and multiple github issues (the most recent of which is this one) detailing the (only partially resolved) problems with getting rebuilds to work without an Internet connection. Another example would be this open 7-year-old issue, which wants a way to get the run-time closure of only immediate build-time dependencies. Nix should already have this kind of information as it goes about computing closures and doing path rewriting; we just need to make it available to the language user.
Detailed design
Introduce 3 primops, which correspond to what I consider to be the 3 types of dependencies:
- getRunDeps - Immediate dependencies any executable in any output needs to run (your standard ${} runtime deps)
- getBuildDeps - Immediate dependencies required to build the derivation (emphasis on immediate; i.e. not the closure). If I have only the results of getBuildDeps and the .drv of the relevant derivation in the store, running nix build .drv^* should always succeed. This should not include the build dependencies of the rest of the run-time closure (since you can build a package without being able to run it). Additionally, only when retrieved via this function, the resultant derivation has all of its run-time dependencies recursively converted into build-time dependencies (since to be usable in a build, the derivation must have access to its run-time dependencies at build-time). This rewriting occurs in-memory in the Nix evaluator, so there is no impact on the store path of the actual derivation.
- getInstantiationDeps - Immediate instantiation dependencies. Necessary to address IFDs, builtin fetchers and RFC 92. Tentative rules would be:
  - dynamic derivers (A) are instantiation-time dependencies of their outputs (A.out)
  - builtin fetchers and storePath are instantiation-time dependencies of any derivation that references them
  - Import-From-Derivation (IFD) creates some sort of “nix fragment” at an intermediate store path (.ifd.nix maybe? TBC). This store path has the actual derivation as an instantiation-time dependency. Like getBuildDeps, only when retrieved via this function (getInstantiationDeps <expr>.ifd.nix), the derivation has all of its build-time and run-time dependencies recursively converted into instantiation-time dependencies. “Instantiation-time” in this case means the final instantiation before nix build successfully exits, not the intermediate instantiations performed in the event of recursive IFDs. This fragment would in turn be the instantiation-time dependency of any derivation that requires its contents to be reachable and well-defined. The purpose of this intermediate is to allow caching of the result of an IFD without needing the whole derivation it came from. This is useful if e.g. you only need a single byte from a 10GB download.
“Immediate” is important because you can get dependency closures from a function that returns immediate dependencies (via recursion or genericClosure) but not vice-versa.
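To make the asymmetry concrete, here is a minimal sketch of a worklist closure in the spirit of genericClosure. It is written in Python only so that it can run without the proposed primops; the graph, the path names and the helper names (IMMEDIATE, immediate_deps, closure) are all made up for the example, and immediate_deps merely stands in for any of the proposed get*Deps functions.

```python
from collections import deque

# Hypothetical immediate-dependency graph; the names are placeholders,
# not real store paths.
IMMEDIATE = {
    "app": ["libfoo", "libbar"],
    "libfoo": ["libc"],
    "libbar": ["libc"],
    "libc": [],
}


def immediate_deps(path):
    """Stand-in for a primop such as the proposed getRunDeps."""
    return IMMEDIATE.get(path, [])


def closure(start, deps):
    """Apply an immediate-dependency function repeatedly until no new
    paths appear (a worklist traversal; the start paths are included)."""
    seen = set(start)
    queue = deque(start)
    while queue:
        path = queue.popleft()
        for dep in deps(path):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


print(sorted(closure(["app"], immediate_deps)))
```

The reverse direction loses information: given only the resulting set, there is no way to tell whether libc is an immediate dependency of app or is only reached through libfoo and libbar.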
Examples and Interactions
- Run-time closure: Equivalent to the result of applying the existing closureInfo to a store path.
- Build-time closure of the run-time closure: Equivalent to the result of applying the existing closureInfo to a .drv, assuming no use of unsafeDiscardReferences.
- Immediate build-time dependencies of run-time closure, minus the closure itself: Minimal set of store paths to rebuild a package.
- Immediate instantiation-time dependencies of build-time closure, minus the closure itself: Static build graph (i.e. after dynamic derivations have been resolved).
- Instantiation-time closure of the build-time closure of the run-time closure (henceforth referred to as the “total closure”): All packages built when you build the derivation in question while forbidding substitution.
- Only the FODs in the total closure: All external dependencies in a build, including those needed to instantiate it. When used as an offline binary cache for a rebuild, maximal percentage of packages built from source (everything except nonfree and bootstrap binaries). Build behaves the same way online and offline as long as all instantiation-time derivations are deterministic.
- Only the FODs and Nix fragments in the total closure: When used as an offline binary cache for a rebuild, maximal percentage of packages built from source that still provides unconditional equivalence of online and offline builds (i.e. they will succeed/fail at the same rate). This is only slightly less source-based than the FOD-only subset of the total closure because it’s theoretically possible to sneak a binary through a Nix fragment via readFile. When used as a cache, for most intents and purposes it’s source-based enough while being significantly faster to rebuild (by skipping IFDs).
The last two are what I’m ultimately after.
Notably, since these are Nix language primops (that return Nix objects) rather than derivation advanced attributes or CLI commands, we have access to a lot more dependency tracking information via the evaluator, and also conveniently sidestep all of the issues around having to encode a canonical set of dependencies into each store path. Under this proposal, the same derivation (in terms of input or content hash) instantiated differently can have different sets of instantiation-time dependencies. Finally, the results of these functions can be made lazy: we do not need to track and save the dependencies of everything on the off chance they will be needed later, only the relevant dependencies or dependency chains (in the case of closures) that we actually end up needing.
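As a rough illustration of how the compositions in the list above could be assembled from immediate-dependency functions, the following Python sketch computes the “immediate build-time dependencies of the run-time closure, minus the closure itself” on a toy graph. Python is used only so the example runs today; RUN_DEPS, BUILD_DEPS, run_deps, build_deps, closure and rebuild_inputs are hypothetical names, the paths are placeholders, and the two lookup functions only imitate what getRunDeps and getBuildDeps might return.

```python
# Hypothetical toy graphs; keys and values are placeholder names,
# not real store paths.
RUN_DEPS = {"app": ["libc"]}
BUILD_DEPS = {
    "app": ["gcc", "libc", "app-src"],
    "libc": ["gcc", "libc-src"],
}


def run_deps(path):
    """Stand-in for the proposed getRunDeps."""
    return RUN_DEPS.get(path, [])


def build_deps(path):
    """Stand-in for the proposed getBuildDeps."""
    return BUILD_DEPS.get(path, [])


def closure(start, deps):
    """Transitive closure of an immediate-dependency function."""
    seen, stack = set(start), list(start)
    while stack:
        for dep in deps(stack.pop()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def rebuild_inputs(root, run_deps, build_deps):
    """Immediate build-time dependencies of the run-time closure of
    `root`, with the run-time closure itself removed."""
    run_closure = closure([root], run_deps)
    immediate_build = set()
    for path in run_closure:
        immediate_build.update(build_deps(path))
    return immediate_build - run_closure


print(sorted(rebuild_inputs("app", run_deps, build_deps)))
```

In this toy graph libc is both a run-time and a build-time dependency of app, so it is dropped by the final subtraction; only paths outside the run-time closure remain.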
Drawbacks
- Since arbitrary Nix expressions can be introduced at instantiation-time, all current and future Nix objects will require some minimal context to track their Nix fragment of origin, increasing the ongoing maintenance burden.
- High potential for cross-interaction with current and future builtins. Resultant conflicts must be resolved to maintain hermeticity of the obtained closure.
- May limit the extent to which evaluation of a Nix expression can be parallelized or distributed, should that become a goal in the future.
Alternatives
Instead of separating them, add a single primop that returns a lazy attrset containing all 3. It may also be possible to avoid implementing the recursive run->build and run+build->instantiation dependency conversions mentioned above at the cost of end-user complexity in constructing closures (getInstantiationClosure is now recursive application of [union of getRunDeps, getBuildDeps and getInstantiationDeps] rather than recursive application of getInstantiationDeps alone), Nixpkgs complexity (getInstantiationClosure implemented in lib), primop complexity (the get*Deps functions return a lazy attrset containing “immediate” and “closure” dependency lists rather than a single list directly) or primop count (2 primops, getDeps and getClosure, that return the 3 dependency types each, or 6 primops, one for each combination of dependency type and choice of “immediate” or “closure”).
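The union-based variant of getInstantiationClosure described above can be sketched as follows. This is a toy Python model under stated assumptions, not a proposed implementation: the three lookup functions only imitate the proposed primops without any of the recursive dependency conversions, and every name and path in the example is hypothetical.

```python
# Hypothetical toy graphs, one per dependency type. No run->build or
# run+build->instantiation conversion is applied to these edges.
RUN = {"app": ["libc"]}
BUILD = {
    "app": ["gcc", "app-src"],
    "libc": ["libc-src"],
    "generator": ["generator-src"],
}
INSTANTIATION = {
    "app": ["deps.ifd.nix"],
    "deps.ifd.nix": ["generator"],
}


def get_run_deps(path):
    return RUN.get(path, [])


def get_build_deps(path):
    return BUILD.get(path, [])


def get_instantiation_deps(path):
    return INSTANTIATION.get(path, [])


def closure(start, deps):
    """Transitive closure of an immediate-dependency function."""
    seen, stack = set(start), list(start)
    while stack:
        for dep in deps(stack.pop()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def union_deps(path):
    """Union of all three kinds of immediate dependency."""
    return get_run_deps(path) + get_build_deps(path) + get_instantiation_deps(path)


# Closure over the union of the three edge kinds.
total = closure(["app"], union_deps)
# Closure over instantiation edges alone, without any conversion.
instantiation_only = closure(["app"], get_instantiation_deps)

print(sorted(total - instantiation_only))
```

Without the conversions, following instantiation edges alone stops at generator and never reaches generator-src, which is why the caller has to take the union at every step in this alternative.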
Prior art
- exportReferencesGraph and its wrapper closureInfo. Lacks specificity. Either run-time closure or entire build-time closure only, leading to large space inefficiencies. Also fails to capture instantiation-time dependencies which can cause evaluation failures if they rely on being able to download things from the Internet.
- Internal Nixpkgs maintainer script find-tarball.nix and its wrapper nixpkgs-mirror-tarballs. Only finds explicitly declared dependencies. Also seems to miss instantiation-time dependencies. Additionally, does not appear to account for IFDs or dynamic derivations.
- marsnix. Couldn’t find enough documentation to figure out how it’s supposed to be used, but from what little I could discern, likely shares the same problems as exportReferencesGraph.
Unresolved questions
The name(s) and number of potential primops are all undecided at the moment.
Do feel free to share any feedback you may have so that I can decide whether to take the time to make this into a proper RFC.