Site intro to nix's repeatable builds point has some clarity issues

makoConstruct · September 26, 2020, 4:19am

Since Nix on the other hand doesn’t install packages in “global” locations like /usr/bin but in package-specific directories, the risk of incomplete dependencies is greatly reduced. This is because tools such as compilers don’t search in per-packages directories such as /nix/store/5lbfaxb722zp…-openssl-0.9.8d/include, so if a package builds correctly on your system, this is because you specified the dependency explicitly.

This “so” just doesn’t make sense to me. How would compilers not looking at nix’s per-package directories be compatible with nix improving their build reliability? It would make sense if it was supposed to be “This is because tools such as compilers do search in per-packages directories such as /nix/store/5lbfaxb722zp…”, but that would be a mildly surprising way for nix to work (I do not yet understand how nix works, hence my reading and actually trying to understand the text here), so

Additionally

Runtime dependencies are found by scanning binaries for the hash parts of Nix store paths (such as r8vvq9kq…). This may sound risky, but it works extremely well.

Could do with some expansion, I think, it does indeed sound risky, it sounds like something you should never need to do. Additional explanations might be helpful.

7c6f434c · September 26, 2020, 8:18am

Thanks for the valuable (and impossible to get from fully committed project members for obvious reasons) outsider perspective on the texts!

Let me try to answer the questions, even if my answers are probably not suitable to be taken out of context and put into documentation.

Since Nix on the other hand doesn’t install packages in “global” locations like /usr/bin but in package-specific directories, the risk of incomplete dependencies is greatly reduced. This is because tools such as compilers don’t search in per-packages directories such as /nix/store/5lbfaxb722zp…-openssl-0.9.8d/include, so if a package builds correctly on your system, this is because you specified the dependency explicitly.

This “so” just doesn’t make sense to me. How would compilers not looking at nix’s per-package directories be compatible with nix improving their build reliability? It would make sense if it was supposed to be “This is because tools such as compilers do search in per-packages directories such as /nix/store/5lbfaxb722zp…”, but that would be a mildly surprising way for nix to work (I do not yet understand how nix works, hence my reading and actually trying to understand the text here), so

I think the difference being stressed here is that a compiler normally looks in the headers of all installed packages (which, in particular, means some packages/versions will unavoidably conflict), whereas under Nix the compiler won’t scan all the installed packages, only the packages explicitly declared as dependencies and passed to it.

Additionally

Runtime dependencies are found by scanning binaries for the hash parts of Nix store paths (such as r8vvq9kq…). This may sound risky, but it works extremely well.

Could do with some expansion, I think, it does indeed sound risky, it sounds like something you should never need to do. Additional explanations might be helpful.

In reality it only scans for known dependencies (so collisions are pretty unlikely), and it only needs to find one mention per dependency, so missing some (but not all) due to encoding/compression issues is fine.
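A minimal sketch of the idea described above, not Nix's actual scanner: given the store paths a build was allowed to use, look through the output files for the hash part of each one, and treat a single hit as enough. The function names, the wrapper script, and the all-zero "compiler" path are made up for illustration; the bash path is the one that appears later in this thread.

```python
import tempfile
from pathlib import Path


def hash_part(store_path):
    """Return the hash part of a store path like /nix/store/<hash>-<name>."""
    return Path(store_path).name.split("-", 1)[0]


def scan_references(output_dir, candidates):
    """Return the candidate store paths whose hash part occurs in any output file."""
    wanted = {hash_part(p).encode(): p for p in candidates}
    found = set()
    for f in Path(output_dir).rglob("*"):
        if not f.is_file():
            continue
        data = f.read_bytes()
        for h, p in wanted.items():
            if h in data:
                found.add(p)  # one mention per dependency is enough
    return found


BASH = "/nix/store/2jysm3dfsgby5sw5jgj43qjrb5v79ms9-bash-4.4-p23"
# Hypothetical build-time-only dependency that the output never mentions.
COMPILER = "/nix/store/00000000000000000000000000000000-example-compiler"

with tempfile.TemporaryDirectory() as out:
    script = Path(out) / "bin" / "hello-wrapper"
    script.parent.mkdir()
    script.write_bytes(b"#!" + BASH.encode() + b"/bin/bash\necho hello\n")
    runtime_deps = scan_references(out, [BASH, COMPILER])

print(runtime_deps)
```

Only the known candidates are searched for, so an unrelated string that merely looks like a hash is never picked up.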

On the other hand, never-mentioned things (like, hopefully, the compiler used) are not mandatory to include, especially in case of smaller images like containers.

makoConstruct · September 27, 2020, 3:54am

Ahh I think I see what it was trying to say then. It was trying to say that the compiler wouldn’t find any directories that we haven’t exposed to it via its nix packaging? But saying that a program running on nix doesn’t search in nix’s package directories made it harder for me to understand what packages in nix’s package directories are visible to it: There is a very real sense in which the compiler is looking into nix’s package directories, even though it doesn’t know that’s where it’s looking, and this is not explained early enough.

I’m guessing, and this is only a guess, that nix, before running a program, puts it in a virtual unix filesystem that exposes particular dependencies as if they had been installed in the expected places. If this had been explained somewhere, I think all of this would be a lot clearer.

Propose changing that paragraph to

Since Nix on the other hand doesn’t install packages in “global” locations like /usr/bin but in package-specific directories like /nix/store/5lbfaxb722zp…-openssl-0.9.8d/include, the risk of incomplete dependencies is greatly reduced. For example, since a compiler would not go looking in /nix/store directories for the headers or libraries it needs, it will only find its dependencies if they have been explicitly exposed by your package’s nix expression. If the package builds on your system, it will build wherever that nix expression is honored.

For the next one… I still don’t understand. To probe for clarity a bit… what binaries contain nix hashes? What is the set of binaries being scanned over? What would it mean for a dependency to not be known? Why would there be multiple hash mentions per dependency?

7c6f434c · September 27, 2020, 1:27pm

While we kind of have tools like that (generally called something with FHS env), proper Nixpkgs use is not like that.

Proper Nixpkgs use is to pass the compiler explicitly a list of include paths to look at (which stdenv does automatically).
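To make "pass the compiler an explicit list of include paths" concrete, here is a rough sketch that turns a list of declared dependencies into compiler flags. It is only an illustration: the real stdenv machinery is more involved, and the helper name and the all-zero store path are hypothetical.

```python
def compiler_flags(dependencies):
    """Build header and library search flags from explicitly declared store paths."""
    flags = []
    for dep in dependencies:
        flags += ["-isystem", f"{dep}/include"]  # where to look for headers
        flags += [f"-L{dep}/lib"]                # where to look for libraries
    return flags


# Hypothetical dependency; a real store path has a real hash here.
EXAMPLE_LIB = "/nix/store/00000000000000000000000000000000-example-lib"

flags = compiler_flags([EXAMPLE_LIB])
command = ["cc", "hello.c", "-o", "hello"] + flags
print(" ".join(command))
```

A dependency that is not in the list contributes no flags, so the compiler has no way of stumbling over its headers.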

Once you build a package it might contain some way to access the dependencies at run time. Maybe a library path is saved in some executable and used at launch. Maybe an icon path is saved in some config and used to load it if the relevant dialog window is open. Maybe both. Maybe multiple binaries refer to the same library.

In general, if the path to some dependency is present in the output path (maybe more than once), we consider it likely that something from this dependency can be read at run time (and so GC should make sure to keep this dependency). Otherwise the only ways the dependency could be used at run time are some nontrivially encoded references (unlucky case), or Nix store scanning (we… will not like any software that would try this). If the former happens, someone usually adds some workaround to also keep a plain reference to the same dependency.

Full path to a dependency of course contains the Nix hash of the dependency.
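The garbage-collection side of this can be sketched as a graph walk: once each output's references are known, everything reachable from something you want to keep stays, and the rest may be collected. This is a simplified model with short made-up names standing in for store paths, not Nix's actual collector.

```python
def closure(roots, references):
    """Return every path reachable from roots by following references."""
    seen = set()
    stack = list(roots)
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        stack.extend(references.get(path, ()))
    return seen


# Hypothetical reference graph: which paths were found mentioned inside which outputs.
references = {
    "hello": {"glibc"},  # the hello output mentions glibc
    "glibc": set(),
    "gcc": {"glibc"},    # used to build hello, but hello never mentions it
}

live = closure({"hello"}, references)
garbage = set(references) - live
print(sorted(live), sorted(garbage))
```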

jonringer · September 27, 2020, 11:40pm

I think the proposed rewrite is a little clearer. The proper course of action would be to open an issue or PR on the NixOS/nixos-homepage repository on GitHub (the sources for nixos.org).

I apologize if this is more confusing, but I believe it’s closer to what’s going on from a conceptual perspective:

For how nix “works”, there’s one central design pattern, the nix store. All inputs and outputs will exist on the nix store, and this allows for the build and system separation. The creation of store paths is done through “realizing” a derivation, and a derivation is created by “instantiating” a nix expression.

Defining some terms:

Nix expression
- Some nix code which declaratively shows what’s involved in building a package
- This is what people write and contribute to nixpkgs
- Usually generic over many platforms and architectures
- Meant to be human readable
Derivation
- Output from “instantiating” a nix-expression (usually referred to as evaluation)
- Specific to a given platform and architecture
- Can be thought of as the “recipe” used by the machine to create a package
- Can be denoted by the file extension “.drv”
Build output
- Created by “realizing” the derivation
- The nix build daemon will create a namespaced (chroot’d) build environment based on what the derivation declares
- Only inputs declared by the derivation will be available in this context, so you can’t have any “accidental” or impure dependencies.
- The build daemon will essentially follow what the “builder” wants to do, and this is usually just a series of steps defined in bash
- Output could be a file or a directory

I like examples to demonstrate what I mean, so let’s take the hello package:
The nix expression:

$ cat pkgs/applications/misc/hello/default.nix
{ stdenv, fetchurl }:

stdenv.mkDerivation rec {
  pname = "hello";
  version = "2.10";

  src = fetchurl {
    url = "mirror://gnu/hello/${pname}-${version}.tar.gz";
    sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
  };

  ...
}

The derivation:

$ nix show-derivation $(nix-instantiate -A hello)
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
{
  "/nix/store/5rj29hp44c8p6281imxz9h1klqa23ijs-hello-2.10.drv": {
    "outputs": {
      "out": {
        "path": "/nix/store/w9yy7v61ipb5rx6i35zq1mvc2iqfmps1-hello-2.10"
      }
    },
    "inputSrcs": [
      "/nix/store/9krlzvny65gdc8s7kpb6lkx8cd02c25b-default-builder.sh"
    ],
    "inputDrvs": {
      "/nix/store/16bljd3wa1agbwin5h4f2yf94h88m4yp-hello-2.10.tar.gz.drv": [
        "out"
      ],
      "/nix/store/1psqjc0l1vmjsjy4ha5ywbv1l0993cka-bash-4.4-p23.drv": [
        "out"
      ],
      "/nix/store/m15naxf285zafnsnlzfaxy0r10dzlanx-stdenv-linux.drv": [
        "out"
      ]
    },
    "platform": "x86_64-linux",
    "builder": "/nix/store/2jysm3dfsgby5sw5jgj43qjrb5v79ms9-bash-4.4-p23/bin/bash",
    "args": [
      "-e",
      "/nix/store/9krlzvny65gdc8s7kpb6lkx8cd02c25b-default-builder.sh"
    ],
    "env": {
        ...
    }
  }
}
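The declared inputs of a derivation like the one above can be read mechanically. A small sketch, assuming the JSON shape shown in that output (the elided "env" section is left out, and the helper name is made up):

```python
import json

derivation_json = """
{
  "/nix/store/5rj29hp44c8p6281imxz9h1klqa23ijs-hello-2.10.drv": {
    "outputs": {
      "out": {"path": "/nix/store/w9yy7v61ipb5rx6i35zq1mvc2iqfmps1-hello-2.10"}
    },
    "inputSrcs": [
      "/nix/store/9krlzvny65gdc8s7kpb6lkx8cd02c25b-default-builder.sh"
    ],
    "inputDrvs": {
      "/nix/store/16bljd3wa1agbwin5h4f2yf94h88m4yp-hello-2.10.tar.gz.drv": ["out"],
      "/nix/store/1psqjc0l1vmjsjy4ha5ywbv1l0993cka-bash-4.4-p23.drv": ["out"],
      "/nix/store/m15naxf285zafnsnlzfaxy0r10dzlanx-stdenv-linux.drv": ["out"]
    },
    "platform": "x86_64-linux",
    "builder": "/nix/store/2jysm3dfsgby5sw5jgj43qjrb5v79ms9-bash-4.4-p23/bin/bash"
  }
}
"""


def declared_inputs(drv):
    """Return every input the derivation declares: plain sources plus other derivations."""
    return sorted(drv["inputSrcs"]) + sorted(drv["inputDrvs"])


derivations = json.loads(derivation_json)
for drv_path, drv in derivations.items():
    print(drv_path, "->", drv["outputs"]["out"]["path"])
    for item in declared_inputs(drv):
        print("  needs", item)

inputs = declared_inputs(next(iter(derivations.values())))
```

Everything the build may see is in that list; nothing else from the store is part of the recipe.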

The build output:

$ tree $(nix-build -A hello) -L 2
/nix/store/w9yy7v61ipb5rx6i35zq1mvc2iqfmps1-hello-2.10
├── bin
│   └── hello
└── share
    ├── info
    ├── locale
    └── man

5 directories, 1 file

EDIT:
Determining the runtime dependencies is a little different from “why nix builds are repeatable/deterministic”.

makoConstruct · September 28, 2020, 12:39am

Proper Nixpkgs use is to pass the compiler explicitly a list of include paths to look at (which stdenv does automatically).

I’ve been reading additional documents and yeah I’m starting to see how that is usually done. I guess nix would usually end up generating an executable that sets its path to include the libraries it needs?

if the path to some dependency is present in the output path

I had a lot of difficulty understanding this. You meant present in the files which are in the output directory, which for me was two steps away. To me “path” means the string identifying the directory. It might have been easier if you’d directly answered the questions.

I’m uncomfortable with this way of keeping track of ongoing dependencies but I am unable to refute its efficacy so I guess I’ll get used to it x]

Is there a way of telling Nix “don’t scan these files, we don’t actually depend on the paths in here”?

makoConstruct · September 28, 2020, 12:39am

I probably will do a pull request towards the end of this discussion btw.

jonringer · September 28, 2020, 2:12am

you’re looking for disallowedReferences (there’s no good anchor, but it’s listed on https://nixos.org/manual/nix/stable/#sec-advanced-attributes)

essentially, the build will fail if there’s a reference which still exists in $out

useful for ensuring closure bloat doesn’t happen.
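Conceptually the check behaves something like the following sketch: after the references of $out have been determined, the build is rejected if any of them is on the disallowed list. This models only the behaviour described here; the names and all-zero/all-one store paths are hypothetical and this is not how Nix implements it.

```python
class BuildFailure(Exception):
    """Raised when an output refers to a path it was told not to reference."""


def check_disallowed(found_references, disallowed):
    """Fail if any reference found in the output is on the disallowed list."""
    offending = sorted(set(found_references) & set(disallowed))
    if offending:
        raise BuildFailure("output refers to disallowed paths: " + ", ".join(offending))


# Hypothetical store paths for illustration.
GCC = "/nix/store/00000000000000000000000000000000-example-gcc"
LIBC = "/nix/store/11111111111111111111111111111111-example-libc"

check_disallowed([LIBC], [GCC])  # fine: the compiler is not referenced

try:
    check_disallowed([LIBC, GCC], [GCC])
    failed = False
    message = ""
except BuildFailure as err:
    failed = True
    message = str(err)

print(failed, message)
```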

makoConstruct · September 28, 2020, 3:28am

No, this appears to be a means of failing a build if it turns out to contain an already known reference. I am asking if there’s a way of preventing nix from identifying {whatever yet-unknown references are mentioned in a particular output file} as runtime dependencies of its package.

jonringer · September 28, 2020, 4:28am

Could you give an example? Because in general, nix makes very few assumptions so as to allow for the greatest number of scenarios. We use tools like makeWrapper to ensure programs have certain needs met at runtime, but it’s usually up to the package maintainer to ensure those runtime dependencies and assumptions are met.

7c6f434c · September 28, 2020, 5:34am

Yes

Yes, you are right, sorry.

The logic is «to use something you need to know its name». It doesn’t work properly when the only reference is inside, say, a JAR file, or a compressed man page. These cases might need some slightly unfortunate workarounds.
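The compressed man page case is easy to reproduce in miniature. In this sketch (illustrative only; the man page text is invented, the bash path is the one from earlier in the thread) the hash is plainly visible in the uncompressed text but a byte search over the gzip output no longer finds it:

```python
import gzip

dep_hash = b"2jysm3dfsgby5sw5jgj43qjrb5v79ms9"
line = b"See /nix/store/" + dep_hash + b"-bash-4.4-p23/bin/bash for the shell.\n"
man_page = line * 20  # invented man page text that mentions the dependency

compressed = gzip.compress(man_page)

plain_hit = dep_hash in man_page         # a scanner sees the reference here
compressed_hit = dep_hash in compressed  # but not in the compressed bytes

print(plain_hit, compressed_hit)
```

This is why such packages sometimes need a plain-text mention of the same dependency somewhere else in the output.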

No. On the other hand, the builder can scan the files in question and replace the store path references found in these files with definitely non-existent ones, e.g. with a hash part consisting entirely of «e» (the Nix encoding of hashes does not use the letter «e»).
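A rough sketch of that replacement trick, under two assumptions not spelled out in this thread: that the hash part is 32 characters long (as in the paths shown earlier) and that it is drawn from the alphabet listed in the code. The function name and the build-info contents are made up.

```python
import re

# Assumed alphabet of the hash part; note that it contains no "e".
HASH_CHARS = b"0123456789abcdfghijklmnpqrsvwxyz"
STORE_REF = re.compile(rb"/nix/store/[" + HASH_CHARS + rb"]{32}-")
DEAD_REF = b"/nix/store/" + b"e" * 32 + b"-"


def neutralize_references(data):
    """Replace every store path hash with 32 'e' characters, keeping the length."""
    return STORE_REF.sub(DEAD_REF, data)


build_info = (
    b"built with /nix/store/2jysm3dfsgby5sw5jgj43qjrb5v79ms9-bash-4.4-p23/bin/bash\n"
)
cleaned = neutralize_references(build_info)
print(cleaned.decode())
```

Because the replacement has the same length as the original, offsets inside binary files stay put, and a scan for the original hash no longer finds anything.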

7c6f434c · September 28, 2020, 5:36am

A realistic example that actually happens is some detailed build information file having a reference to a compiler that is never used at run time.

jonringer · September 28, 2020, 6:52am

I guess I was getting tripped up on the “yet-unknown”, as all dependencies are well defined. I would consider references to the compiler as known, but unintentional.

ajs124 · September 28, 2020, 4:38pm

Which is generally achieved with this.

makoConstruct · October 11, 2020, 10:05am

PR made: “made the complete dependencies paragraph in features.tt a lot clearer for new users” by makoConstruct · Pull Request #615 · NixOS/nixos-homepage · GitHub