I think I figured it out (see below), but would you comment on what I got wrong and maybe answer the parts that I’m unsure of? Thanks in advance.
UPDATE: Amended this post after @jtojnar's extensive review, and included comments from @jonringer. Tried to archive the original post but Internet Archive played tricks on me…
0. Introduction
patchShebangs is indirectly mentioned in the Nixpkgs manual when describing the phases of the generic builder of the Nixpkgs standard environment, stating that the fixup phase at one point
“rewrites the interpreter paths of shell scripts to paths found in PATH. E.g., /usr/bin/perl will be rewritten to /nix/store/some-perl/bin/perl found in PATH.”
It is important to note that (paraphrasing @jonringer’s comment), “the patchShebangs command is only available during the build if you source the $stdenv/setup setup hook” (more on that below) “provided by stdenv’s (the Nixpkgs standard environment’s) default builder (you get this by default when using stdenv.mkDerivation), which is why the starting point of almost all nix expressions is import <nixpkgs> {}, stdenv.mkDerivation, or something similar.”
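To make the availability point concrete, here is a small sketch (bash) that asks the current shell whether a function called patchShebangs exists. The variable name patch_shebangs_status is made up for this example. In a plain terminal I would expect "not available", since nothing has sourced $stdenv/setup there.

```shell
# Report whether patchShebangs is defined as a shell function in this shell.
# Inside a stdenv-based build it is, because $stdenv/setup has been sourced;
# in an ordinary shell it normally is not.
if declare -F patchShebangs >/dev/null 2>&1; then
    patch_shebangs_status="available"
else
    patch_shebangs_status="not available"
fi
echo "patchShebangs is $patch_shebangs_status in this shell"
```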
1. Where is patchShebangs defined?
The file patch-shebangs.sh in the Nixpkgs repo (also documented at 6.7.4. patch-shebangs.sh) defines the patchShebangs function, which in turn is used to implement patchShebangsAuto, the setup hook that is registered to run during the fixup phase.
2. Why are shebang rewrites needed when building Nix packages?
According to the comment at the top of patch-shebangs.sh:
# This setup hook causes the fixup phase to rewrite all script
# interpreter file names (`#! /path') to paths found in $PATH. E.g.,
# /bin/sh will be rewritten to /nix/store/<hash>-some-bash/bin/sh.
# /usr/bin/env gets special treatment so that ".../bin/env python" is
# rewritten to /nix/store/<hash>/bin/python. Interpreters that are
# already in the store are left untouched.
# A script file must be marked as executable, otherwise it will not be
# considered.
IMPORTANT NOTE: As the comment above says, a script file must be marked as executable, otherwise patchShebangs will not consider it at all. A script that is installed without the executable bit keeps its original shebang.
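A quick way to see what "marked as executable" filters out is to search a directory for files with the owner-execute bit set. This is only an illustration with made-up file names; I am assuming the hook selects files in a similar way, so check patch-shebangs.sh for the exact test it uses.

```shell
# Illustration only: create two scripts, mark one executable, and list the
# files that a search restricted to executable files would consider.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho hi\n' > "$demo_dir/with-x.sh"
printf '#!/bin/sh\necho hi\n' > "$demo_dir/without-x.sh"
chmod 755 "$demo_dir/with-x.sh"
chmod 644 "$demo_dir/without-x.sh"

# -perm -0100 matches files whose owner-execute bit is set.
candidates=$(find "$demo_dir" -type f -perm -0100 -exec basename {} \;)
echo "$candidates"
rm -rf "$demo_dir"
```

Only with-x.sh is listed; without-x.sh would be left alone.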
The first line of a script, when it starts with #!, is called a shebang (among other names). It is an interpreter directive: when the script is executed directly, the operating system reads it to decide which program should interpret the rest of the file. The characters after #! have to constitute a path (normally an absolute one) that points to this executable. For example, #!/usr/bin/python3 will expect to find the python3 program there to carry out the commands in the body of a script written in the Python programming language.
Using shell scripts during package build phases becomes problematic though because
“When Nix runs a builder, it initially completely clears the environment (except for the attributes declared in the derivation). For instance, the PATH variable is empty. This is done to prevent undeclared inputs from being used in the build process. If for example the PATH contained /usr/bin, then you might accidentally use /usr/bin/gcc.”
The quote above is from the Nix manual, but the builder shown there as an example uses $stdenv/setup - a shell script that sets up a pristine environment for the build process, unsetting most (all?) environment variables from the calling shell, and only including a small number of utilities. (This is done to make builds reproducible, as much as possible.) [1]
$stdenv/setup is usually sourced implicitly when using stdenv.mkDerivation with the generic builder (i.e., when the builder attribute is left undeclared), but one can write their own builder and source it explicitly during the build process.
TIP: This answer shows one way to find where a certain Nix function is defined (although it is not infallible).
As a consequence, the programs named in shebang lines will not exist at those locations (or will be unreachable from the sandbox). They are (or will be) present in the Nix store, however, so the shebang paths need to be rewritten to point to their location there.
NOTE: The generic builder populates PATH from the inputs of the derivation, so one must make sure that the interpreter a script needs is included as a dependency.
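The idea of "re-pointing" a shebang to whatever PATH provides can be sketched in a few lines of shell. This is NOT the real patchShebangs, just my simplified model of it: the function name rewrite_shebang is made up, interpreter arguments are dropped, and neither the /usr/bin/env special case nor the "already in the store" check is handled.

```shell
# Simplified sketch, NOT the real patchShebangs: rewrite the shebang of one
# script to the path that the current PATH provides for the same base name.
rewrite_shebang() {
    script=$1
    first_line=""
    IFS= read -r first_line < "$script" || true
    case "$first_line" in
        '#!'*) ;;
        *) return 0 ;;                     # no shebang: leave the file alone
    esac
    old_path=${first_line#'#!'}
    old_path=${old_path#" "}
    old_path=${old_path%%" "*}
    # Look the interpreter's base name up in the current PATH.
    new_path=$(command -v "${old_path##*/}") || return 1
    tmp=$(mktemp)
    { printf '#!%s\n' "$new_path"; tail -n +2 "$script"; } > "$tmp"
    cat "$tmp" > "$script"                 # overwrite in place, keeping the mode
    rm -f "$tmp"
}

demo_script=$(mktemp)
printf '#!/no/such/dir/sh\necho patched\n' > "$demo_script"
chmod +x "$demo_script"
rewrite_shebang "$demo_script"
patched_line=$(head -n 1 "$demo_script")
echo "$patched_line"
rm -f "$demo_script"
```

If the interpreter is not in PATH, the lookup fails and the sketch gives up, which mirrors the NOTE above: the interpreter has to be among the inputs for the rewrite to find it.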
3. How to use
3.1 Implicitly
As mentioned above, patchShebangs is automatically invoked by the patchShebangsAuto setup hook during the fixup phase whenever a package is built, unless one opts out of this by setting the dontPatchShebangs variable (or the dontFixup variable, for that matter); see Variables controlling the fixup phase in the Nixpkgs manual.
Reminder to self: 6.4 Bash Conditional Expressions.
3.1.0 Which scripts is patchShebangs used on when invoked automatically?
Usually on scripts installed by packages (for example to $out/bin).
Or the ones provided by default by the Nixpkgs standard library? I presume that these have to be generic enough to run on different platforms, so that (1) the template is built, and (2) the scripts' shebangs are patched in the end. (@jtojnar confirmed this conjecture, but this section needs references, hence the small case.)
3.1.1 How to use the variables controlling a build phase?
Pass it to mkDerivation like any other variable controlling the builder.
stdenv.mkDerivation {
  # ...
  dontPatchShebangs = true;
  # ...
}
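On the shell side, such a variable arrives as an environment variable (Nix turns true into the string 1), and a hook can test it with -z. The sketch below shows that gating pattern in isolation; fake_patch_shebangs_auto is a made-up stand-in, not the actual hook, and only the variable name comes from Nixpkgs.

```shell
# Sketch of how a "dont*" variable can gate a hook. The function is a
# hypothetical stand-in for the real hook and only prints what it would do.
fake_patch_shebangs_auto() {
    # ${dontPatchShebangs-} expands to the empty string when the variable is
    # unset, so -z is true unless the packager set it to something.
    if [ -z "${dontPatchShebangs-}" ]; then
        echo "patching"
    else
        echo "skipped"
    fi
}

unset dontPatchShebangs
result_default=$(fake_patch_shebangs_auto)
dontPatchShebangs=1
result_opted_out=$(fake_patch_shebangs_auto)
unset dontPatchShebangs
echo "default: $result_default, opted out: $result_opted_out"
```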
3.2 Explicitly
Historical note: Originally, patchShebangs was not externally callable, but it was later extracted to make its functionality re-usable in other build phases as well.
Again, from the comments in the implementation:
# Run patch shebangs on a directory or file.
# Can take multiple paths as arguments.
# patchShebangs [--build | --host] PATH...
# Flags:
# --build : Lookup commands available at build-time
# --host : Lookup commands available at runtime
# Example use cases,
# $ patchShebangs --host /nix/store/...-hello-1.0/bin
# $ patchShebangs --build configure
It needs to be run on scripts that are to be executed directly (shell scripts included) at build time. These may be
- coming from the source of what is being packaged
- written by the packager to be used as helpers during the build process [2]
Specific examples from around the web:
- In Nix, how can I build a package that has a Python post-install script? (Unix & Linux Stackexchange)
- hard-coded bin path and NixOS (Stackoverflow)
- [QUESTION] Alias and symlinks in NixOS derivations (Reddit)
- This systemd-specific issue on IRC
… and quoting @jtojnar:
“That is exactly the use case for the explicit patchShebangs call. Meson build system expects to run src/shared/generate-syscall-list.py so it calls it. But that fails because /usr/bin/env does not exist in the build sandbox. And it only gets confusing because kernel/libc/something else reports that the script does not exist, even though it was the interpreter from the shebang which does not exist.”
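The confusing failure described in the quote is easy to reproduce outside Nix. The sketch below creates a script that exists and is executable but names a non-existent interpreter, then captures what the shell reports. The paths are made up; the exact wording of the message depends on the shell and on who calls the script, so I am not quoting one here.

```shell
# Illustration only: the script file exists and is executable, but the
# interpreter named in its shebang does not exist.
demo_dir=$(mktemp -d)
printf '#!/no/such/interpreter\necho unreachable\n' > "$demo_dir/script"
chmod +x "$demo_dir/script"

# Capture the error message and the exit status of the failed run.
error_text=$("$demo_dir/script" 2>&1)
run_status=$?
echo "exit status: $run_status"
echo "message: $error_text"
rm -rf "$demo_dir"
```

The run fails even though the script itself is right there, which is the same situation a build ends up in when /usr/bin/env is missing from the sandbox.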
Footnotes
[1]: TODO: Find out more about how exactly the sandbox(es) are built, and what is barred and what is allowed. Quoting @jtojnar to bring one example: “/usr/bin/env, which is not available in sandbox either. (NixOS only has that in user space for convenience but that does not carry over to Nix sandbox.)”
[2]: @jtojnar’s comment: “Right, you will not need to use it explicitly for scripts that are only executed at run time, since those will be handled by the implicit call.”
All links in this thread have (hopefully) been saved to the Internet Archive. (The soundtrack of the thread is this gem.)