What is the `patchShebangs` command in Nix build expressions?

I think I figured it out (see below), but would you comment on what I got wrong and maybe answer the parts that I’m unsure of? Thanks in advance.

UPDATE: Amended this post after @jtojnar 's extensive review, and included comments from @jonringer . Tried to archive the original post but Internet Archive played tricks on me…


0. Introduction

patchShebangs is indirectly mentioned in the Nixpkgs manual when describing the phases of the generic builder of the Nixpkgs standard environment, stating that the fixup phase at one point

rewrites the interpreter paths of shell scripts to paths found in PATH. E.g., /usr/bin/perl will be rewritten to /nix/store/some-perl/bin/perl found in PATH.

It is important to note that (paraphrasing @jonringer’s comment) “the patchShebangs command is only available during the build if you source the $stdenv/setup setup hook” (more on that below) “provided by stdenv's (the Nixpkgs standard environment’s) default builder (you get this by default when using stdenv.mkDerivation), which is why the starting point of almost all nix expressions is import <nixpkgs> {}, stdenv.mkDerivation, or something similar.”

1. Where is patchShebangs defined

The file patch-shebangs.sh in the Nixpkgs repo (also documented at 6.7.4. patch-shebangs.sh) defines the patchShebangs function, which in turn is used to implement patchShebangsAuto, the setup hook that is registered to run during the fixup phase.

2. Why are shebang rewrites needed when building Nix packages?

According to the comment at the top of patch-shebangs.sh:

# This setup hook causes the fixup phase to rewrite all script
# interpreter file names (`#!  /path') to paths found in $PATH.  E.g.,
# /bin/sh will be rewritten to /nix/store/<hash>-some-bash/bin/sh.
# /usr/bin/env gets special treatment so that ".../bin/env python" is
# rewritten to /nix/store/<hash>/bin/python.  Interpreters that are
# already in the store are left untouched.
# A script file must be marked as executable, otherwise it will not be
# considered.

IMPORTANT NOTE: Pay attention to the last criterion above: a script file must be marked as executable, otherwise it will not be considered.

The line in a script starting with #! is called a shebang (among other names), and it is an interpreter directive telling the shell/kernel which program to use to interpret the text below; the characters after #! have to constitute an absolute path that points to this executable. For example, #!/usr/bin/python3 will expect to find the python3 program there to carry out the commands in the body of the script, written in the Python programming language.
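To make the shebang description above concrete, here is a minimal sketch that prints the interpreter path named on a file's first line. The function name read_shebang is made up for this post and is not part of Nixpkgs; it only looks at the first word after #!.

```shell
# Sketch only: read_shebang is a hypothetical helper, not a Nixpkgs function.
# Prints the interpreter path from the first line of the file given as $1.
read_shebang() {
    local line=''
    # Read just the first line; tolerate a missing trailing newline.
    IFS= read -r line < "$1" || [ -n "$line" ] || return 1
    # Fail if the first line is not an interpreter directive.
    case $line in '#!'*) ;; *) return 1 ;; esac
    # Strip "#!" and split the rest on blanks (globbing off while splitting).
    set -f; set -- ${line#'#!'}; set +f
    [ $# -gt 0 ] || return 1
    # The first word is the interpreter; anything after it is an argument.
    printf '%s\n' "$1"
}
```

Called on a file starting with #!/usr/bin/env python3, it reports /usr/bin/env, because that is the program the system actually executes; python3 is only an argument passed to it.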

Using shell scripts during package build phases becomes problematic though because

When Nix runs a builder, it initially completely clears the environment (except for the attributes declared in the derivation). For instance, the PATH variable is empty. This is done to prevent undeclared inputs from being used in the build process. If for example the PATH contained /usr/bin, then you might accidentally use /usr/bin/gcc.

The quote above is from the Nix manual, but the builder shown there as an example uses $stdenv/setup, a shell script that sets up a pristine sandbox environment for the build process, unsetting most (all?) environment variables from the calling shell and only including a small number of utilities. (This is done to make builds as reproducible as possible.) [1]

$stdenv/setup is usually called implicitly when using stdenv.mkDerivation with the generic builder (i.e., when the builder attribute is left undeclared) but one can write their own builders and invoke it explicitly during the build process.

TIP: This answer shows one way to find where a certain Nix function is defined (although it is not infallible).

As a corollary, the programs pointed to by the shebang directives won’t be at those locations (or won’t be reachable from the sandbox), but they are available in the Nix store, so the paths need to be re-pointed to their locations there.

NOTE: The generic builder populates PATH from inputs of the derivation so one must make sure that these are included as a dependency.
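The rewrite described above can be imitated in a few lines of shell. This is a simplified sketch written for this post, not the code from patch-shebangs.sh: the name patch_shebang_sketch is hypothetical, it handles one file at a time, and it gives up on anything unusual (such as options passed to env). It does follow the rules quoted from the hook's comment: only executable files are considered, store paths are left untouched, env gets special treatment, and the interpreter is looked up on PATH.

```shell
# Simplified imitation of a shebang-patching step; NOT the Nixpkgs code.
# patch_shebang_sketch is a hypothetical name used only for illustration.
patch_shebang_sketch() {
    local file="$1" line='' interp prog new_path new_line tmp

    # Only executable regular files are considered.
    [ -f "$file" ] && [ -x "$file" ] || return 0

    # Read the first line and make sure it is a shebang.
    IFS= read -r line < "$file" || [ -n "$line" ] || return 0
    case $line in '#!'*) ;; *) return 0 ;; esac

    # Split what follows "#!" into words (globbing off while splitting).
    set -f; set -- ${line#'#!'}; set +f
    [ $# -gt 0 ] || return 0
    interp=$1; shift

    case $interp in
        /nix/store/*) return 0 ;;        # already in the store: leave as is
        */bin/env)                       # ".../bin/env python" -> python
            [ $# -gt 0 ] || return 0
            prog=$1; shift ;;
        *) prog=${interp##*/} ;;         # /usr/bin/perl -> perl
    esac
    # env options or VAR=value assignments are beyond this sketch.
    case $prog in -*|*=*) return 0 ;; esac

    # Look the program up on PATH; leave the file alone if it is not there.
    new_path=$(command -v "$prog") || return 0
    case $new_path in /*) ;; *) return 0 ;; esac

    # Rewrite only the first line, keeping any remaining arguments.
    new_line="#!$new_path${1:+ $*}"
    tmp=$(mktemp) || return 1
    { printf '%s\n' "$new_line"; tail -n +2 "$file"; } > "$tmp" &&
        cat "$tmp" > "$file"             # cat keeps the file's mode bits
    rm -f "$tmp"
}
```

Because the lookup goes through PATH, the sketch can only find interpreters that are on PATH at the time it runs, which is the point of the note above about declaring them as inputs.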

3. How to use

3.1 Implicitly

As mentioned above, patchShebangs is automatically invoked by the patchShebangsAuto setup hook during the fixup phase whenever a package is built, unless one opts out of this by setting the dontPatchShebangs variable (or the dontFixup variable, for that matter); see Variables controlling the fixup phase in the Nixpkgs manual.

Reminder to self: 6.4 Bash Conditional Expressions.
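The opt-out can be pictured as a plain conditional around the patching call. The sketch below is my own illustration of that pattern, not a copy of patchShebangsAuto; both function names are hypothetical, and fake_patch merely prints what would be patched.

```shell
# Sketch of an opt-out guard in the style of a fixup hook.
# patch_shebangs_auto_sketch and fake_patch are hypothetical names.
fake_patch() {
    echo "would patch: $1"
}

patch_shebangs_auto_sketch() {
    local prefix="$1"
    # ${dontPatchShebangs-} expands to the empty string when the variable
    # is unset, so the -z test is safe even under `set -u`.
    if [ -z "${dontPatchShebangs-}" ] && [ -e "$prefix" ]; then
        fake_patch "$prefix"
    fi
}
```

Any non-empty value of dontPatchShebangs disables the call here. Nix turns the attribute value true into the string 1 in the builder's environment, which is why dontPatchShebangs = true; is enough to trip a -z test like this one.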

3.1.0 What scripts is patchShebangs used on when invoked automatically?

Usually on scripts installed by packages (for example to $out/bin).

Or the ones provided by default by the Nixpkgs standard library? I presume that these have to be generic enough to run on different platforms so that (1) the template is built, and (2) the scripts’ shebangs are patched in the end. (@jtojnar confirmed this conjecture, but this section needs references, hence the small case.)

3.1.1 How to use the variables controlling a build phase?

Pass it to mkDerivation like any other variable controlling the builder.

stdenv.mkDerivation {
  #...
  dontPatchShebangs = true;
  #...
}

3.2 Explicitly

Historical note: Originally, patchShebangs was not externally callable, but it was later extracted to make its functionality re-usable in other build phases as well.

Again, from the comments in the implementation:

# Run patch shebangs on a directory or file.
# Can take multiple paths as arguments.
# patchShebangs [--build | --host] PATH...

# Flags:
# --build : Lookup commands available at build-time
# --host  : Lookup commands available at runtime

# Example use cases,
# $ patchShebangs --host /nix/store/...-hello-1.0/bin
# $ patchShebangs --build configure

It needs to be run on scripts that are to be executed directly (shell scripts included) during build time. These may be

  1. coming from the source of what is being packaged
  2. written by one to be used as helpers during the build process [2]

Specific examples from around the web:


Footnotes

[1]: TODO: Find out more about how exactly the sandbox(es) are built, and what is barred and what is allowed. Quoting @jtojnar to bring one example:

/usr/bin/env, which is not available in sandbox either. (NixOS only has that in user space for convenience but that does not carry over to Nix sandbox.)

[2]: @jtojnar’s comment: “Right, you will not need to use it explicitly for scripts that are only executed at run time, since those will be handled by the implicit call.”


All links in this thread have (hopefully) been saved to the Internet Archive. (The soundtrack of the thread is this gem.)


When talking about patchShebangs, we need to distinguish three things:

  • patch-shebangs.sh – the setup hook itself; it is actually described in the manual, but we could have more interlinks.
  • patchShebangs – a function defined by the setup hook that actually replaces shebangs in the files passed to it as arguments. You can call this manually in packages.
  • patchShebangsAuto – a function the setup hook registers as a fixupOutput hook, so it will be run at the beginning of fixupPhase.

To be pedantic, the shell/kernel will just run the /usr/bin/env python3 path-to-file [arguments…].

The issue has actually nothing to do with PATH during build – the first word of shebang is always an absolute path so either something like /usr/bin/python3, which is not available in sandbox, or /usr/bin/env, which is not available in sandbox either. (NixOS only has that in user space for convenience but that does not carry over to Nix sandbox.)

Not just shell scripts – any scripts that need to be executed directly.

Right, you will not need to use it explicitly for scripts that are only executed at run time, since those will be handled by the implicit call.

Nix Pills is also a nice resource. It guides you through the creation of a simple builder resembling the Nixpkgs generic builder (standard environment).

This sounds like a misunderstanding. By the time a package is built, all its dependencies have been already built or substituted from cache. There will be no missing store paths or looking into the future – patchShebangs just looks up names of the programs on PATH when it is called. (the generic builder populates PATH from inputs.)

Usually on scripts installed by packages (for example to $out/bin).

Yes, this is another (less common) use case with the same goal.

See above, we need better linking.

Just pass it to mkDerivation like any other variable controlling the builder.

That is exactly the use case for the explicit patchShebangs call. Meson build system expects to run src/shared/generate-syscall-list.py so it calls it. But that fails because /usr/bin/env does not exist in the build sandbox. And it only gets confusing because kernel/libc/something else reports that the script does not exist, even though it was the interpreter from the shebang which does not exist.
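The confusing failure mode described above is easy to reproduce outside Nix. The sketch below creates a script whose shebang points at an interpreter that does not exist (the path /nonexistent/bin/env and the file name generate.py are made up for the demonstration) and then tries to execute it directly.

```shell
# Reproduce the "script not found" confusion with a missing interpreter.
# The paths below are invented for this demonstration.
demo_dir=$(mktemp -d)
script="$demo_dir/generate.py"
printf '#!/nonexistent/bin/env python3\nprint("hello")\n' > "$script"
chmod +x "$script"

# The script file itself is there ...
[ -e "$script" ] && echo "script exists: $script"

# ... but running it directly fails, because the program named on the
# shebang line cannot be found.
if "$script" 2> "$demo_dir/err"; then
    status=0
else
    status=$?
fi
echo "exit status: $status"
cat "$demo_dir/err"
```

The exact wording of the error depends on the shell, and some shells only say that the script was not found. In a derivation, running patchShebangs on such a file before the build system executes it replaces the missing interpreter path with one found on PATH.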


Thank you so much for the detailed review! Will incorporate your suggestion as soon as I can to hopefully make it reflect reality even better. (Fun fact: your U&L Stackexchange post was the first one I found in this topic:)

Excellent explanation! Can we have this as part of the manual on how nix builds work under the hood?


I believe this is what nix pills tries to achieve to a large extent. Nix has surprisingly little included in it. Most of what we think of as “nixpkgs” is actually the “nixpkgs repository”, which is why the starting point of almost all nix expressions is import <nixpkgs> {}, stdenv.mkDerivation, or something similar. For example, the patchShebangs command is only available during the build if you source the setup hook provided by the stdenv default builder (you get this by default when using stdenv.mkDerivation).


@jtojnar , thank you again for your detailed comments; I updated the original post, hope that it is more technically accurate.

@jonringer , thank you for the tidbit regarding when patchShebangs is available - this should have been obvious to me but failed to put 2 and 2 together.

@fricklerhandwerk , thank you for the praise, but as both @jonringer and @jtojnar point out, all this is there in the manuals and other resources (e.g., Nix Pills). I wrote this because I get hopelessly lost in the details and it might help me in the future. Also, patchShebangs is an infinitesimally small detail in the grand scheme of Nix things, but this research put some things into context for me.

If you want to dig even deeper, here is where Nix starts the builder (process that builds the derivation, see NixOS - Nix Pills for introductory explanation of derivations):

Sandbox is instantiated there, especially below:

It also takes some configuration from.


This is exactly my point. We could argue that of course everything is there, it’s in the code! But you had to cherry-pick it together to make sense of it, including actually finding and reading the implementation bits. Even if the documentation is addressed to software developers, having it this way is not very effective communication. Why don’t we save all our future readers the effort and add right what you already compiled to the appropriate spot? The work is done, and for the next few days we have a unique situation where a handful of people have the subject at the top of their mind.

Alright, then it should be in the nixpkgs manual, either right there to extend the passage on fixupPhase or in another sub-section next to it. I thought of the Nix Pills, too, when writing that comment. But their narrative (or lecturing) style does not really fit the pattern of randomly looking up certain aspects of a software (e.g. when solving a problem) like you would from a reference manual. This piece here is great reference material!


I used to be very mad at the state of Nix manuals, and made extensive notes (about the parts that I understood) because there is a circular dependency between them [1], what a “better” structure would be, etc., but it is a huge and complex topic, and the internals are moving fast; there would have to be a dedicated person (or a small team) to keep the docs updated and presented in a manner that is easy to consume. The latter is a scientific field in itself (1, 2), and even though it would be my dream job (and probably others’ as well), reality does not cater for such indulgences…

I also used to disparage Nix Pills, but the more I read it, the more I have come to respect the effort by the author and the community maintaining that text. I don’t like the style, which is only a personal preference anyway, but it breaks down the basics into their constituent atoms. Let’s take the chapter on the override design pattern: just looking at the implementation of makeOverridable does not convey the same clarity to me (and makes me lose hope), and I admire how it has been boiled down to its bare skeleton, lowering the barrier to understanding. (callPackage is another thing that I keep over-mystifying in my head…) Yes, one could go down the rabbit hole with git blame, but someone already took the effort.

I also figured out in the meantime that I am mostly angry at myself for grasping things so slowly, and I finally want to get productive with Nix because I see its philosophy as the way to go. [2] Just because I’m slow on the uptake doesn’t mean that people have the responsibility to make every bit of knowledge super-digestible to me; that is on me, so I will continue these posts in the future when time permits (until asked to refrain from spamming the list).

Anyway, this got really off-topic fast, I apologize.


[1]: For example, section 14.2. Build Script in the Nix manual starts with source $stdenv/setup, which refers to the Nixpkgs standard environment, so where should one start? :)

[2]: I’m still upset about the fact that there are such great Nix tools out there that I was only able to discover through buried comments or accidental Google searches (e.g., deployment tools: NixOps, Kreps, Morph, Colmena, deploy-rs, and these are just off the top of my head [3]). Something like hex.pm for Elixir/Erlang, cpan.org for Perl, etc. would be nice (although that just means that they would be listed there, albeit in a structured way; one would still have to discover them).

[3]: The diversity! But with so little documentation for the lay or semi-lay Nix person… Most are touted as “simple” (either stateful or stateless) wrappers around nix-* commands, but what that really means is that one would have to go to the source again, and pick up expertise with those commands to understand what is going on (isn’t that always the case though?.. :) )


Personally, I found it hard to “mentally digest” what nix pills is trying to teach me until after I had a working knowledge of nix. After I had some intuition, then it was very rewarding.

In the rust ecosystem, they have the rust book, and it’s a great read from start to finish without going into extreme depth into a particular topic, but enough to get you an intuition about the language. I view nix pills more like the rust nomicon where they cover the more “dangerous” and less likely to be used parts of the language. How often will we need to supply our own builder? Likely never.


I created a PR: docs: expand explanation of patchShebangs hook by fricklerhandwerk · Pull Request #121015 · NixOS/nixpkgs · GitHub
