Are hooks abused as a module system for nix builds?

DavHau · May 16, 2021, 6:48am

It seems like in some parts of nixpkgs hooks are intensively used to make builds modular.
Is this the purpose that hooks were originally invented for?
If hooks are our module system for builds, why is this framework implemented in shell script? Shouldn’t we use the nix language to define such a framework instead?

Current problems I see:

Find failing hook: How can I find out which hook is the source of a failing command? Some hooks are nice and print a “starting hook: …”, but not all of them do that.
Discoverability: If a hook is nice enough to announce its execution, how can I find its definition in nixpkgs?
- The announced name is often different than the attribute name and different than the file name. There is no clear relation.
- Some hooks are general build-support hooks, some hooks come from language frameworks, etc. There is no clear place to start looking.
Predictability: How can I make reliable assumptions on what hooks will be running during my build, or find out why a specific hook is activated on my build?
- Hooks can be registered by any shell script of any layer at any time of the build. That seems chaotic.
- Some hooks are enabled/disabled conditionally at build time depending on the state of the build (environment variables, for example).
- Some hooks are only run if other hooks decide to run them.

The model seems to be full of side effects, everything can affect everything else, and the dependency tree is unclear.
Isn’t that contradictory to the idioms of nix and in some way re-creating the madness that nix originally tries to solve?
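To make the predictability point concrete, here is a minimal sketch of how one could at least list what is registered at a given moment. It assumes that hooks are registered by appending function names to bash arrays whose names end in Hooks (for example preConfigureHooks or postFixupHooks); listRegisteredHooks and the example entries are hypothetical and not part of stdenv.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of stdenv: print every entry of every
# indexed bash array whose name ends in "Hooks".
listRegisteredHooks() {
    local flags decl name ref entry
    # `declare -p` prints lines such as: declare -a preConfigureHooks=([0]="foo")
    while read -r _ flags decl; do
        name=${decl%%=*}
        # keep indexed arrays (-a) only, and only names ending in "Hooks"
        [[ $flags == -*a* && $name == *Hooks ]] || continue
        ref="${name}[@]"
        for entry in "${!ref}"; do
            printf '%s: %s\n' "$name" "$entry"
        done
    done < <(declare -p)
}

# Made-up registrations, standing in for what setup hooks would append.
preConfigureHooks+=(exampleCmakeSetup)
postFixupHooks+=(exampleWrapPrograms exampleStripDebug)

listRegisteredHooks
```

This only shows the state at the moment it is called. Anything registered later in the build is still invisible, which is exactly the problem described above.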

jonringer · May 16, 2021, 10:50pm

Another point I would like to add:

Hooks are only relevant when using the standard builder from stdenv.mkDerivation. This is surprising when a package has a simple builder.sh to unpack something, and you wonder why something like qt didn’t wrap anything.

doronbehar · May 17, 2021, 8:11am

I can relate to most of what you write, but I think the fact that hooks are interpreted by the shell makes them easy to grasp when you are new to nixpkgs, and it makes the experience of new nixpkgs contributors easier.

Compare that to what you see in Guix’ source code, where everything is scm code, and that language is much harder than Nix if you ask me. I was new to both of them at the same time and reading Nix was a breeze in comparison to Guix’ scm.

The drawback, indeed, is that it’s hard to get information about how a derivation is built from pure Nix code. I guess this would bother someone like you, who writes an awesome tool such as mach-nix.

I guess this can be designed much better, especially with structured attributes. The key goal should be to keep Nix code easy to read.

DavHau · May 23, 2021, 2:45pm

Interesting. I have never tried Guix, so maybe I just don’t know how happy I should be with our hooks model. But I have the feeling that it was one of the major hurdles to understanding and debugging things in nixpkgs. The first hurdle was to understand how to navigate nixpkgs and understand the basic structure. But even once this is mastered, the hooks still feel like hidden stuff that happens during builds and is hard to control. I mean, this is also a good thing. Abstractions should hide complexity. But when something goes wrong, there needs to be a clear way of finding the element that causes the problem. The current model, I feel, lacks that. One really has to understand all layers of abstraction in nixpkgs to be able to grasp what processes/hooks are involved.

For evaluation errors there is a stack trace. For build time errors there is not. But there could be.

Other features that I imagine would help:

inspect the full build script of a derivation

Basically all our builds just consist of a number of shell commands that are invoked after each other. If build modules (hooks) would be assembled by nix and rendered into a flat script, then it would be quite simple to print out the resulting build script. Right now this would be hard to do.

inspect the full execution tree of all hooks before building

This is basically just a more compact (zoomed out) view of the last point. That would be easy to do if hooks would be registered during evaluation time instead of build time.

manually execute the build step by step (hook by hook)

I know, for phases, this is hypothetically already possible with nix-shell. I can go there and call each phase manually. But for me this always fails on simple things like not knowing which phases are defined and which phase to run next. It would be nice to have some functions like WhichPhaseIsNext etc. Also, phases often consist of several hooks that are defined in completely different places and call each other at runtime, which makes everything harder. It would be good to have a system that can pause between each single hook and tell you which hook is next.
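A rough sketch of what such a WhichPhaseIsNext helper could look like in plain bash. The function is hypothetical, and the phase list below is only an example list passed in by hand. It is not read from stdenv and does not claim to be the complete default list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the phase that follows <last> in the given list.
# Usage: whichPhaseIsNext <last-phase-or-empty> <phase>...
whichPhaseIsNext() {
    local last=$1 phase found=
    shift
    (( $# > 0 )) || return 1
    # Nothing has run yet: the first phase is next.
    if [[ -z $last ]]; then
        printf '%s\n' "$1"
        return 0
    fi
    for phase in "$@"; do
        if [[ -n $found ]]; then
            printf '%s\n' "$phase"
            return 0
        fi
        if [[ $phase == "$last" ]]; then found=1; fi
    done
    # <last> was the final phase, or is not in the list at all.
    return 1
}

# Example list, assumed for illustration only.
examplePhases=(unpackPhase patchPhase configurePhase buildPhase checkPhase installPhase fixupPhase)

whichPhaseIsNext configurePhase "${examplePhases[@]}"
```

In a real nix-shell the list would have to come from wherever the build actually defines its phases, which is the hard part this thread is about.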

If we just controlled the registration of hooks via nix, the hooks would still be written in shell. It doesn’t mean that the whole build script has to be written in nix. I feel we could have the best of both worlds.

I think some of the mentioned problems could already be solved by improving the current model a bit. I guess the framework could be extended to have a specific interface for registering, activating, and executing hooks which will allow to add some more advanced debugging functionalities and to provide standardized debug messages across all hooks instead of leaving it up to the hook to announce its status.
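As a sketch of what such an interface might look like: a single registration function that records where each hook came from, and a single runner that announces every hook in the same format. All names here (registerHook, runRegisteredHooks, hookQueue, hookOrigin) are made up for illustration and do not exist in stdenv.

```shell
#!/usr/bin/env bash
# Hypothetical hook API: one place to register, one place to run and report.
declare -ga hookQueue=()    # entries of the form "<phase>:<function>"
declare -gA hookOrigin=()   # function name -> "file:line" of the registration

# registerHook <phase> <function>
registerHook() {
    local phase=$1 fn=$2
    hookQueue+=("$phase:$fn")
    # BASH_SOURCE[1] / BASH_LINENO[0] describe the caller of registerHook.
    hookOrigin[$fn]="${BASH_SOURCE[1]:-unknown}:${BASH_LINENO[0]:-?}"
}

# runRegisteredHooks <phase>: run all hooks of a phase with uniform messages.
runRegisteredHooks() {
    local phase=$1 item fn
    for item in "${hookQueue[@]}"; do
        [[ $item == "$phase:"* ]] || continue
        fn=${item#*:}
        echo "hook: running $fn for $phase (registered at ${hookOrigin[$fn]})" >&2
        "$fn" || { echo "hook: $fn failed in $phase" >&2; return 1; }
    done
}

# Example hook, made up for the demo.
sayHello() { echo hello; }
registerHook buildPhase sayHello
runRegisteredHooks buildPhase
```

Because the announcement and the failure message come from the runner, no hook has to be nice enough to print them itself, and the recorded origin answers the question of where a hook is defined.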

danieldk · May 23, 2021, 4:12pm

I agree that having a hook API with reporting would go a long way. It would make consistent reporting easier. Using such an API, we could inform a user better on what hooks actually do. For example, some hooks add themselves to a phase, some hooks replace a phase. If you’d see something like:

maturinBuildHook: replaced default buildPhase

it’s much clearer why your buildPhase is not executed – because a hook replaced it.

Sometimes hooks also need to pass information between different phases or even between different hooks. We are usually passing these through environment variables, which gives implicit global state. If these variables were registered and used through a hook API instead, then we could provide the option of a verbose mode which, among other things, reports what state is being set or retrieved by a hook.
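A small sketch of that idea, with state kept in one registry and a verbose switch that reports every write and read. hookSetState, hookGetState, HOOK_VERBOSE and the hook names are all hypothetical; nothing like this exists in stdenv today as far as this thread establishes.

```shell
#!/usr/bin/env bash
# Hypothetical state registry shared between hooks.
declare -gA hookState=()

# hookSetState <hook-name> <key> <value>
hookSetState() {
    local hook=$1 key=$2 value=$3
    hookState[$key]=$value
    if [[ -n ${HOOK_VERBOSE:-} ]]; then
        echo "$hook: set $key=$value" >&2
    fi
}

# hookGetState <hook-name> <key>: prints the value (empty if unset)
hookGetState() {
    local hook=$1 key=$2
    if [[ -n ${HOOK_VERBOSE:-} ]]; then
        echo "$hook: read $key" >&2
    fi
    printf '%s\n' "${hookState[$key]:-}"
}

# Demo with made-up hook names: one hook stores a path, another reads it.
hookSetState exampleSetupHook depsDir /tmp/example-deps
hookGetState exampleBuildHook depsDir
```

Running the same calls with HOOK_VERBOSE=1 would additionally print which hook set or read which key, without turning on set -x for the whole build.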

Currently, the best way to debug hooks is to do a set -x in an early enough phase. But as expected, it’s very chatty and makes it difficult to pick out the interesting bits.
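One thing that makes set -x output somewhat easier to read is bash's PS4 variable, which is expanded in front of every traced command. The sketch below puts the source file, line number and function name there, so each traced line says which function it belongs to. traceHook and exampleFixupHook are made-up names for the demo; only PS4, BASH_SOURCE, LINENO and FUNCNAME are bash features.

```shell
#!/usr/bin/env bash
# Prefix every xtrace line with file:line:function.
PS4='+ ${BASH_SOURCE[0]##*/}:${LINENO}:${FUNCNAME[0]:-main}: '

# Hypothetical helper: run one command with tracing switched on only for it.
traceHook() {
    local status
    set -x
    "$@"
    # Switch tracing off again without tracing the bookkeeping itself.
    { status=$?; set +x; } 2>/dev/null
    return "$status"
}

# Made-up hook body for the demo.
exampleFixupHook() {
    echo "fixing outputs"
}

traceHook exampleFixupHook
```

It is still chatty, but limiting the trace to one hook and tagging each line with the function name cuts down on the guessing.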

asymmetric · May 23, 2021, 4:19pm

Just want to chime in and say that all of the suggestions in the last 2 posts above would greatly improve (my) packaging experience. They would shift a big chunk of tribal knowledge (which is what distinguishes experienced vs non-experienced Nixers right now) into codified and documented steps.

I think we would all benefit greatly from such a standardization effort!

samuela · May 23, 2021, 11:52pm

I’d also like to add my support for getting rid of hooks or at least offering a better developer experience with them.

The current model with hooks is isomorphic to text-based metaprogramming. That’s not really a good thing. Yes, it does work and a lot of people have done a really good job making this system usable and effective for the rest of us. But I think long term we’d be better off if we instead focused on exposing a “standard library” of common build functions, and then just empowered users to write their own build scripts with those functions. Just like conventional programming.

Hooks are very confusing because they lead to questions like: What are all of the available hooks and phases? What order will the hooks be run in? Where is hook X defined? If I don’t specify a value for X hook what will it do by default? Some hooks are disabled by default… which ones are they? Does a custom derivation – eg buildRustPackage – define its own hooks or override existing hooks? How can I know? Why does nix assume that all builds of all software have 17 specific, God-given phases?

There’s just too much “magic” going on with hooks IMHO. We’re gonna need a better programming model eventually.

danieldk · May 25, 2021, 3:03pm

I disagree. Hooks are why you can currently have something like nativeBuildInputs = [ cmake ] or nativeBuildInputs = [ meson ] and the configure or build phases do the right thing. It would be a lot more tedious and error-prone if you needed a custom configurePhase or buildPhase in which you call some function to use CMake or Meson, respectively. Hooks are nice because they do the right thing for you in 95% of the cases and remove a lot of boilerplate.
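For readers who have not looked inside such a hook, here is a simplified sketch of the general pattern: the hook only takes over the configure phase when the package has not set its own and has not opted out. The tool, function and variable names (exampleTool..., dontUseExampleToolConfigure) are invented for illustration and are not copied from the cmake or meson hooks.

```shell
#!/usr/bin/env bash
# Made-up configure logic for an imaginary build tool.
exampleToolConfigurePhase() {
    echo "exampleTool: configuring in $PWD"
}

# What a setup hook of this style would do when it is sourced.
installExampleToolHook() {
    # Respect an explicit opt-out and any configurePhase the package defines.
    if [[ -z ${dontUseExampleToolConfigure:-} && -z ${configurePhase:-} ]]; then
        configurePhase=exampleToolConfigurePhase
        echo "exampleToolHook: replaced default configurePhase" >&2
    fi
}

installExampleToolHook
"$configurePhase"
```

The message on stderr is the kind of standardized report suggested earlier in the thread; the hook itself stays a plain shell function.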

Hooks also reduce the amount of impenetrable string interpolation and concatenation in Nix. E.g. before the buildRustPackage was hookified, it relied on a lot of stringed-together strings. Now at least we have nice separate shell scripts that can be shellchecked and run in isolation.

To me the issue is more that hooks are currently written in quite an ad-hoc fashion and there is no hook API to do things in a standardized way. A related issue is that there is no way to automatically enumerate hooks and document their APIs.

blaggacao · May 25, 2021, 11:51pm

I have a sense that a steppable scripting language with an appropriate debugger could get us a long way towards making those things discoverable. Maybe there is a language that interacts nicely with bash.

Or maybe http://bashdb.sourceforge.net/ would already get us as far as we could wish for? Then, we’d have to figure out how to set breakpoints. We could set predefined meaningful breakpoints throughout all relevant bash scripts. And then we could step through a build, breakpoint-by-breakpoint.
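To illustrate the predefined-breakpoint idea without relying on any debugger, here is a sketch in plain bash. It does not use bashdb. The breakpoint function and the BUILD_BREAKPOINTS variable are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical named breakpoint: always announces itself, and pauses only if
# its name is listed in BUILD_BREAKPOINTS and stdin is a terminal.
breakpoint() {
    local name=$1
    echo "breakpoint: reached $name" >&2
    case " ${BUILD_BREAKPOINTS:-} " in
        *" $name "*) ;;          # selected by the user, maybe pause below
        *) return 0 ;;           # not selected, keep going
    esac
    [[ -t 0 ]] || return 0       # never block a non-interactive build
    read -r -p "paused at $name, press Enter to continue " _
}

# Scripts would sprinkle calls like these at meaningful points.
breakpoint beforeConfigure
breakpoint beforeBuild
```

With BUILD_BREAKPOINTS="beforeBuild" set in an interactive shell, only the second call would pause. A real debugger could offer much more, such as inspecting variables at the stop.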

samuela · May 30, 2021, 12:49am

Honestly the distinction between hooks and phases is not obvious to me. But I’m actually envisioning a system that does away with both. So instead of having configurePhase/buildPhase/etc, I’m envisioning a single script that does the entire build start to finish. In place of hooks and phases, there would be utilities to automate the process. For example,

#! /usr/bin/env nix-shell
#! nix-shell -p buildUtils.cmake buildUtils.meson

# This script would be eval'd in the nix build dir with all the assets fetched.
autoUnpackStuff  # same logic as unpack phase

smartCmakeBuild --outdir foobar    # same logic as cmake hook
smartMesonBuild --custom-args --flags --whatever
make test
mv build/* out/

This way, package maintainers have infinite flexibility to use whatever scripts and tools work for them. Most importantly, the actual execution is directly apparent by reading the code. An ecosystem of build tools could be developed, extended, and integrated into nixpkgs (as in buildUtils.xxx). Using nix-shell script also opens up the packaging to be written in whatever your favorite scripting language is… Python, JavaScript, whatever you like.

FRidh · May 30, 2021, 7:45am

I’ve been playing a bit in the past with a Python stdenv where all phases would be implemented as hooks. A hook could define the relative order of its phase(s). The topological sort then gives the order in which the phases are executed. For convenience some reference points (build, install, phase) are part of the hooks runner.
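The ordering idea can be tried out with the standard tsort utility, which performs a topological sort on "before after" pairs. This is only a sketch of the concept in shell, not the Python stdenv mentioned above; orderPhases and the phase names are assumptions for the demo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: each argument is a pair "<before> <after>";
# tsort prints one order that satisfies all pairs.
orderPhases() {
    printf '%s\n' "$@" | tsort
}

# Constraints as hooks might declare them, deliberately given out of order.
# A "check" phase slots itself in between build and install.
orderPhases \
    "configure build" \
    "unpack configure" \
    "build check" \
    "check install" \
    "build install"
```

When the constraints leave several valid orders, tsort picks one of them, so a real implementation would probably want a deterministic tie-break and a clear error for cycles.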

tomberek · May 30, 2021, 3:50pm

I enjoyed Sander’s blog post and experiment: Sander van der Burg's blog: Layered build function abstractions for building Nix packages

blaggacao · May 30, 2021, 11:40pm

Thank you!

Now, it is completely clear (to me) how running builds through bashdb instead of bash could work.

I also think the conclusions of factorizing the stdenv.mkDerivation might be a good preparation chore for addressing the setup-hooks-issues exposed by the OP.