Lessons for a Nixpkgs reimplementation

I’m considering a complete reimplementation of nixpkgs. Below are some issues I’ve identified with nixpkgs. Note: I’m not complaining; I’m very grateful for nixpkgs. I just want to improve some things.

  1. stdenv cruft: stdenv was designed with C in mind and is quite opinionated towards C. Over the years things have been tacked onto it, and it feels much larger and more complex than it needs to be.
  2. lib standardization: lib feels quite unstandardized and has many gaps. Many items started with a sub-optimal design but were never changed, because of the cost of refactoring all the code that depends on them.
  3. Barrier to entry: For anything other than package definitions and simple NixOS modules, the nixpkgs repo is quite difficult to get into for those unfamiliar with it.
  4. Evaluation perf: This is untested but it seems to me that the recursive style of the overlay/module system and callPackages unnecessarily bloats eval time.

Some solutions I’m considering:

  1. Semantically versioned stdenv and lib: This would allow continuously maintaining stdenv and lib with more freedom and greater frequency without continuously breaking all of the package definitions. Updating to more modern stdenv and lib versions could be made the responsibility of the package maintainer, reducing load on core maintainers.
  2. Versioned lang-specific stdenv extensions: Even just for Rust there’s naersk, cargo2nix, crane, etc. If versioned, this role could be adopted by the reimplementation, resulting in a more unified ecosystem.
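
To make the versioning idea concrete, here is a minimal Python sketch of how a package could pin a stdenv requirement and have the newest compatible release picked for it. Everything in it is an assumption for illustration: the `select_stdenv` helper, the caret-style compatibility rule (same major version, at least the requested minor/patch), and the published version numbers are hypothetical, and the special handling that semantic versioning gives to 0.x releases is ignored.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse(text: str) -> Version:
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def compatible(required: Version, available: Version) -> bool:
    # Caret-style rule (an assumption): same major, and not older than required.
    return available.major == required.major and available >= required


def select_stdenv(required: str, published: list[str]) -> str:
    """Pick the newest published stdenv that satisfies a package's requirement."""
    want = parse(required)
    candidates = [v for v in map(parse, published) if compatible(want, v)]
    if not candidates:
        raise LookupError(f"no stdenv compatible with {required}")
    return ".".join(map(str, max(candidates)))


# Hypothetical release history of a versioned stdenv.
PUBLISHED = ["1.4.0", "1.9.2", "2.0.0", "2.3.1"]

old_package = select_stdenv("1.4.0", PUBLISHED)  # stays on the 1.x line: "1.9.2"
new_package = select_stdenv("2.1.0", PUBLISHED)  # maintainer opted in to 2.x: "2.3.1"
```

A breaking stdenv change would then bump the major version, and packages would only see it once their maintainer raises the requirement.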

To be completely honest, I’m not highly familiar with the deep internals of nixpkgs, so I would love to hear other lessons that have been learned about the current nixpkgs which should be addressed in a reimplementation.

I’d also look at what it does well, and hopefully preserve that.

Things done well, in my opinion:

  • Global namespace, global fixpoint, and global reusable patterns

    Being able to grab just about anything and compose it with what you’re doing is the unique value proposition of Nixpkgs and NixOS.

  • Modular configuration with definition merging

    While spooky action at a distance can be a double-edged sword, composing arbitrary parts of a configuration from arbitrary places is mostly a superpower and only sometimes a footgun (which is largely alleviated by thorough location tracing). I think it’s pretty much unique in the entire software world.
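
Definition merging with location tracing can be sketched in a few lines of Python. This is a toy model and not how the NixOS module system is implemented: the `Definition` record, the merge rule (lists concatenate, scalars must agree) and the module file names are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    value: object
    location: str  # the file that contributed this definition


class ConflictError(Exception):
    pass


def merge_option(name: str, defs: list[Definition]) -> object:
    """Merge all definitions of one option: lists concatenate, scalars must agree."""
    if all(isinstance(d.value, list) for d in defs):
        return [item for d in defs for item in d.value]
    distinct = {repr(d.value) for d in defs}
    if len(distinct) > 1:
        # Location tracing: name every place that contributed a definition.
        where = ", ".join(f"{d.value!r} in {d.location}" for d in defs)
        raise ConflictError(f"conflicting definitions for '{name}': {where}")
    return defs[0].value


def evaluate(modules: dict[str, dict[str, object]]) -> dict[str, object]:
    """Collect the definitions of each option from every module, then merge them."""
    collected: dict[str, list[Definition]] = {}
    for location, config in modules.items():
        for option, value in config.items():
            collected.setdefault(option, []).append(Definition(value, location))
    return {name: merge_option(name, defs) for name, defs in collected.items()}


# Two hypothetical modules that both contribute to the same option.
modules = {
    "web.py": {"openPorts": [80, 443], "hostName": "box"},
    "ssh.py": {"openPorts": [22]},
}
config = evaluate(modules)  # openPorts becomes [80, 443, 22]
```

The error path is where the tracing pays off: a conflicting `hostName` set in two files produces a message naming both files, which is what turns the footgun into something debuggable.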

What a rewrite could easily do better:

  • Types

    Not just language-level types — of course, please ditch the Nix language for literally anything that offers, apart from static (ideally dependent) types, first class…

    • file system paths
    • string interpolation and string contexts
    • row polymorphism for modules

    I primarily mean the domain-specific modeling though, which is an architectural issue independent of the implementation language. Nixpkgs is very, very confused about what a “package” is, to the point I’d say no one actually knows.

    Nix only gives us what originally was called “components”: file system trees with arbitrary contents, possibly computed from other components. But for the domain problem of constructing software you care a lot about what those files mean: are they executables, can those executables be run standalone (how often have you searched for a “package” just to find you can only use the application as a service?), or is it a library (and for which language ecosystem, since each has its own formats), a man page, a web document, …? Specifying that and making compositions correct by construction would be the language-level responsibility of something like Nixpkgs, but we barely have any of that. I think this is where most of the friction that users and contributors experience comes from.

    Note that this was a known problem already in Eelco’s thesis, but for some reason nothing was ever really done about that.
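
A rough Python sketch of what domain-level modeling could look like: each component carries a kind, and composition functions reject kinds that make no sense. All names here (`Executable`, `Library`, `ManPage`, `add_to_path`, `link_against`, the `/store/...` paths) are hypothetical. Python only enforces these rules at runtime (or through an external type checker); in a statically typed language the same mistakes would be rejected before anything is evaluated.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Executable:
    store_path: str
    standalone: bool  # False if it only makes sense when run as a service


@dataclass(frozen=True)
class Library:
    store_path: str
    ecosystem: str  # e.g. "c", "python", "rust"


@dataclass(frozen=True)
class ManPage:
    store_path: str
    section: int


Component = Union[Executable, Library, ManPage]


def add_to_path(components: list[Component]) -> list[str]:
    """Only standalone executables may end up on a user's PATH."""
    rejected = [c for c in components
                if not (isinstance(c, Executable) and c.standalone)]
    if rejected:
        raise TypeError(f"not runnable from a shell: {rejected}")
    return [c.store_path + "/bin" for c in components]


def link_against(ecosystem: str, libs: list[Library]) -> list[str]:
    """A build may only link libraries that belong to its own ecosystem."""
    for lib in libs:
        if lib.ecosystem != ecosystem:
            raise TypeError(
                f"{lib.store_path} is a {lib.ecosystem} library, not {ecosystem}")
    return [lib.store_path for lib in libs]


tool = Executable("/store/example-tool", standalone=True)
daemon = Executable("/store/example-daemon", standalone=False)
path_entries = add_to_path([tool])  # fine; adding `daemon` would raise TypeError
```

With kinds like these, a search for "things I can run" could filter on `standalone` instead of leaving the user to find out after installation.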

  • Evaluation performance

    This is mainly an implementation-language-specific problem but has some language-independent architectural aspects to it. You want this huge database of software construction knowledge to CRUD very efficiently in order to allow for quick iteration so people can focus more on semantics than on fighting their machines.

  • Extend the type-level modeling, the knowledge base, to all interfaces

    Nixpkgs knows a lot about compilation, NixOS knows a lot about configuring and running services, and they both make for unified interfaces to extremely heterogeneous ecosystems. What they don’t do yet is unify program invocation, i.e. how we think about command lines. Right now we accept that executables are interfaced with by throwing a bunch of characters at them, but deep down we all know there’s underlying structure we could leverage for greater safety and easier composition.
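
One way to picture structured program invocation is to describe a command line as data and only render it to an argument vector at the very end. The sketch below is a minimal illustration in Python; `Flag`, `Option`, `Invocation` and the `example-tool` program with its flags are all hypothetical. It also assumes the invoked program follows the common convention that `--` ends option parsing, which not every tool does.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Flag:
    name: str  # e.g. "--verbose"


@dataclass(frozen=True)
class Option:
    name: str  # e.g. "--output"
    value: str


@dataclass(frozen=True)
class Invocation:
    program: PurePosixPath
    arguments: tuple = ()    # Flag and Option values
    positionals: tuple = ()  # file names and other operands

    def to_argv(self) -> list[str]:
        """Render to an argv list; no shell quoting is ever involved."""
        argv = [str(self.program)]
        for arg in self.arguments:
            if isinstance(arg, Flag):
                argv.append(arg.name)
            elif isinstance(arg, Option):
                argv += [arg.name, arg.value]
            else:
                raise TypeError(f"unsupported argument: {arg!r}")
        # "--" keeps operands that start with "-" from being parsed as options.
        argv.append("--")
        argv += [str(p) for p in self.positionals]
        return argv


call = Invocation(
    program=PurePosixPath("/bin/example-tool"),
    arguments=(Flag("--verbose"), Option("--output", "result.txt")),
    positionals=("-weird-file-name.txt",),
)
argv = call.to_argv()
```

Because the structure survives until the last step, two invocations can be inspected, merged or validated before anything is executed, which a pre-joined string does not allow.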

An often forgotten but often critically important part is overrides. And actually, if you get overrides right, you won’t need any analogue to the module system until much later (or never; it can be another project’s problem on top, for system composition as opposed to a collection of packages and daemon starters).
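
The core mechanism that makes overrides work is late binding: each package looks up its dependencies in the final package set rather than in the set it was defined in. A minimal Python sketch of that idea follows; the `PackageSet` class, its `override` method and the version strings are hypothetical, and the sketch leaves out things a real design needs, such as access to the previous layer and per-package argument overrides.

```python
from typing import Callable

# A recipe builds one package given the *final* package set.
Recipe = Callable[["PackageSet"], str]


class PackageSet:
    def __init__(self, recipes: dict[str, Recipe]):
        self._recipes = recipes
        self._cache: dict[str, str] = {}

    def get(self, name: str) -> str:
        # Lazy and memoized: a package is only evaluated when something asks for it.
        if name not in self._cache:
            self._cache[name] = self._recipes[name](self)
        return self._cache[name]

    def override(self, **replacements: Recipe) -> "PackageSet":
        # A new set with a fresh cache, so dependents see the replacements.
        return PackageSet({**self._recipes, **replacements})


base = PackageSet({
    "openssl": lambda pkgs: "openssl-3.0",
    "curl": lambda pkgs: f"curl-8.0 (with {pkgs.get('openssl')})",
    "git": lambda pkgs: f"git-2.40 (with {pkgs.get('curl')})",
})

patched = base.override(openssl=lambda pkgs: "openssl-3.0-patched")

print(base.get("git"))     # git-2.40 (with curl-8.0 (with openssl-3.0))
print(patched.get("git"))  # git-2.40 (with curl-8.0 (with openssl-3.0-patched))
```

Replacing one recipe changes every transitive dependent without touching their definitions, and the original set stays usable next to the patched one.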

Note that it is easy to accidentally design something unsuitable for making cross-builds work well.

You probably need to look at the content-addressed-store and builds-generating-derivations work. A sad performance tradeoff we have now is that some work could be done during the build, but then any change to the process (even one that preserves the result) causes all reverse dependencies to be rebuilt; and if you do it in Nix eval, well, it needs to be evaluated every time Nixpkgs is evaluated. I guess if you could structure things so that you have a content-addressed cache of partial evaluation (so that changing the implementation of some stdenv eval-time calculation is a better-contained cost), you would have a clear reason for people to switch from current Nixpkgs, or to port your design into Nixpkgs.
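
The contained-cost idea can be illustrated with a toy cache keyed on the content of a step's inputs rather than on how those inputs were produced. The Python sketch below is only an illustration of that principle, not of how Nix's content-addressed derivations work; `EvalCache`, `flags_v1`, `flags_v2` and `make_derivation` are hypothetical names, and inputs are assumed to be JSON-serializable.

```python
import hashlib
import json


def digest(value) -> str:
    """Content hash of any JSON-serializable value."""
    encoded = json.dumps(value, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


class EvalCache:
    def __init__(self):
        self.store = {}
        self.runs = []  # names of the steps that actually had to execute

    def step(self, name, func, *inputs):
        """Run func(*inputs) unless this step was already run on identical contents."""
        key = digest([name, [digest(i) for i in inputs]])
        if key not in self.store:
            self.runs.append(name)
            self.store[key] = func(*inputs)
        return self.store[key]


# Two implementations of the same eval-time calculation with identical results.
def flags_v1(platform):
    return sorted(["-O2", "-fPIC"] + (["-m64"] if platform == "x86_64" else []))


def flags_v2(platform):
    extra = {"x86_64": ["-m64"]}.get(platform, [])
    return sorted(extra + ["-fPIC", "-O2"])


def make_derivation(flags):
    return {"builder": "cc", "args": flags}


cache = EvalCache()
first = cache.step("derivation", make_derivation, flags_v1("x86_64"))
# The helper was rewritten, but its output is unchanged, so nothing downstream reruns.
second = cache.step("derivation", make_derivation, flags_v2("x86_64"))
```

After both calls `cache.runs` contains a single entry: the refactored helper did not invalidate the downstream step, which is the property the paragraph above asks for.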
