Maybe let's sort and/or categorize all-packages?

I’m just trying to add a new top-level package and don’t know where to add it in the humongous file pkgs/top-level/all-packages.nix. This made me ask myself why the entries in this file are neither sorted nor categorized. The only message I can find is the friendly but unhelpful comment at the beginning of the file:

/* The top-level package collection of nixpkgs.
 * It is sorted by categories corresponding to the folder names
 * in the /pkgs folder. Inside the categories packages are roughly
 * sorted by alphabet, but strict sorting has been long lost due
 * to merges. Please use the full-text search of your editor. ;)
 * Hint: ### starts category names.
 */

It is unhelpful because it is untrue. The packages jump wildly between folder structures. A snippet:

  aether = callPackage ../applications/networking/aether { };

  alda = callPackage ../development/interpreters/alda { };

  align = callPackage ../tools/text/align { };

  althttpd = callPackage ../servers/althttpd { };

  among-sus = callPackage ../games/among-sus { };

There is some vague alphabetic sorting going on, but it’s not kept throughout the file. I searched the NixOS/nixpkgs issues on GitHub, but didn’t find anything helpful.

I don’t understand why we don’t split this big file up. We already have a great categorisation in the form of the folder structure. Why doesn’t every folder simply contain a file, e.g. tools/text/all-packages.nix, which contains all packages of that subcategory, so that further up the tree we simply merge these sets?
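To make the idea concrete, here is a minimal sketch of what merging per-category sets could look like. It is only an illustration under assumptions: the two category sets stand in for hypothetical files such as tools/text/all-packages.nix, the merge helper is made up, and callPackage is replaced by a stand-in so the expression evaluates without nixpkgs.

```nix
let
  # Stand-in for nixpkgs' callPackage, only here so the sketch evaluates on its own.
  callPackage = path: args: { inherit path args; };

  # Hypothetical contents of tools/text/all-packages.nix
  toolsText = { callPackage }: {
    align = callPackage ../tools/text/align { };
  };

  # Hypothetical contents of servers/all-packages.nix
  servers = { callPackage }: {
    althttpd = callPackage ../servers/althttpd { };
  };

  # Merge the category sets left to right with `//`.
  # On a name clash, the later category silently wins.
  merge = builtins.foldl' (acc: category: acc // category { inherit callPackage; }) { };
in
merge [ toolsText servers ]
```

The result is a single attribute set with align and althttpd at the top level, just like today. One thing the sketch does not handle is duplicate names across categories, which would need a check of their own.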

Some advantages I can see:

  • Maintainers will immediately know where to put their packages
    • Great for newcomers
    • No need to scroll and search and ponder endlessly
  • GitHub will be able to display all-packages.nix
  • My guess: Merge conflicts are reduced, because new packages are spread across many files, sorted by topic
  • We could even add the category to the meta attribute in an orderly way, in the long run allowing people to filter on that information, e.g. in NixOS Search

Some disadvantages:

  • It’s a tiny bit harder to search for an existing package. Rebuttal: You probably have a text editor capable of searching for the string my-package =, and you might even know the category of your package.
  • Extra evaluation when merging the sets. Are there any reliable estimates of how much that would cost?
  • Might take some time to convert to this new structure. Rebuttal: The work can maybe be automated in a simple way. In any case it can be split efficiently between categories.

Nix is a relatively slow language and I wonder if such ‘multi-file large attrset merge’ will have a noticeable impact :thinking:

As for the idea itself: it sounds great, but I personally care whether an entry is a basic ‘callPackage’ call or an override (or some other additional logic).
So may I suggest splitting it into 2 or 3 files instead: one file would contain only the ‘dumb’ package definitions and another one would contain everything else. (And maybe the whole stdenv stuff should be split into a third file?)
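Roughly, such a split could look like the sketch below. The two sets stand in for two hypothetical files, the withTests argument is made up purely for illustration, and callPackage is again a stand-in so the expression evaluates on its own.

```nix
let
  # Stand-in for nixpkgs' callPackage so the sketch evaluates on its own.
  callPackage = path: args: { inherit path args; };

  # Hypothetical file 1: only plain `callPackage <path> { }` entries.
  plainPackages = {
    align = callPackage ../tools/text/align { };
    althttpd = callPackage ../servers/althttpd { };
  };

  # Hypothetical file 2: entries that pass arguments or carry other logic.
  # The `withTests` argument is invented for this example.
  customPackages = {
    among-sus = callPackage ../games/among-sus { withTests = false; };
  };
in
# Later entries win on a name clash, so file 2 could also replace an entry from file 1.
plainPackages // customPackages
```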

Not sure if anyone agrees with such separation, but these are my two cents :slight_smile:

Such a separation has been done in some places, typically when there is a big automatically generated list of “basic” packages and a manually maintained list of overrides. For example, Haskell works like that in nixpkgs. Whether it is a good idea here I’m not sure, but anyway I think it’s orthogonal to what I’m proposing. Either idea can work without the other.

How would one go about testing that?

You may mean the broader process/methodology, but in case you mean something lower-level, I recall seeing an environment variable for evaluation statistics: https://github.com/NixOS/nix/blame/323e5450a1a6e4eb97ba1c9aeba195187cfaff37/doc/manual/src/command-ref/env-common.md#L111-L113

I’m not otherwise familiar with any of this, so I’m not really sure if the stats it has are meaningful in this case. Here’s the underlying source https://github.com/NixOS/nix/blob/6f46434f3226784e809158a04a8067036f9e6291/src/libexpr/eval.cc#L2051-L2166. I don’t see many references in the wild, but:

Tbh I think just removing the ###-sections and automatically re-sorting everything on each release would go a long way. It’d be an easy first step and does not leave room for bikeshedding.

I guess by separating the packages and comparing the evaluation using the NIX_SHOW_STATS output that @abathur mentioned.