Improvements to nixpkgs

Is there some place to track planned major overhauls of nixpkgs? I’ve heard talk of making library directories alphabetical, for example pkgs/development/libraries/gtk3/ becoming pkgs/development/libraries/g/gtk3/, but I couldn’t find any news of it actually happening. It seems like a reasonable change that should be made. I’ve also seen talk of adding a way to call packages externally in nixpkgs (callPackageTwice).

Is there any place where these changes could be tracked? Maybe people can comment on this post with changes they’d like to see.

I’ll start with my own idea.

It would be nice if there were some way to update packages in nixpkgs individually, so that each package could run at its own latest version. The simplest way to do this would be to have a pinned version of nixpkgs for each package: if you upgrade just gnome-terminal, it gets pinned to the new commit while everything else remains at the old one.
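As a rough sketch (not something nixpkgs supports today), per-package pinning could be approximated with an overlay that takes a single attribute from a second pinned nixpkgs. The revisions below are placeholders, and gnome-terminal is assumed to be a top-level attribute just for illustration:

let
  # Fetch a nixpkgs checkout at a given commit (no hash, so only suitable
  # for impure, experimental evaluation).
  fetchNixpkgs = rev: builtins.fetchTarball
    "https://github.com/NixOS/nixpkgs/archive/${rev}.tar.gz";

  oldRev = "aaaaaaa";  # placeholder: the commit everything else stays on
  newRev = "bbbbbbb";  # placeholder: the commit where gnome-terminal was upgraded
in
import (fetchNixpkgs oldRev) {
  overlays = [
    (self: super: {
      # Only gnome-terminal comes from the newer pin; everything else,
      # including its dependencies, still comes from oldRev.
      gnome-terminal = (import (fetchNixpkgs newRev) { }).gnome-terminal;
    })
  ];
}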

I think this would really shine in the binary cache. Imagine being able to have a channel where all of the latest working packages exist. If gnome-terminal 3.28 compiles but 3.30 doesn’t, while gnome-shell 3.30 works fine, the channel would have gnome-terminal 3.28 and gnome-shell 3.30. It could be called the ‘rolling’ channel.

I have no idea exactly how it would be implemented. I was thinking it’s possible to download individual Nix files from nixpkgs. Let’s say you want package foo as it is in nixpkgs commit 39rja4, but the unstable channel is at commit 74dea8. Nix could go to https://channels.nixos.org/rolling/39rja4.../pkgs/foo/default.nix, download it, and map pkgs.foo to that Nix file. The URL would basically act like rawgit in providing these files. The rolling branch would have an index with the latest working commit for each file, created by Hydra depending on how the builds go. For example, let’s say Hydra clones nixpkgs 74dea8 and builds packages foo and bar, and both builds succeed: Hydra writes foo = 74dea8; bar = 74dea8; into the index file. Then a new commit lands in nixpkgs. Hydra clones nixpkgs 39rja4 and builds foo and bar again; foo’s build succeeds, while bar’s build fails, so Hydra writes foo = 39rja4; bar = 74dea8; to the package index. This way, a local machine can simply download the index and only obtain Nix derivations for the packages it needs.
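Purely as an illustration, the index could be as simple as a Nix attribute set mapping package names to the last revision at which they built (reusing the commits from the example above):

{
  # Rewritten by Hydra after each evaluation: an entry only moves forward
  # when the package's build at the new commit succeeds.
  foo = "39rja4";
  bar = "74dea8";  # bar failed to build at 39rja4, so it stays at 74dea8
}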

Let me try putting it another way. Right now, nixpkgs is one big git repo that is cloned with the definitions for all packages. I propose that (at least at the local machine level) nixpkgs becomes a list of packages and their latest commits. If you need a derivation, you download the package definition first and then evaluate the downloaded file. This is very possible if Nix can lazily download URLs.
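Here is a very rough sketch of the consumer side, assuming the index above and the URL layout from the previous post (both hypothetical). It only works for package files that don’t reference sibling files such as patches or a common.nix, which is a real limitation of fetching default.nix on its own:

let
  basePkgs = import <nixpkgs> { };  # used only to resolve dependencies

  # The downloaded index: package name -> last commit at which it built.
  index = { foo = "39rja4"; bar = "74dea8"; };

  # Fetch a single package definition from the hypothetical rolling mirror.
  fetchDef = name: rev: builtins.fetchurl
    "https://channels.nixos.org/rolling/${rev}/pkgs/${name}/default.nix";
in
  # Nothing is downloaded until an attribute of the resulting set (e.g. .foo)
  # is actually evaluated, because Nix evaluates lazily.
  builtins.mapAttrs
    (name: rev: basePkgs.callPackage (fetchDef name rev) { })
    index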

Maybe I can prototype such a system at some point.

This is probably confusing; I’m not the best at explaining what I’m thinking, so feel free to ask questions. Thanks!


Personally, I don’t think the “alphabetical” change would be useful. Is it just for autocompletion?

Something more reasonable, I think, would be to refactor all-packages.nix. A 21k-line, 65k-word source file is huge by any conceivable metric.

I don’t fully understand your idea yet, but I would like something like “pluggable package files”. Today, in order to add a new package to nixpkgs, we need to edit all-packages.nix and insert the package into the directory hierarchy. I am a bit averse to boilerplate coding…

It would be best if one could just drop in a new directory containing the package definition files.

The alphabetical layout can be useful on GitHub, for example. If you open pkgs/development/libraries on GitHub, it refuses to show all of the libraries because of server load. If each library were in a subdirectory based on its name, this issue wouldn’t occur.

I agree, all-packages.nix should be cleaned up.

My idea is that instead of downloading all of nixpkgs onto the system, you download a list of all of the packages. The list gives each package’s name, a path to fetch it from, and the latest commit id at which it compiles successfully. When you reference a package, Nix fetches the file at that path for that commit, downloads the package’s default.nix, and evaluates it. This would allow you, for example, to override the commit id for each package individually and pin different packages to different versions of nixpkgs.
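For example (with a made-up file name), overriding the pin for one package could just be an attribute-set merge over the downloaded list before it is used:

let
  # The downloaded list: package name -> last commit at which it built.
  index = import ./rolling-index.nix;  # e.g. { foo = "39rja4"; bar = "74dea8"; }
in
  # Pin foo back to the older commit; everything else follows the list.
  index // { foo = "74dea8"; }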

I might just need to make a prototype for it to make sense…


It sounds to me like SlackBuilds on Slackware, or the AUR: you just download a script, execute it, and voilà! It produces a tar.gz.

That’s it?

Yeah, pretty much. NixOS right now would be the equivalent of downloading the scripts for all of the AUR and only executing the ones you need. It’s wasteful.

all-packages.nix has a reason to exist: it serves as an index for packages. Like an index in a DB, it allows much faster package lookup. Without an index you have to either

a) traverse the directory tree to find the package

or b) provide the full path to the package

However, if we remove the hierarchy of directories in the pkgs/ folder and map directory names 1-to-1 to package names (the current attribute names), then we can remove the all-packages.nix index.
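As a minimal sketch, assuming a flat ./pkgs directory where each subdirectory is named after its attribute and contains a default.nix, the whole set could be generated instead of listed by hand (dependencies here still come from a base nixpkgs; a real version would use a fixpoint such as lib.makeScope so the generated packages can depend on each other):

let
  basePkgs = import <nixpkgs> { };
  inherit (basePkgs) lib;

  # One subdirectory per attribute name, e.g. ./pkgs/firefox/default.nix.
  names = builtins.attrNames
    (lib.filterAttrs (_: type: type == "directory")
      (builtins.readDir ./pkgs));
in
  # Generate the whole attribute set instead of maintaining it manually.
  lib.genAttrs names (name: basePkgs.callPackage (./pkgs + "/${name}") { })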

What consequences would there be, apart from removing the index? I can think of:

  1. A directory with 12k subdirectories will be non-displayable on GitHub. This can be solved by some clever sub-partitioning (a hierarchy like l/libX/libX11).

  2. It would require changes to Nix to be efficient.

  3. Instead of grep xxx all-packages.nix, you’d now use ls -l | grep xxx.

  4. Each package expression would now have to contain all instantiations of its package (as is done with PostgreSQL: https://github.com/NixOS/nixpkgs/blob/master/pkgs/servers/sql/postgresql/default.nix#L95-L127). Or should we create separate directories for the jre and jreHeadless packages?

  5. Package taxonomy (categorization) is lost. We could move the taxonomy into each package’s meta, or even reuse it from the AUR, Repology, or whatever else provides it. Personally, I’ve used the Nixpkgs directory structure several times to discover new packages (e.g., what logging tools are available).

That categorization often seems arbitrary and hard to follow to me anyway. I regularly find myself using something like cd nixpkgs; cd $( find . -iname '<pkgname>' ). So in my opinion a flat structure would be an advantage (with categorization/tags in the metadata).

Overall, however, I think this change wouldn’t be worth the loss of history. History wouldn’t strictly be lost, but git blame and git log <file> would become much less useful if we started mass-moving folders.

all-packages.nix has a reason to exist: it serves as an index for packages. Like an index in a DB, it allows much faster package lookup. Without an index you have to either

a) traverse the directory tree to find the package

or b) provide the full path to the package

I can’t speak for other people, but I only ever install packages by their attribute name, not by their package name.

As such, if there were an implicit mapping between the firefox I’m typing and e.g. fi.firefox, I wouldn’t care. For performance it might (might) even be better, as that would turn parsing a 65kloc file into walking down a directory tree. I’m not totally sure about that, though.
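Such an implicit mapping could be as simple as deriving the sharded path from the attribute name. A sketch, with a made-up two-letter sharding scheme and ./pkgs layout:

let
  basePkgs = import <nixpkgs> { };

  # "firefox" -> ./pkgs/fi/firefox/default.nix
  lookup = name:
    basePkgs.callPackage
      (./pkgs + "/${builtins.substring 0 2 name}/${name}") { };
in
  lookup "firefox"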

However, from my reading, the most important proposal here was not to split up all-packages.nix (that was an aside), but to always have the latest working version of each package. That would mean no longer blocking the whole channel when a core package doesn’t build, and only blocking that specific package’s version.

Which sounds like a much better design to me indeed.

That said, it would likely require quite some engineering to actually do that efficiently, without having to download the equivalent of multiple copies of nixpkgs. (I’m thinking of something like the auto-nix-build FUSE filesystem by edolstra, but for auto-downloading, maybe?)

And then, with such an auto-downloading filesystem comes the problem of downloads. Maybe it could make sense to keep the core packages in a nixpkgs-like block (one that would be as restricted as possible and always green), but I’m not 100% sure whether that would be needed. In any case, what would be important is being able to, e.g., rebuild the configuration to add a proxy without internet access.

Just my 2¢

Oh, and another potential issue that comes to mind right now: sometimes, even though Nix tries to be pure, it can’t really be, because components do interact (for example: SMTP and IMAP servers, nginx and the thing it reverse-proxies, etc.).

These interactions are currently covered by the NixOS tests. Whatever one thinks of the NixOS tests, they have the huge merit of existing, and I can’t see how they could still exist in a scheme where packages are upgraded independently.

However, from my reading, the most important proposal here was not to split up all-packages.nix (that was an aside), but to always have the latest working version of each package. That would mean no longer blocking the whole channel when a core package doesn’t build, and only blocking that specific package’s version.

So, something like this:

  • publish it in the “rolling” channel

  • the “rolling” channel now consists of 45+K packages for different architectures. Those should be formatted in some nice way, for example:

packageAttrName = {
  name = …;
  out = …;
  version = …;
  nixpkgs_rev = …;
  system = …;
};

or whatever format is understandable by nix-env.


How do you prevent a chain from breaking?
Example:
Package A requires Package B.
Package B is updated and builds successfully.
Package A breaks with the updated Package B.

Do you now have two Package B versions?

Yeah, I guess that would be the solution. It can be optimized in some cases. For example, if a package’s definition didn’t change but it suddenly broke, it’s safe to assume that a dependency is too new and an older dependency can be used instead. Or, if a package definition did change and something broke, maybe Hydra could try rebuilding it with the last-working dependencies; if that fixes the problem, Hydra can set a flag so the old dependencies are kept. Nix already supports multiple installs of the same program, so why not use this for dependencies?

Eventually, maybe Nixpkgs’s dependency system could change a little and handle things such as locking certain dependencies to certain commits. Something like this:

buildInputs = [
  (lib.withVersion pkgs.foo "946bk3")
  (lib.withVersion pkgs.bar "536br0")
];

Or, alternatively, it could use the version attribute that many packages already define:

buildInputs = [
  (lib.withVersion pkgs.foo "2.5.0")
  (lib.withVersion pkgs.bar "21.0.6")
];
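To be clear, lib.withVersion doesn’t exist today. The commit-based variant could be approximated with something like the sketch below, although it has to take the attribute name as a string (a derivation doesn’t know which attribute it came from) and it downloads a full nixpkgs tarball per pinned revision, which is exactly the overhead the index idea is trying to avoid:

let
  # Hypothetical helper: evaluate one attribute from nixpkgs pinned at `rev`.
  withVersion = attrName: rev:
    (import (builtins.fetchTarball
      "https://github.com/NixOS/nixpkgs/archive/${rev}.tar.gz") { }).${attrName};
in
{
  buildInputs = [
    (withVersion "foo" "946bk3")
    (withVersion "bar" "536br0")
  ];
}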

Good point.

But there are other “indexes” in Nixpkgs already, namely the Haskell (and other language) package sets. In fact, some of those expressions are written directly in such files, without a dedicated directory; perl-packages.nix is a typical example.

We already have a ‘distributed index’; why not distribute the all-packages one?

So, something like this:
[snip]

Yes, exactly this! I wonder how much of this could be done today with current nix+hydra, and how far it could be automated.

I encourage you all to participate in https://github.com/nix-community/NUR, as it’s a sandbox to experiment with these ideas. There are multiple ways this could evolve, but I see a potential future where nixpkgs is stripped down to a core set of packages and external package sets can extend it independently.
