Nixpkgs's current development workflow is not sustainable

There are two other suggestions that can help in the short term:

  • As suggested above, add your packages, or major packages that would exercise similar code paths, to passthru.tests in your major dependencies, especially if that dependency doesn’t have any dependents in its tests at all. This helps highlight when seemingly innocuous version bumps affect downstream packages (see the sketch right after this list).
  • Review your package’s dependency closure and see if there are any unused inputs. The fewer transitive dependencies your package has, the less it will be affected by breakages. This also helps the packages you remove: the fewer transitive dependents they have, the smaller the fallout when they break.
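
A minimal sketch of the first suggestion, assuming a hypothetical Python library `libfoo` with a downstream application `barapp` (both names are made up; the wiring follows the usual python-modules style):

```nix
# pkgs/development/python-modules/libfoo/default.nix (hypothetical)
{ lib, buildPythonPackage, fetchPypi, barapp }:

buildPythonPackage rec {
  pname = "libfoo";
  version = "1.2.3";

  src = fetchPypi {
    inherit pname version;
    hash = lib.fakeHash; # placeholder
  };

  # Build a representative downstream package whenever libfoo changes, e.g. via
  # `nix-build -A python3Packages.libfoo.tests`, so a seemingly harmless bump
  # that breaks barapp shows up during review instead of weeks later.
  passthru.tests = {
    inherit barapp;
  };
}
```

With something like this in place, ofborg and nixpkgs-review should pick the test up automatically whenever the dependency is touched.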

These suggestions don’t address the fundamental issues, and I’m not familiar enough with the Python / ML ecosystem to tell how effective they will be in your situation, but they could help.

3 Likes

Forgot to add: passthru.tests should also, at least in theory, get you notified of incoming breakage from PRs, according to https://github.com/NixOS/rfcs/blob/8bb86f8bddd98deb3c03c5698d5eff0b9072d0a7/rfcs/0088-nixpkgs-breaking-change-policy.md#procedure, so you can either suggest fixes to the incoming PR or have a fix PR ready to merge immediately after it lands.

2 Likes

Incidentally, does anyone know if there are plans to automate / bolster those package break notifications? I use a lot of niche packages / lower-tier platforms that shouldn’t block PRs from being merged, but I would love to be notified when a PR breaks them so I can fix them immediately instead of only finding out when I try to update my pins.

2 Likes

I ended up rolling my own solution for this problem (GitHub - samuela/nixpkgs-upkeep: Auto-updates and CI 🤖 for Nixpkgs) which has been great so far. I don’t know how other maintainers survive without these notifications…

1 Like

I do genuinely appreciate this suggestion from you and @bjornfor. It’s not something I had considered previously. But it is also exactly the kind of thing that should be covered by nixpkgs-review / CI anyhow… Part of being a responsible committer is testing the downstream effects of your changes before merging.

It may be the only practical solution atm, but it just feels a bit more like a band-aid than a systemic fix IMHO.

1 Like

Yeah I saw that, that looks awesome! I’ll try it out and see if I can add all the packages I use to it.

Then I don’t have any other idea than massively reducing the size of pythonPackages.

None that I know of, but if we don’t do that right, a temporary build failure of a core package could send thousands of emails.

That works for packages with fewer than a few hundred reverse dependencies, but not for packages with sometimes thousands of reverse dependencies.
If a core package like requests gets updated, we carefully read the changelog for major breaking changes and, if there are none, target staging and hope for the best. There is no other viable way to get this done in a meaningful amount of time. We just cannot guarantee that everything down the entire dependency chain will 100% continue to function. Something 20 packages downstream can always pin requests for no good reason and break.

2 Likes

Also, if we enforced this, no one would willingly maintain any package that has more than a few hundred reverse dependencies. In addition, most maintainers don’t “agree” to their reverse dependencies. The package just gets used, and with one commit you can suddenly become responsible for 5000 packages if some other deeply integrated package starts to depend on yours.

7 Likes

As discussed above, this is exactly the sort of problem that merge trains solve. Why not just use a merge train?

2 Likes

I don’t see how a merge train would solve our problems. The typical PR to master is mostly unrelated to the ones merged next to it, so the chance of breakage there is (very) low. A merge train for those PRs would mostly exercise ofborg and GitHub Actions, and it wouldn’t solve the problems you described in this post.

The typical PR to staging, on the other hand, rebuilds over 2500 (5000) packages. I don’t think it is realistic or feasible to rebuild all changed packages with every PR to staging. If we used the current CI system, breakages would only be discovered by later PRs, and the big rebuilds would need to be grouped at the start, which is not really a good solution either. And right now we do one big PR with hundreds of commits, which also wouldn’t work too well.

One takeaway we could act on is that, for python-updates branches, all packages that were touched would need to build. I think that is a realistic target.

So what am I missing? Should we rebuild every changed package with every commit/PR?

2 Likes

What about codeowners? That would probably be what you wanted to achieve with your PR. Why didn’t we think about this earlier?

I don’t know how other maintainers survive without these notifications…

PRs targeting master should go from green to green, as it’s reasonable to expect most people to be able to build ~1-20 packages on consumer hardware.

The main entry point for a lot of breakages is the staging workflow. Early on (and still today) I just did something like GitHub - jonringer/nix-test-staging-next: Nix file used to validate `staging-next` in Nixpkgs, where I just have a giant list of all of the packages I care about. After about 24-48 hours of the staging-next PR being open, the cache has most of the annoying-to-build packages already populated.
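
For reference, a rough sketch of what such a file can look like (this is not the actual contents of that repo; the `../nixpkgs` path and the package names are just placeholders):

```nix
# test-staging.nix — build everything I care about against a staging-next checkout
let
  # assumption: ../nixpkgs is a local clone with the staging-next branch checked out
  pkgs = import ../nixpkgs { };
in
{
  # the "giant list" of packages to validate; run with `nix-build test-staging.nix`
  inherit (pkgs) ffmpeg nixpkgs-review;

  python-env = pkgs.python3.withPackages (ps: with ps; [
    requests
    numpy
  ]);
}
```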

The python-updates workflow gets merged into staging (although I think the beginning of a staging-next round would be best for freshness), so breakages that it introduces should also be apparent in staging-next.

1 Like

So I know this will not fly well and may be a pita but… why not get rid of most language-specific package sets… we can keep the tooling that makes it easy to package a high-level package, but not bring in all the libraries. If someone wants to maintain a package set for a language, let’s do it out of tree.

And yes, that means in theory we could have separate dependency trees for two packages that use Python with two different versions of the same lib. So be it.
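
For what it’s worth, that kind of divergence is roughly what the existing override machinery already allows; a hedged sketch with a hypothetical library `somelib` (attribute name, version, and hash are all made up for illustration):

```nix
# Two Python environments, each with a different version of the same library.
let
  pkgs = import <nixpkgs> { };

  pythonOld = pkgs.python3.override {
    packageOverrides = self: super: {
      somelib = super.somelib.overridePythonAttrs (old: rec {
        version = "1.0.0";
        src = pkgs.fetchPypi {
          pname = "somelib";
          inherit version;
          hash = pkgs.lib.fakeHash; # placeholder
        };
      });
    };
  };
in
{
  app-a-env = pkgs.python3.withPackages (ps: [ ps.somelib ]); # default somelib
  app-b-env = pythonOld.withPackages (ps: [ ps.somelib ]);    # pinned older somelib
}
```

The trade-off is that each such divergence is a separate closure that Hydra never builds and nobody else tests.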

I do think that this is the point at which mixing language package management and system-level package management breaks down. There may be a world in which it is easier than today; the Python ecosystem is, after all, quite painful at the language library management level. But this is the point where it fundamentally breaks.

You need e.g. Python applications to package numerous other applications. These would have to be removed as well, then. IoW, we’d stop being a software distribution.

9 Likes

Creating a package set where all packages are compatible with each other is a lot of work, but it also provides a lot of value. See Haskell for all: The golden rule of software distributions for more context; it’s Haskell-centric, but it applies to any language ecosystem where dependency versions are resolved using constraints (which, as far as I know, is all the major ones except C/C++ and Go).
If the Python nixpkgs community finds value in the set of Python libraries we have, both for packaging Python binaries in nixpkgs and for building their own Python code, I don’t think we (as in the broader nixpkgs community) should take that away.

8 Likes

[RFC 0109] Nixpkgs Generated Code Policy by Ericson2314 · Pull Request #109 · NixOS/rfcs · GitHub has now been made more conservative / less ambitious, to the point where I don’t think there would be any objections nor a need to coordinate changes to Nix or Hydra themselves, yet it could still be a significant step towards making Nixpkgs easier to maintain.

I urge you all to give it (another) look!

6 Likes