Nixpkgs's current development workflow is not sustainable

Yes, absolutely! To clarify: there will absolutely need to be some form of blacklisting packages. But that ought to be straightforward. The vast majority of packages are well behaved.

Why not?

2 Likes

Because even before the fact that «Always Green» will indeed grind Nixpkgs progress to a halt, you won’t come close to clearing the reliability bar where «red is don’t merge» is a viable norm.

1 Like

Just to understand: There are no downsides coming with bors, or some other merge train, right? It doesn’t incur substantially more CI runs, it just prevents accidental semantic merge conflicts by running the CI on the branch-soon-to-be-merged, and not the current branch? In that case it can’t hurt to just use it, regardless of the “always green” debate.

3 Likes

Well, the question then becomes under what condition to promote/revert the batch…

You could as well say that master/unstable-channel is already a merge train setup in the weakest definition.

4 Likes

For starters, all checks that ofborg does (including those triggered by PR reviewers).

2 Likes

Some people (including me) would actually love to see hydra bumping nixos-unstable being replaced by a bors-like system.

There is basically no drawback (no commit will appear in unstable unless hydra passes anyway, the biggest drawback I can see would be that nixos-unstable-small would no longer exist). And it would mean that all PRs ready for merging would actually land even if another PR were to break hydra, thanks to bors’ dichotomy-based bad PR identification strategy. This incidentally means that the need for nixos-unstable-small would probably just go away anyway, as I expect such a hydra to move much faster as it’d basically have automated rollback.

Adding other checks would probably make sense later on, but as a first step it seems to me like this would be a net win, plus it’ll prepare the infrastructure for when we will actually want to implement more checks later.

The one big issue is actually infrastructure, as I seem to understand plugging into hydra is not actually easy, and we probably don’t want to be rebuilding all these things twice. But maybe building enough to check hydra would be green is not actually that expensive and it could reasonably be done outside hydra?

11 Likes

For starters, all checks that ofborg does (including those triggered by PR reviewers).

Note that triggered builds failing is reported as gray/«neutral» in the checks.

2 Likes

This doesn’t solve the root problem for the distro as a whole, but as an individual, what I do for a few things is:

  1. Stay on NixOS YYMM for my packages
  2. Use overlays to selectively bump versions of packages I care about when I need something newer
  3. Go through the 2x a year batch fix process when bumping NixOS versions, which usually turns out to be less painful than expected, because the community does a great job coming together for Zero Hydra Failures.

That’s a little more manageable than bumping and potentially fixing every day. But a nice thing about nix is that you can try the bump for a pkg build, and it if fails, just keep using the old channel / nixpkgs commit that you know works.

8 Likes

This incidentally means that the need for nixos-unstable-small would probably just go away anyway, as I expect such a hydra to move much faster as it’d basically have automated rollback.

No, because whatever you do, a security stdenv rebuild takes more of a forever for full nixos-unstable than for nixos-unstable-small.

(Some people also value nixpkgs-unstable as separate from nixos-unstable for yet other uses)

2 Likes

That’s a little more manageable than bumping and potentially fixing every day. But a nice thing about nix is that you can try the bump for a pkg build, and it if fails, just keep using the old channel / nixpkgs commit that you know works.

Worse, you can even have a single environment with packages imported from multiple different checkouts (requires care, there are some catches with plugins etc.)

2 Likes

While I agree with you in principle, nixos-unstable-small has been red for days over the last month https://status.nixos.org/ ; which is probably (hydra doesn’t easily give the build time for an eval) worse than the situation where all nixos-unstable tests would need to be performed. As for nixpkgs-unstable, TBH seeing how it’s more red than nixos-unstable while it’s supposed to have fewer tests run, I’m even less sure it’s a good thing to even keep it.

Anyway, there definitely will be use cases that would be less well-served after such a change (https://xkcd.com/1172/ comes to mind), but I don’t think security fixes would be one of them, because if master is red then security updates currently get delayed forever.

On the other hand, IIRC rust-lang’s bors has a “high-priority merge” command that allows attempting a PR merge straight from master by pausing all non-high-priority merges (so if one fails there’s less dichotomy to do), which would likely bring security updates faster than current nixos-unstable-small on average, and not slower.

9 Likes

My primary issue with breakages due to staging-next changes is that it doesn’t even feel like there’s an effort made to involve the maintainers currently. It’s very common as a package maintainer to discover a staging breakage only hours/days after it gets merged to master, not before that. Why? In cases where this happened to my packages, Hydra was clearly showing the breakage before the merge, but there was no attempt made to ping me and give me a heads up. We just merge broken stuff and fix breakages one by one after the fact when users notice. Of course, we likely can’t block on maintainers fixing new breakages while keeping the current velocity, but I suspect giving a 5-7 days heads up before merge would significantly improve the state of master.

But I suspect that many of the problems mentioned here really aren’t about staging but more about the mass-python-module-upgrades that regularly happen there. I maintain a few Python modules (not as much as I used to…) and frankly I just don’t understand how the Python packages set is maintained in nixpkgs. Someone will randomly bump the version of my package as part of a bulk update that goes in staging, without going through the standard review process, without asking for my review as the maintainer of the package, and usually without even checking that it still works. It’s incredibly demotivating when you feel like you do a careful job as a Python module maintainer (making sure you don’t break your dependencies, etc.) but then someone just goes and breaks your work without even talking to you. I don’t understand why we bulk-edit anything that has a maintainer without giving them a few days/weeks to do the work themselves.

Finally, there’s also a vicious circle at play: staging is so regularly broken that I just hate having to do any kind of work there. If it was only having to do a lot of rebuilds, fine, whatever, I’ll waste a bit of compute power. But usually it’s 6h of rebuilds only to notice that something completely obvious has been broken for 3 days and wasn’t reverted, blocking you from doing any kind of work on staging. I suspect that this could be improved by more reverting of broken or semi-broken commits on staging, but I don’t have the experience there to judge solutions.

24 Likes

@delroth, I couldn’t have said it better myself. 100%.

To give just one example from earlier today:

  1. Person A submits pull request (targeting master!) affecting a package I maintain.
  2. I request changes on the content of the PR, and point out build failures.
  3. Person A responds (literally) “I am lazy and normally no one takes a look anyway.”
  4. Person A ignores my requested changes, merges their own PR on my package without my consent (or any other approval reviews!) and before CI has completed.

It’s enough to make anyone go crazy. Why be a maintainer if people are just going to bulldoze my work anyways?

12 Likes

@Sandro removed the changes which caused the most rebuilds from the pr before merging Python fixes by SuperSandro2000 · Pull Request #169681 · NixOS/nixpkgs · GitHub

I wasn’t aware that changes had been removed before merging. Also, I was intending to not call out the exact PR/individual, but yes that was the PR.

2 Likes

I guess there is also «a specific named thing is a problem, a number of things is a statistical measure» effect. To get an answer when breakages to specific things are removed from a PR, it might help to make requests about the specific things before naming or not naming aggregate counts.

I think that the community ought to consider packaging and pinning as separable tasks with separate workflows.

Correct me if I’m mistaken (?) but I feel that currently there’s an (implicit?) rule/guideline that there should be one version of every package. However this seems to lead to PRs that cause a lot of contention.

For example, take my recent PR bumping pytorch-bin from 1.10.0 to 1.11.0: pytorch-bin: 1.10.0 -> 1.11.0 by rehno-lindeque · Pull Request #164712 · NixOS/nixpkgs · GitHub

I’m an infrequent contributor to nixpkgs, and to be honest when I do contribute I’m kind of slow and can’t keep up with the rapid changes:

  • It’s pretty easy for me to contribute new/bump existing packages. E.g. adding pytorch-bin_1_11_0 when pytorch-bin = pytorch-bin_1_10_0; already exists
  • But you may not want me to do pinning because:
    A. I don’t really know much about some other packageset and how its build systems work. It takes a sizable effort to get up to speed.
    B. I’m just never really going to have a huge amount of motivation to dive into the downstream effects of my change that I’ll never experience first hand.
    C. Even when I do make the effort to make sure everything keeps building consistently I might not have enough experience to do a good job of testing and knowing what I’m missing.

The current situation seems to me leads to huge PRs which has the following problems:

  • long comment threads and lots of context that you have to read through
  • much of that context is already stale by the time you read it
  • there’s a cadence mismatch between authors, maintainers and reviewers - different people work at different times etc
  • large amounts of back-and-forth and hopping between tasks leads to mental fatigue
  • nixpkgs itself is a moving target which changes frequently

Instead I’d much rather contribute separate granular PRs for pytorch-bin_1_11_0, torchvision-bin_1_11_0, torchaudio-bin_1_11_0, etc. Then, finally, when all that work is done it’s possible to make a much more focussed effort to roll forward the entire pytorch based ecosystem, involving many more stakeholders.

(I know this “kicks the can down the road” so to speak, and I’m an infrequent contributor as I said. However, I personally think that you eat an elephant one bite at a time.)

9 Likes

Because there are so many people involved that you can’t even ping them in one GitHub comment.

Which only happen because people don’t bump their maintained modules and way to many python packages are not even maintained by the maintainer.

First question is: How long was the update out before it got bumped? If it was out for multiple weeks then you’re not very fast updating your packages. Second: Feel free to do PRs against python-updates and link them in the PR or on matrix. Even a revert is fine if the update is just breaking to much yet and the ecosystem still needs to caught up. Third: For making sure the package still works: Make sure you are executing tests and have a pythonImportsCheck. If the Pypi tarball ships no tests just fetch straight from source. python-updates is being build on hydra and regressions are checked and being taken care of if they affect more than one or two packages.

There are some updates which just need to happen and you need to see what is broken and fix it afterwards. For example when glibc, binutils or gcc is updated. You can’t revert those updates just because some library is broken.


Yes, like any PR should that is not rebuilding 500+ packages. I moved the pkgconfig adoption out of it because based on a quick grep I thought that is not going to rebuild that much which didn’t turned out differently.

of packages that where already broken on master and which I didn’t touch in the PR. We cannot expect people to fix per-existing failures of reverse deps.

Yeah, I moved pkgconfig from buildInputs to nativeBuildInputs because I was in the process of adopting pkgconfig and enabling tests on which where disabled with a comment to enable them soon on the next update over a year ago. First I wanted to fix reverse deps but then I went into a rabbit hole of not very well maintained packages which where broken in multiple ways and gave up.


Yes, because python is limited in that way and having different versions of the same package in the same environment causes massive issues which are hard to detect. e.g. packages end up empty because Python thinks they are already installed which technically they are.
Quick fix if you have an application without reverse deps: Move it out of pythonPackages and use older versions there.

1 Like

I think this is exactly to @delroth and @rehno’s point though: We need more stability, and those are exactly the kinds of packages that we ought to just have multiple versions of. We (NixOS/cuda-maintainers) have multiple versions of cudaPackages and it works great.

4 Likes

I’ve been waiting for over two years for this day. I can help solve the discoverability, and we make it better than anything pre-flake packages could’ve dreamed of.

I’ve opened up a separate discussion to talk about discoverability

4 Likes