Nixpkgs's current development workflow is not sustainable

I don’t know. Maybe someone would if they knew this was needed. Maybe we should mark the package as broken the next time it breaks, and require a new maintainer to step up before it can be marked unbroken again.

There are lots of unknowns, and we don’t really know what a good solution would look like.

I also doubt most people will look through all the Python packages they use, update them, and add themselves as maintainers, like I recently did.


My suggestion would be to start with a script that finds packages which have been failing to build for a long time and marks them broken, and then removes the ones which have been marked broken for a long time. Other than that, we can only encourage people to better maintain the things they care about and not rage when something is broken in master.
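To make the first half of that concrete, here is a minimal sketch (with a made-up package name) of what such a script would actually change: flipping meta.broken on a derivation, which makes evaluation refuse the package unless the user explicitly allows broken packages.

```nix
# broken-example.nix: a minimal, hypothetical package used only to
# illustrate meta.broken. Build with `nix-build broken-example.nix`;
# evaluation refuses with "marked as broken" unless allowBroken is set.
with import <nixpkgs> { };

stdenv.mkDerivation {
  pname = "always-failing-example";   # placeholder name
  version = "0.1";

  dontUnpack = true;
  installPhase = "mkdir -p $out";

  meta = {
    description = "Placeholder package illustrating meta.broken";
    # This is the flag such a script would flip for packages that have
    # been failing to build on Hydra for a long time.
    broken = true;
  };
}
```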

10 Likes

Half the Python packages I “maintain” are usually from bumping an existing package that added yet-another-dependency™.

6 Likes

I would be curious to know more details about the problems @samuela is facing. Maybe that would help us come up with more targeted solutions.

1 Like

Allowing for automated control of flake inputs via some mechanism is something I’ve been iterating on. Some ideas being tossed around are allowing some limited computation (fromTOML, fromJSON), programmatic update of the inputs (via some tricks involving self-reflection), and allowing full control over attributes via the url, then exposing that nicely at the use sites. I’m not sure how that scales up to something as large as Nixpkgs, though; it needs some testing and experimentation.

Once I have a better understanding of what should and shouldn’t work, I plan to write an RFC for it.

6 Likes

Packages that I rely on, directly or indirectly, that have been indiscriminately broken in the last two weeks:

  • jaxlibWithCuda (essential for using JAX)
  • tensorflowWithCuda
  • pytorchWithCuda
  • tensorflow-bin
  • cudaPackages.nccl
  • magma
  • sentry-sdk
  • wandb
  • dm-haiku
  • tensorflow-datasets
  • flax
  • cupy
  • optax
  • augmax
  • aws-sdk-cpp
  • google-cloud-cpp

and those are just the ones off the top of my head!

I love Nix but I just can’t live like this. What’s the point of contributing to nixpkgs if someone is just going to come along and break everything I’ve built anyway?

9 Likes

Can you explain to me what the merge train you envision would entail? I don’t believe it would be able to solve the problem you want it to. We simply lack the resources to build every commit of nixpkgs for rebuild-heavy changes, and even for smaller changes, checking all reverse dependencies in CI is not done at the moment (and would be difficult, I think). Simple changes can go to master and are easy to test, but they are hard to judge automatically due to the diversity of nixpkgs: What about this ofborg timeout? Is it supposed to fail (to evaluate) on this platform? …

One area it could help with is the eval checks ofborg does, which always need to succeed. They are quite an annoyance if you fix up something trivial and want to merge right after, as they take a good 20 minutes to complete. A merge train could eliminate the need to check back after half an hour here. As far as I know, @domenkozar has applied for GitHub’s merge train beta feature with this use case in mind, but I don’t know what the state of the application is.

We do, via the channel-blocking jobs (e.g. nixos:trunk-combined:tested and nixpkgs:trunk:unstable). Usually, merging staging-next is delayed at least until nixpkgs:trunk:unstable succeeds.

Of course this is sometimes ignored, since otherwise necessary or even security- or time-critical changes would be impossible to make. The “Always Green” mantra is a recurring topic in the community, with the inevitable conclusion being that it would paralyze nixpkgs development. Our workflow of testing (especially risky) changes to a reasonable extent and then resolving unforeseen regressions on a development branch (either staging-next or master (!)) is, I think, a necessary compromise. Our energy should probably be spent optimizing this process.

I think this is actually the key issue. Hydra used to send out emails to package maintainers for failing builds, but due to the number of jobsets this caused unnecessary noise. I think we need a proper replacement for this. For staging-next and haskell-updates, report-generating scripts have been invented as a remedy, but we are still lacking a universal solution.

What would be great is some sort of maintainer dashboard, showing the build status of maintained packages in the various jobsets on Hydra (perhaps related to the corresponding branches of nixpkgs). This could be integrated into Hydra, which already has a customizable dashboard feature and all the necessary data, or implemented separately (although querying the Hydra API is always problematic).

After splitting up nixpkgs into multiple flakes (per package or package group), I can see two outcomes:

  1. If we force consistent flake versions across all inputs (i.e. override the input flakes’ lock files using follows, as in the sketch after this list), it’ll become an unmaintainable mess where making any change may break the flakes provided by any number of different repositories. Making a nontrivial change then involves making the change, figuring out which repositories are affected, preparing PRs, waiting for them to be reviewed and merged, and eventually updating the nixpkgs lock file.
  2. If we accept all input flakes’ lock files, we would probably end up with as many versions of glibc and the whole userland as there are packages. This would mean an explosion in the size of any nixpkgs config or Nix user environment, and Nix is already space-hungry.
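For reference, the follows-based pinning from point 1 looks roughly like this in a flake.nix; the input name some-package-set and its URL are made up:

```nix
{
  # Hypothetical flake.nix for one of the split-out package sets.
  description = "Sketch of forcing a shared nixpkgs via `follows`";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

    # A made-up per-package-group flake; `follows` overrides the nixpkgs
    # pinned in its own lock file with ours, keeping one userland but
    # coupling every repository to our choice of nixpkgs revision.
    some-package-set.url = "github:example/some-package-set";
    some-package-set.inputs.nixpkgs.follows = "nixpkgs";
  };

  outputs = { self, nixpkgs, some-package-set }: {
    # Re-export or compose packages from the inputs here.
  };
}
```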

So I don’t see how flakes could do anything for us but make our predicament worse.

We are in no position to be able to do this. Marking packages as broken is alright, as it just saves users the time it takes to find this out for themselves. Removing packages confidently is not possible for us, in my opinion, as we have absolutely no clue about the downstream usage of packages: Nix has no sort of telemetry feature that lets us get some idea about downstream usage.

My experience has been that GitHub activity is not necessarily reflective of actual usage of a package. I have become more cautious w.r.t. removing stuff from nixpkgs after inadvertently breaking someone’s workflow by removing something I was sure nobody would miss.

12 Likes

Then we should definitely mark more things as broken that have been failing to build for a while and are definitely not easy to fix. Fewer things to worry about would be great.

6 Likes

As far as I understand, the idea is to check that a batch of PRs taken together does not break too much. That might save on some rebuilds.

If a package has been marked as broken continuously for a significant time, then using it already required pretty careful overriding, and moving the package to a local codebase is not as much of a change as it would be for packages just imported from Nixpkgs.
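To illustrate what that “pretty careful overriding” tends to look like in practice, here is a sketch; somePackage and the patch file are placeholders:

```nix
# A sketch of consuming a long-broken package anyway; `somePackage`
# and ./local-fix.patch are placeholders. Real cases usually need this
# kind of patching, which is why moving the expression into a local
# codebase is often the smaller step.
with import <nixpkgs> {
  # Accept derivations whose meta.broken is true.
  config.allowBroken = true;
};

somePackage.overrideAttrs (old: {
  # For example, carry an upstream fix that nixpkgs does not have yet.
  patches = (old.patches or [ ]) ++ [ ./local-fix.patch ];
})
```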

4 Likes

A lot of the packages you are listing are unfree or have unfree dependencies, which may explain why you are being hit by regressions excessively hard: we have difficulty providing CI for unfree packages.
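For anyone hitting this: those packages only evaluate after an explicit local opt-in like the one below, and that is exactly the opt-in Hydra cannot make globally, so they never get the same CI coverage as free packages. (The narrower predicate and its package names are only illustrative.)

```nix
# ~/.config/nixpkgs/config.nix (or the nixpkgs.config NixOS option).
{
  # Blanket opt-in to unfree packages:
  allowUnfree = true;

  # Or a narrower opt-in, listing only what you actually need
  # (illustrative names):
  # allowUnfreePredicate = pkg:
  #   builtins.elem (pkg.pname or pkg.name) [ "cudatoolkit" "cudnn" ];
}
```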

10 Likes

I’d go even further than this. Because issues will be harder to observe, summarise (and, I’d say, understand), integration between the components will become even more difficult, while the components themselves will be free to move faster, i.e. to require even more integration work.

The end result will be nixpkgs and NixOS fracturing into irreconcilable shards where all further contributions are either even more work or even more risky than they are now.

To get a real benefit from splitting nixpkgs up (using flakes or any other mechanism) you need abstractions by which the resulting pieces can interact with each other. In other words, compartmentalization is not abstraction.

Rich Hickey’s “Simple Made Easy” (https://www.youtube.com/watch?v=SxdOUGdseq4) is a great talk on this subject. I don’t know if it offers any direct answers that help nixpkgs development but it has a lot of ideas that could perhaps be synthesized into a solution for nixpkgs development.

17 Likes

This sounds more like a licensing issue than a technical issue: if Hydra cannot build these packages at all, how can CI even report their failures?

Nix likes source code, in fact it really likes it, so providing closed-source or pseudo-open-source code is a challenge!

I didn’t know what a merge train was, but I’ve looked it up; it seems like a lot of coordination and effort.

Maybe that effort could be better used writing strongly worded letters to the companies providing pseudo-open-source projects and getting them to actually be open source… (fat chance).

@samuela, I can sympathize with your situation… but don’t give up, there must be a way around this: technical, social, political, or a combination of all of them.

In the meantime, I’ll stick to the ‘soul train’. https://www.youtube.com/watch?v=lODBVM802H8

1 Like

What is necessary in order to get CI for all packages?

I understand that there’s hesitation when working with new systems people don’t yet understand. However, how can it be that merge trains would paralyze our development when the most successful software engineering organizations in the world, known for their forward-thinking infrastructure, use them without paralyzing their own?

FWIW, I’m only proposing that we offer a merge train for those who are interested in using it. The existing staging workflow would still function as usual. This gives contributors optionality: put your changes onto a merge train to have the confidence that you’re not breaking anything, or use the traditional channels and accept that you may be the cause of breakages.

This is correct. Merge trains amortize CI cost over many PRs.

6 Likes

As stated, it is impossible. We have packages for which it is literally impossible to provide fully automated CI, as they use requireFile. I guess if there is a well-defined area of unfree packages someone wants to cover, an experimental bot like the old ofborg (only posting comments when summoned) could be provided first. Once its behaviour and usefulness have been demonstrated over time, maybe it would be accepted as another «status check» provider. (Nope, reporting a build failure as a red failure won’t be accepted, though.)
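For readers who haven’t run into it: requireFile describes a file that Nix refuses to download itself, so the user has to add it to the store by hand before the build can proceed, and no unattended CI can do that step. A minimal sketch, with a made-up file name:

```nix
# requirefile-example.nix: why fully automated CI is impossible here.
# nix-build fails with instructions telling the user to obtain the file
# themselves (e.g. behind a login or license wall) and add it manually.
with import <nixpkgs> { };

requireFile {
  name = "vendor-sdk-1.0.tar.gz";   # made-up file name
  message = ''
    Download vendor-sdk-1.0.tar.gz from the vendor's website, then add
    it to the Nix store with:
      nix-store --add-fixed sha256 vendor-sdk-1.0.tar.gz
  '';
  sha256 = lib.fakeSha256;   # placeholder; must match the real file
}
```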

You might also look at the recent discussion about changing the macOS support tier, in terms of how this would work out organisationally if it were provided.

It is plausible that any universal «Always Green» thing will break Nixpkgs processes beyond usability. That’s what people were talking about.

And Nixpkgs is different in terms of the ratio of actual «own» development to the external changes reflected into the project.

Yes, you are right: if external build capacity is found to provide opt-in merge train functionality, and the throughput situation is easily publicly viewable, its use will self-balance so that it is a pure good.

(I can see some risks if it is really hard to predict how long it will take to process a batch, but any kind of public dashboard, visibly linked, solves them.)

4 Likes

Yes, absolutely! To clarify: there will absolutely need to be some form of package blacklist. But that ought to be straightforward: the vast majority of packages are well behaved.

Why not?

2 Likes

Because even setting aside the fact that «Always Green» would indeed grind Nixpkgs progress to a halt, you won’t come close to clearing the reliability bar at which «red means don’t merge» is a viable norm.

1 Like

Just to understand: there are no downsides to bors, or some other merge train, right? It doesn’t incur substantially more CI runs; it just prevents accidental semantic merge conflicts by running the CI on the branch-soon-to-be-merged rather than the current branch? In that case it can’t hurt to just use it, regardless of the “always green” debate.

3 Likes

Well, the question then becomes under what conditions to promote or revert the batch…

You could as well say that master/unstable-channel is already a merge train setup in the weakest definition.

4 Likes

For starters, all checks that ofborg does (including those triggered by PR reviewers).

2 Likes

Some people (including me) would actually love to see Hydra bumping nixos-unstable replaced by a bors-like system.

There is basically no drawback (no commit will appear in unstable unless Hydra passes anyway; the biggest drawback I can see is that nixos-unstable-small would no longer exist). And it would mean that all PRs ready for merging would actually land even if another PR were to break Hydra, thanks to bors’ bisection-based strategy for identifying bad PRs. This incidentally means that the need for nixos-unstable-small would probably just go away anyway, as I expect such a Hydra to move much faster, since it would basically have automated rollback.

Adding other checks would probably make sense later on, but as a first step this seems to me like a net win, and it will prepare the infrastructure for when we actually want to implement more checks later.

The one big issue is actually infrastructure: as far as I understand, plugging into Hydra is not easy, and we probably don’t want to be rebuilding all these things twice. But maybe building just enough to check that Hydra would be green is not actually that expensive, and it could reasonably be done outside Hydra?

11 Likes

For starters, all checks that ofborg does (including those triggered by PR reviewers).

Note that failing triggered builds are reported as gray/«neutral» in the checks.

2 Likes