Nixpkgs's current development workflow is not sustainable

I don’t know. Maybe someone would if they knew this was needed. Maybe we should mark the package as broken the next time it breaks, and require a new maintainer to step up before it can be marked unbroken again.

There are lots of unknowns, and we don’t really know what a good solution would look like.

I also doubt most people will look through all the Python packages they use, update them, and add themselves as maintainers, like I recently did.


My suggestion would be to start with a script that finds packages which have been failing to build for a long time and marks them broken, and then removes the ones which have been marked broken for a long time. Other than that, we can only encourage people to better maintain the things they care about and not rage when something is broken in master.
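To make the first half of that concrete, here is a minimal sketch (with a made-up package name) of what such a script would actually change: flipping meta.broken on a derivation, which makes evaluation refuse the package unless the user explicitly allows broken packages.

```nix
# broken-example.nix: a minimal, hypothetical package used only to
# illustrate meta.broken. Build with `nix-build broken-example.nix`;
# evaluation refuses with "marked as broken" unless allowBroken is set.
with import <nixpkgs> { };

stdenv.mkDerivation {
  pname = "always-failing-example";   # placeholder name
  version = "0.1";

  dontUnpack = true;
  installPhase = "mkdir -p $out";

  meta = {
    description = "Placeholder package illustrating meta.broken";
    # This is the flag such a script would flip for packages that have
    # been failing to build on Hydra for a long time.
    broken = true;
  };
}
```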

10 Likes

Half the Python packages I “maintain” are usually from bumping an existing package that added yet-another-dependency™.

6 Likes

I would be curious to know more details about the problems @samuela is facing. Maybe that would help us come up with more targeted solutions.

1 Like

Allowing for automated control of flake inputs via some mechanism is something I’ve been iterating on. Some ideas being tossed around are allowing some limited computation (fromTOML, fromJSON), programmatic update of the inputs (via some tricks involving self-reflection), and allowing full control over attributes via the url, then exposing that nicely at the use sites. I’m not sure how that scales up to something as large as Nixpkgs, though; it needs some testing and experimentation.

Once I have a better understanding of what should and shouldn’t work, I plan to write an RFC for it.

6 Likes

Packages that I rely on, directly or indirectly, that have been indiscriminately broken in the last two weeks:

  • jaxlibWithCuda (essential for using JAX)
  • tensorflowWithCuda
  • pytorchWithCuda
  • tensorflow-bin
  • cudaPackages.nccl
  • magma
  • sentry-sdk
  • wandb
  • dm-haiku
  • tensorflow-datasets
  • flax
  • cupy
  • optax
  • augmax
  • aws-sdk-cpp
  • google-cloud-cpp

and those are just the ones off the top of my head!

I love Nix but I just can’t live like this. What’s the point of contributing to nixpkgs if someone is just going to come along and break everything I’ve built anyway?

9 Likes

Can you explain to me what the merge train you envision would entail? I don’t believe it would be able to solve the problem you want it to. We simply lack the resources to build every commit of nixpkgs for rebuild-heavy changes, and even for smaller changes, checking all reverse dependencies in CI is not done at the moment (and would be difficult, I think). Simple changes can go to master and are easy to test, but they are hard to judge automatically due to the diversity of nixpkgs: What about this ofborg timeout? Is it supposed to fail (to evaluate) on this platform? …

One area it could help with is the eval checks ofborg does, which always need to succeed. They are quite an annoyance if you fix up something trivial and want to merge right after, as they take a good 20 minutes to complete. A merge train could eliminate the need to check back after half an hour here. As far as I know, @domenkozar has applied for GitHub’s merge train beta feature with this use case in mind, but I don’t know what the state of the application is.

We do, via the channel-blocking jobs (e.g. nixos:trunk-combined:tested and nixpkgs:trunk:unstable). Usually, merging staging-next is delayed at least until nixpkgs:trunk:unstable succeeds.

Of course this is sometimes ignored, since otherwise necessary or even security- or time-critical changes would be impossible to make. The “Always Green” mantra is a recurring topic in the community, with the inevitable conclusion being that it would paralyze nixpkgs development. Our workflow of testing (especially risky) changes to a reasonable extent and then resolving unforeseen regressions on a development branch (either staging-next or master (!)) is, I think, a necessary compromise. Our energy should probably be spent optimizing this process.

I think this is actually the key issue. Hydra used to send out emails to package maintainers for failing builds, but due to the number of jobsets this caused unnecessary noise. I think we need a proper replacement for this. For staging-next and haskell-updates, report-generating scripts have been invented as a remedy, but we are still lacking a universal solution.

What would be great is some sort of maintainer dashboard, showing the build status of maintained packages in the various jobsets on Hydra (perhaps related to the corresponding branches of nixpkgs). This could be integrated into Hydra, which already has a customizable dashboard feature and all the necessary data, or implemented separately (although querying the Hydra API is always problematic).

After splitting up nixpkgs into multiple flakes (per package or package group), I can see two outcomes:

  1. If we force consistent flake versions across all inputs (i.e. override the input flakes’ lock files using follows, as in the sketch after this list), it’ll become an unmaintainable mess where making any change may break the flakes provided by any number of different repositories. Making a nontrivial change then involves making the change, figuring out which repositories are affected, preparing PRs, waiting for them to be reviewed and merged, and eventually updating the nixpkgs lock file.
  2. If we accept all input flakes’ lock files, we would probably end up with as many versions of glibc and the whole userland as there are packages. This would mean an explosion in the size of any nixpkgs config or Nix user environment, and Nix is already space-hungry.
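For reference, the follows-based pinning from point 1 looks roughly like this in a flake.nix; the input name some-package-set and its URL are made up:

```nix
{
  # Hypothetical flake.nix for one of the split-out package sets.
  description = "Sketch of forcing a shared nixpkgs via `follows`";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

    # A made-up per-package-group flake; `follows` overrides the nixpkgs
    # pinned in its own lock file with ours, keeping one userland but
    # coupling every repository to our choice of nixpkgs revision.
    some-package-set.url = "github:example/some-package-set";
    some-package-set.inputs.nixpkgs.follows = "nixpkgs";
  };

  outputs = { self, nixpkgs, some-package-set }: {
    # Re-export or compose packages from the inputs here.
  };
}
```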

So I don’t see how flakes could do anything for us but make our predicament worse.

We are in no position to be able to do this. Marking packages as broken is alright, as it just saves users the time it takes to find this out for themselves. Removing packages confidently is not possible for us, in my opinion, as we have absolutely no clue about the downstream usage of packages: Nix has no sort of telemetry feature that lets us get some idea about downstream usage.

My experience has been that GitHub activity is not necessarily reflective of actual usage of a package. I have become more cautious w.r.t. removing stuff from nixpkgs after inadvertently breaking someone’s workflow by removing something I was sure nobody would miss.

12 Likes

Then we should definitely mark more things as broken that have been failing to build for a while and are definitely not easy to fix. Fewer things to worry about would be great.

6 Likes

As far as I understand, the idea is to check that a batch of PRs taken together does not break too much. That might save on some rebuilds.

If a package has been marked as broken continuously for a significant time, then using it already required pretty careful overriding, and moving the package to a local codebase is not as much of a change as it would be for packages just imported from Nixpkgs.
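To illustrate what that “pretty careful overriding” tends to look like in practice, here is a sketch; somePackage and the patch file are placeholders:

```nix
# A sketch of consuming a long-broken package anyway; `somePackage`
# and ./local-fix.patch are placeholders. Real cases usually need this
# kind of patching, which is why moving the expression into a local
# codebase is often the smaller step.
with import <nixpkgs> {
  # Accept derivations whose meta.broken is true.
  config.allowBroken = true;
};

somePackage.overrideAttrs (old: {
  # For example, carry an upstream fix that nixpkgs does not have yet.
  patches = (old.patches or [ ]) ++ [ ./local-fix.patch ];
})
```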

4 Likes

A lot of the packages you are listing are unfree or have unfree dependencies, which may explain why you are being hit by regressions excessively hard: we have difficulty providing CI for unfree packages.
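For anyone hitting this: those packages only evaluate after an explicit local opt-in like the one below, and that is exactly the opt-in Hydra cannot make globally, so they never get the same CI coverage as free packages. (The narrower predicate and its package names are only illustrative.)

```nix
# ~/.config/nixpkgs/config.nix (or the nixpkgs.config NixOS option).
{
  # Blanket opt-in to unfree packages:
  allowUnfree = true;

  # Or a narrower opt-in, listing only what you actually need
  # (illustrative names):
  # allowUnfreePredicate = pkg:
  #   builtins.elem (pkg.pname or pkg.name) [ "cudatoolkit" "cudnn" ];
}
```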

10 Likes

I’d go even further than this. Because issues will be harder to observe, summarise (and, I’d say, understand), integration between the components will become even more difficult, while the components themselves will be free to move faster, i.e. to require even more integration work.

The end result will be nixpkgs and NixOS fracturing into irreconcilable shards where all further contributions are either even more work or even more risky than they are now.

To get a real benefit from splitting nixpkgs up (using flakes or any other mechanism) you need abstractions by which the resulting pieces can interact with each other. In other words, compartmentalization is not abstraction.

Rich Hickey’s “Simple Made Easy” (https://www.youtube.com/watch?v=SxdOUGdseq4) is a great talk on this subject. I don’t know if it offers any direct answers that help nixpkgs development but it has a lot of ideas that could perhaps be synthesized into a solution for nixpkgs development.

17 Likes

This sounds more like a licensing issue than a technical issue: if Hydra cannot build these packages at all, how can CI even report their failures?

Nix likes source code, in fact it really likes it, so providing closed-source or pseudo-open-source code is a challenge!

I didn’t know what a merge train was, but I’ve looked it up; it seems like a lot of coordination and effort.

Maybe that effort could be better used writing strongly worded letters to the companies providing pseudo-open-source projects and getting them to actually be open source… (fat chance).

@samuela, I can sympathize with your situation… but don’t give up, there must be a way around this: technical, social, political, or a combination of all of them.

In the meantime, I’ll stick to the ‘soul train’. https://www.youtube.com/watch?v=lODBVM802H8

1 Like

What is necessary in order to get CI for all packages?

I understand that there’s hesitation when working with new systems people don’t yet understand. However, how can it be that merge trains would paralyze our development when the most successful software engineering organizations in the world, known for their forward-thinking infrastructure, use them without paralyzing their own?

FWIW, I’m only proposing that we offer a merge train for those who are interested in using it. The existing staging workflow would still function as usual. This gives contributors optionality: put your changes onto a merge train to have the confidence that you’re not breaking anything, or use the traditional channels and accept that you may be the cause of breakages.

This is correct. Merge trains amortize CI cost over many PRs.

6 Likes

As stated, it is impossible. We have packages for which it is literally impossible to provide fully automated CI, as they use requireFile. I guess if there is a well-defined area of unfree packages someone wants to cover, an experimental bot like the old ofborg (only posting comments when summoned) could be provided first. Once its behaviour and usefulness have been demonstrated over time, maybe it would be accepted as another «status check» provider. (Nope, reporting a build failure as a red failure won’t be accepted, though.)
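For readers who haven’t run into it: requireFile describes a file that Nix refuses to download itself, so the user has to add it to the store by hand before the build can proceed, and no unattended CI can do that step. A minimal sketch, with a made-up file name:

```nix
# requirefile-example.nix: why fully automated CI is impossible here.
# nix-build fails with instructions telling the user to obtain the file
# themselves (e.g. behind a login or license wall) and add it manually.
with import <nixpkgs> { };

requireFile {
  name = "vendor-sdk-1.0.tar.gz";   # made-up file name
  message = ''
    Download vendor-sdk-1.0.tar.gz from the vendor's website, then add
    it to the Nix store with:
      nix-store --add-fixed sha256 vendor-sdk-1.0.tar.gz
  '';
  sha256 = lib.fakeSha256;   # placeholder; must match the real file
}
```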

You might also look at the recent discussion about changing the macOS support tier, in terms of how this would work out organisationally if it were provided.

It is plausible that any universal «Always Green» thing will break Nixpkgs processes beyond usability. That’s what people were talking about.

And Nixpkgs is different in terms of the ratio of actual «own» development to the external changes reflected into the project.

Yes, you are right: if external build capacity is found to provide opt-in merge train functionality, and the throughput situation is easily publicly viewable, its use will self-balance so that it is a pure good.

(I can see some risks if it is really hard to predict how long it will take to process a batch, but any kind of public dashboard, visibly linked, solves them.)

4 Likes

Yes, absolutely! To clarify: there will absolutely need to be some form of package blacklist. But that ought to be straightforward: the vast majority of packages are well behaved.

Why not?

2 Likes

Because even setting aside the fact that «Always Green» would indeed grind Nixpkgs progress to a halt, you won’t come close to clearing the reliability bar at which «red means don’t merge» is a viable norm.

1 Like

Just to understand: there are no downsides to bors, or some other merge train, right? It doesn’t incur substantially more CI runs; it just prevents accidental semantic merge conflicts by running the CI on the branch-soon-to-be-merged rather than the current branch? In that case it can’t hurt to just use it, regardless of the “always green” debate.

3 Likes

Well, the question then becomes under what conditions to promote or revert the batch…

You could as well say that master/unstable-channel is already a merge train setup in the weakest definition.

4 Likes

For starters, all checks that ofborg does (including those triggered by PR reviewers).

2 Likes

Some people (including me) would actually love to see Hydra bumping nixos-unstable replaced by a bors-like system.

There is basically no drawback (no commit will appear in unstable unless Hydra passes anyway; the biggest drawback I can see is that nixos-unstable-small would no longer exist). And it would mean that all PRs ready for merging would actually land even if another PR were to break Hydra, thanks to bors’ bisection-based strategy for identifying bad PRs. This incidentally means that the need for nixos-unstable-small would probably just go away anyway, as I expect such a Hydra to move much faster, since it would basically have automated rollback.

Adding other checks would probably make sense later on, but as a first step this seems to me like a net win, and it will prepare the infrastructure for when we actually want to implement more checks later.

The one big issue is actually infrastructure: as far as I understand, plugging into Hydra is not easy, and we probably don’t want to be rebuilding all these things twice. But maybe building just enough to check that Hydra would be green is not actually that expensive, and it could reasonably be done outside Hydra?

11 Likes

For starters, all checks that ofborg does (including those triggered by PR reviewers).

Note that failing triggered builds are reported as gray/«neutral» in the checks.

2 Likes