Nixpkgs's current development workflow is not sustainable

SergeK · April 28, 2022, 12:57pm

What’s necessary in order to get hydra to build unfree packages?
@samuela

The idealistic answer is: a separate hydra. To build the old cudatoolkit, one essentially has to curl | sh a runfile, which is a threat. To build pytorchWithCuda, one has to use nvcc in buildPhase, and later run the checkPhase that would eventually import torch: all of which invoke code originating in a blackbox binary.

E.g., your NixOS core system might follow nixos-unstable just fine, but your Python environment with scientific computing packages will follow another channel, independently
@FRidh

I think the idea of maintaining channels that target specific user-groups is worth a shot in some of its possible forms. I doubt, however, that we should encourage users to simultaneously follow multiple channels for different subsystems, if that’s what you’re suggesting: we would lose the benefit of having the pieces in sync, for which we pay with centralization. The question “which nixpkgs and overlays are you using?” would turn into “which core, which python, which haskell, and which texlive have you pinned?”. Having a single source of truth is nice both for maintainers and for users, albeit expensive.

Sandro · April 28, 2022, 2:23pm

I am really looking forward to that and I am hyped. If you need help testing or brainstorming please hit us up on matrix.

jeff-hykin · April 28, 2022, 5:56pm

What if we can have our cake and eat it too? Or in this case: modularize nixpkgs without actually splitting up nixpkgs.

I do reinforcement learning research, and a key lesson is that when two agents/actors are evolving simultaneously, neither agent can learn well. As agent#1 learns something, agent#2 “evolves” and breaks/invalidates what agent#1 just learned. But if they take turns, they can actually learn quickly and converge.

We’ve got that same situation here:

nixpkgs unstable is changing; breaking torch (or whatever package)
and torch is changing; breaking stuff in the latest nixpkgs unstable

Even if we cleverly reduce the number of inter-dependencies, it’s not going to categorically change this problem. Upstream/downstream dependencies are branching: O(b^n). Even if we change b or n, the growth is still exponential, and the inputs get bigger every year.

Goals

I can see your point now (@many-people) about how both a monorepo and fewer breakages are important.

We want:

intercompatible (all green) packages
maintainable, localized, latest, updates

Fundamental limitations

Nixpkgs unstable (or staging) cannot be both all-green AND be the source of the latest versions of everything. There’s not enough compute power in the world to recursively run all the tests on every version bump. So let nixpkgs handle our #1 (intercompatibility) and not worry about having the latest of literally everything.

On the flip side (the title of this thread), package maintainers cannot update their package every time a dependency changes. There’s not enough developer free time in the world to check every upstream change and every downstream consequence. So let individual packages handle our #2 desire: staying up to date and maintainable, without worrying about upstream changes or downstream consequences.

Eating & Having Cake

So let’s consider this: a mono-repo and a multi-repo, treating them like the two-actor coordination problem at the beginning.

The multi-repo:

Assumption: Let’s say the torch (or whatever) maintainer treats nixpkgs as frozen by pinning to a specific nixpkgs commit (like the 21.11 release).
Updates: Now that nixpkgs is not changing, it becomes realistic for a maintainer to attempt getting new versions of torch working. The ground is no longer collapsing from under them.
Flexibility: However, whenever there is a problem, like torch needing a new version of GCC, the maintainer has the flexibility to unpin and move up and down the nixpkgs timeline (like pinning to the 22.05-pre release) to make this version of torch work.
Testing: The torch maintainer effectively runs unit tests: only testing torch, without worrying about downstream breakages or the daily upstream nixpkgs-unstable changes.

The mono-repo:

Assumption: On the flip side, nixpkgs can assume that torch is stable by pinning to a version of torch (like the 1.11.0 release), and using overlays to make torch use the latest nixpkgs. This could, but doesn’t have to be done with submodules (submodules could protect against breakages from a multi-repo suddenly changing its url)
Updates: Just like the multi-repo, the “torch is stable” assumption lets nixpkgs start upgrading itself without the ground turning into liquid. Instead of packages updating randomly, imagine a nixpkgs upgrade as a wave. It begins with bottom/foundational packages like glibc or openssl. We update glibc; if nothing breaks, we create a git tag “wave-1.1”. Then we update cmake; if that doesn’t break anything downstream, we finalize a “wave-1.2” tag. Once a tag is finalized, it indicates an all-green set of packages. Eventually it’s torch’s turn to be updated and checked as part of wave-1, but only after torch’s dependencies are all-green.
Flexibility: Just like the multi-repo, once there is a problem, like glibc breaking downstream stuff, nixpkgs has the flexibility to pick any commit from a package timeline. So if glibc breaks torch, maybe the torch repo already has a fix waiting (torch has been updating itself independently). Ask torch for a “wave-1.1” version, or try the latest stable torch release. If torch is still broken, keep running tests, and file an issue/PR on the torch repo requesting “wave-1.1” support. Only once torch, and the other downstream stuff is fixed, can the wave-1.1 (glibc) be finalized.
Testing: Even if a minor wave is held up by a broken package like torch, that doesn’t mean the next major wave can’t start. Wave-2.1 starts as soon as a new glibc or other foundation package is available. All waves are progressively becoming “more green” until they’re finished. A mature wave, even if unfinished, would be pretty stable in theory. I.e., we can each customize how much stability to sacrifice in exchange for cutting-edge-ness. The monorepo acts like one giant integration test that takes months to complete, because an unfinished wave necessarily means something is either broken or untested (unstable). Waves (minor number) could be as small as a single gcc update, assuming that the update breaks literally 0 downstream packages.
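
A minimal sketch of that wave loop, assuming a hand-picked update order and a hypothetical `downstream_green` check standing in for whatever Hydra-style test run decides that nothing downstream broke. The function and parameter names are illustrative only, not an existing nixpkgs tool:

```python
def run_major_wave(updates, downstream_green, major=1):
    """Apply updates foundations-first, finalizing one tag per green minor wave.

    updates: package names in the (hand-picked) order they get bumped.
    downstream_green: hypothetical callable, True if bumping the package
        leaves every downstream package passing.
    Returns (finalized_tags, blocked_tag); blocked_tag is None when the
    whole major wave finished.
    """
    finalized = []
    for minor, pkg in enumerate(updates, start=1):
        tag = f"wave-{major}.{minor}"
        if not downstream_green(pkg):
            # The minor wave stays open until the downstream breakage is fixed.
            return finalized, tag
        finalized.append(tag)
    return finalized, None


# Toy run: glibc and cmake stay green, the torch bump breaks something.
finalized, blocked = run_major_wave(
    ["glibc", "cmake", "torch"],
    downstream_green=lambda pkg: pkg != "torch",
)
```

In this toy run `finalized` holds the tags for the glibc and cmake waves, and `blocked` names the minor wave still waiting on torch.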

This can also be done in a hierarchy, with each python package having a mini-repo, and pythonPackages being a monorepo, with nixpkgs using pythonPackages instead of individual packages.

Result

We could get bleeding-edge versions of any individual package, since the package would be using an all-green or mostly-green foundation (pretty stable for bleeding edge). The caveat is if you need multiple bleeding edge packages; they might not play nice together. But if you need everything to play nice together (and don’t want to fix bleeding edge packages yourself) then the only solution is to use the most recent all-green set of packages (which inherently takes a long time to curate and won’t be bleeding-edge).

We can’t magically keep it all green and all bleeding-edge.

Sandro · April 29, 2022, 1:42pm

I just read your pretty lengthy post 3 times and I still don’t understand it.

My conclusion is that you are describing hydra with notifications and an easier way to move along the timeline for each package.

That’s not what we want to do. We could pick between different releases, but tracking an unstable commit for a repository that has 45000 commits, and where the first page of commits is at most one day old, sounds scary and can probably only go south.

Please don’t. Building torch is expensive and blindly trying 25 commits a day to hope that some problem is fixed is not efficient.

We don’t need submodules. flakes, overlays and fetchers can do everything git can do with submodules but better integrated into nix.

They still couldn’t be imported in the same process, but if I could run two python scripts with different dependencies in the same profile without them interfering with each other, even if one is exec'ed from the other, it would be great.

I am not sure if hydra has the same problems as me, but every time I download CUDA the download times out and fails, even on a Gigabit connection. Are we maybe missing an EU mirror? Why is the downloaded blob even over 1 GB?

jeff-hykin · April 29, 2022, 4:15pm

Okay, I edited the previous post to try to address your points better. Maybe I’ll be able to make a diagram that’s both short and more clear.

Oh I 100% agree. I didn’t mean it quite that literally/strictly.

In practice I imagine a standard tag would be used; like nixpkgs looks for a nix-wave-6 tag on the torch repo.
exists? => test it
broken or not-exists? => file an issue.
Done: O(1) per package.
It would be the torch maintainer (as a member of nixpkgs) who could unpin torch and try different commits/releases. But having a nix-wave-6 tag would probably be a better way to achieve the same outcome. The torch maintainer could even just say “yeah there’s no version of pytorch that works with this set of upstream dependencies” and the wave would just permanently mark pytorch as broken on that wave.
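
The lookup above could be sketched like this, assuming the set of tags on the package repo and the test runner are both passed in. `check_wave_tag` and its arguments are hypothetical names used only for illustration:

```python
def check_wave_tag(wave_tag, repo_tags, run_tests):
    """One lookup per package: test the tag if it exists, otherwise ask for it.

    repo_tags: set of tag names present on the package repo (assumed input).
    run_tests: hypothetical callable, True if the tagged commit passes.
    """
    if wave_tag not in repo_tags:
        return "file-issue: tag missing"
    if not run_tests(wave_tag):
        return "file-issue: tag broken"
    return "green"


# Toy data: the torch repo already carries a nix-wave-6 tag that passes.
status = check_wave_tag("nix-wave-6", {"nix-wave-5", "nix-wave-6"}, lambda tag: True)
```

No search over commits happens here; nixpkgs only ever looks at the one agreed tag per package.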

The conceptual point was; nixpkgs can freeze to any specific commit of torch, without freezing the progress of torch versions for everyone.

I hate submodules, so I’m happy to avoid them haha. I only included them in case people had concerns about availability (fetchers don’t work when the host is down).

That’s not a totally inaccurate summary, but it does miss the main point

Yes, hydra would be the integration test that could auto-file issues.
And yes, easier to move along the timeline
But the real point was @samuela’s problem “The current system is simply not sustainable for downstream package maintainers, […] no prior notice, no migration plan, and no alerting of failures. How is a package maintainer expected to reliably function in this environment?”.

The multi repo + nixpkgs waves hopefully is that^ environment we could reliably function in.

Once a nixpkgs wave “hits” torch, the upstream dependencies should be frozen & green
Torch is fixed/repaired one wave at a time, instead of trying to stay green on unstable
Most importantly: breakages for a wave are no longer as urgent/stressful for a maintainer, because it’s still practical for people to get stable bleeding-edge versions from the repo directly.
Testing bleeding-edge torch releases (not a part of any wave) lets maintainers keep their sanity, while partly pre-testing for the next wave.

Yes, the same thing is true for torch pinning against nixpkgs: ideally it pins against releases or waves, but it can pin against any commit.

As a side note, I’ve been pinning against random nixpkgs commits for years. Even for big cuda+python+node+ruby+zsh+rust+cpp+docker nix-shell projects. It’s worked great. I often need very specific versions of tools and pinning is the only way to get them (I use lazamar’s tool). But even outside of that, all the time I’ll just go to github, get the latest nightly commit and use it for grabbing the bleeding edge of a single package. I’ve probably got 150+ different simultaneous nixpkgs tarballs, and my nix store is still smaller than 1 modern video game.

milahu · May 2, 2022, 8:23pm

edit: sorry, misunderstanding.
to jeff-hykin, waves flow from provider to consumer. (“top-down pinning with controlled propagation”)
to me, expensive consumers should pin their dependencies. (“bottom-up pinning”)
(i assume that only few packages are expensive consumers = long build times)

“bottom-up pinning”
“multiversion” pattern

somepkg-2 and otherpkg-2 are NOT compatible with each other
but we want to have the newest version of both

solution: make more use of the “multiversion” pattern
= maintain multiple versions of one package in one version of nixpkgs
= give more packages the “privilege”
to pin their dependencies
to fix breakages (or to prevent expensive rebuilds)

these newest versions “lead” the waves (aka dependency trees)

challenge:
python envs allow only one version per package
→ only one scope/env

compare:
node allows nested dep-trees by default
→ different nodes can use different versions of the same package
→ every node has its own scope/env

again: Allowing Multiple Versions of Python Package in PYTHONPATH/nixpkgs

the goal is to pin transitive dependencies to old versions
while allowing to import new versions in the root scope

edit: on the python side, this “one package, one version” approach is considered a feature (not a bug), as this allows passing data between libraries. when using different versions of one package, we must convert data formats at runtime, which is slow

I’d say it’s much more common than not that when two packages depend on one or more of the same packages deeper in the stack (Numpy, SciPy, Pandas, Matplotlib, Cython, Sympy, xarray, Dask, etc), there’s some form of direct or indirect data interchange, or other cross-dependency between at least one pair of them, if not most. In particular, when they are used, numpy arrays, pandas dataframes, xarray objects are routinely exchanged, and code compiled with different Cython versions (if they actually merited a hard dependency non-overlap) may well be ABI-incompatible and cause a C-level hard crash.

As such, it seems likely that this will further break as many packages as it will fix, and in ways that can be far harder to debug and recover from than a simple dependency conflict on installation. This does not really seem to be a viable solution, relative to other strategies.

SergeK · May 2, 2022, 9:41pm

node allows nested dep-trees by default

Just a brief comment, although I do not claim to have already understood you, and I definitely haven’t yet comprehended @jeff-hykin’s post: npm’s uncontrollable “explosion” of the dependency graph seems to be explicitly a situation that nixpkgs wants to avoid.

Sure there are more reasons, and other maintainers can provide many more examples, but I can at least affirm that mismatches in cuda versions brought into scope by different dependencies have been a source of issues with e.g. pytorch in nixpkgs. Now cudaPackages provide multiple versions of the same packages, but they are used in a very constrained manner: for example the downstream derivations consume the whole cuda package set, rather than individual packages, which is intended to limit the overrides to “meaningful” combinations. And they are small. And they are one of the exceptions.

jeff-hykin · May 3, 2022, 11:22pm

Here’s a diagram that will hopefully explain the original idea better
(not a response to recent messages, just me catching up)

Pink and green are the main points/purpose
Every circle is a commit (unimportant commits are not visible though)
Curved dotted lines are dependencies.
Purple/Blue = nixpkgs-depending-on-torch
Gray = torch-depending-on-nixpkgs

There’s lots of possible variation I wanted to show, but the visualization gets complicated quickly.
I’ll make more visualizations for edge cases we want to discuss. (And I talk about some of them below)

Observations and Possibilities
- The only edge case visualized is the blue dot (wave-2-3), because I thought it was important.
  
  In theory, wave-2-3 was a python update that broke torch 1.10.0.
  In this hypothetical, the easiest way to get “all-green” again was to just patch the old wave-1-5 tag (purple) and create the patched commit (the blue dot itself).
  
  This is important because it’s different from wave-2-5. Wave-2-5 is specifically trying to update torch to the latest version (1.10.3).
  (Wave-2-5 is the wave that “hits” pytorch)
- The part that says “glibc, … (and downstream fixes)”: the blue dot/wave-2-3 is an example of such a downstream fix
- Notice nixpkgs never connects (purple line) to torch 1.10.2. That’s intentional.
  (In this example, torch 1.10.2 can only be obtained from the torch repo directly)
- Torch 1.10.1, 1.10.2, 1.10.3, show all their dependencies (gray dotted lines) going to wave-1-5.
  This was visually convenient and is plausible, but is not required.
  In practice, I’d expect them to point at the latest finished wave. For example
  torch 1.10.3 => wave-2-1
  torch 1.10.2 => wave-1-10
  torch 1.10.1 => wave-1-6
  “which wave?” would be a choice of the torch maintainer
  (and it would be done based on what dependencies torch needs).
- Multiple versions, like we have for LLVM, are fine (or at least are not worse than the current situation).
  For example, torch1_10_3 and torch1_10_0 could both be in nixpkgs.
  Instead of one purple line to wave-2-5,
  there would be two purple lines,
  one pointing to the wave-2-5 tag (renamed to “wave-2-5__1.10.3”)
  and the 2nd line pointing at an older commit (with a wave-2-5__1.10.0 tag)

Sandro · May 5, 2022, 7:58pm

Keep in mind that this misses important security patches, and, for example, mixing multiple glibc versions is usually bound for trouble.

Sounds cool for a side project for people interested in data science and AI but I personally don’t see the fit for nixpkgs.

jeff-hykin · May 5, 2022, 9:53pm

I know there’s a lot of stuff you handle that keeps you busy. If you get the chance to mention why it’s a bad fit though, I do want to fix/change whatever is bad about the idea. I’m still afraid unmaintainability is going to cause the eventual death of nixpkgs, and I personally want to address that instead of just being cynical or doing one-off improvements. There have been some good ideas in this thread, but I don’t think I’ve seen an actual plan to address the original problem.

I don’t see how it would help data science/AI
maybe I should’ve used gimp as an example instead of torch.

Sandro · May 6, 2022, 1:58am

I still have lots of questions:

When do waves start?
When do they stop evolving?
What goes onto a wave when it started?
How many combinations of waves and package versions are going to be built?
What do we do with packages that cause mass rebuilds, release very often, and often break stuff, like hypothesis, which releases at least once a week and rebuilds almost everything in python packages?

jeff-hykin · May 6, 2022, 4:58pm

Thanks, those are really important problems to address, the last one especially.
@-everyone, while Sandro started these, I hope others will follow up on them

I’ll add to the question list too: how should we handle security patches to root packages?

Some visual references of what’s in my head when writing the answers:
If we look at nixpkgs as trees (roots are at the bottom, image from nix-visualize)

Waves start at some root node, and iteratively expand towards all leaf-nodes.

Major wave: wave-1, O(n) * O(minor-wave)
- i=0, updates the bottom yellow node, then starts a minor wave at that node (and waits on it)
- i=1, updates the slightly-less yellow node, then starts a minor wave at that node
Minor waves: O(n), n= number of downstream packages
- given a root node like zlib (orange, left side)
- i=1, zlib is built and tested
- i=2, libssh2 (nearest downstream) is built and tested
- …
- i=?, nix is tested
- (no more leaf nodes: end)
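
As a rough sketch of a minor wave's build order, assuming the dependency graph is available as a plain mapping. The graph below is a toy (the `curl` and `bash` entries and all edges are invented for illustration, not read from nixpkgs), and `minor_wave_order` is a hypothetical helper:

```python
from collections import deque
from graphlib import TopologicalSorter

# Toy graph: package -> set of direct dependencies (illustrative only).
DEPS = {
    "zlib": set(),
    "libssh2": {"zlib"},
    "curl": {"libssh2", "zlib"},
    "nix": {"curl"},
    "bash": set(),  # not downstream of zlib, so the wave never touches it
}


def minor_wave_order(deps, root):
    """Return root plus everything downstream of it, upstream-first."""
    # Invert the graph: package -> packages that directly depend on it.
    dependents = {pkg: set() for pkg in deps}
    for pkg, upstreams in deps.items():
        for up in upstreams:
            dependents[up].add(pkg)

    # Breadth-first walk from the root towards the leaves.
    affected = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in sorted(dependents[current]):
            if child not in affected:
                affected.add(child)
                queue.append(child)

    # Order the affected packages so each one is built after its upstreams.
    subgraph = {pkg: deps[pkg] & affected for pkg in affected}
    return list(TopologicalSorter(subgraph).static_order())


order = minor_wave_order(DEPS, "zlib")
```

The wave ends when `order` is exhausted, which matches the "no more leaf nodes" stop above.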

This ends up a little bit like this animation, big block = major wave, small block = minor wave
the idea: as big wave progresses, the small waves get smaller and smaller


Q&A

When do major waves start?
It’s a hand-picked decision: wave-X could be started whenever it’s realistic/desirable to have a new version of a root package, like a major release of llvm.
When do minor waves start? As soon as the previous minor wave is finished.
(having a wave-X-2pre is fine too if the previous minor wave hasn’t finished)
What goes onto a wave when it started?
This is definitely something I should’ve been more clear about, the “glibc, clang, (etc)” was very vague.

There’s nothing stopping wave-X-1 from updating multiple packages.
Ideally a minor wave would update a group of packages that are either
1: packages that are hyper-related
like deno’s rust crates: deno-dev, deno-core, deno-ops, deno-ast (which import each other).
or 2: packages with already-upgraded upstreams
So hypothesis would need to be upgraded on a wave before numpy, and numpy on a wave before torch. But torch and pyyaml could probably be upgraded on the same minor wave

Hyper-related groups need to be hand-crafted, which I imagine is what would be done for heavyweights like llvm, clang, glibc.
The already-upgraded-upstreams groups could be automated and, better yet, the tests for these could safely be run in parallel.
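
The automated grouping could look roughly like this, assuming a dependency mapping as input. The edges below are a toy chosen to mirror the hypothesis/numpy/torch ordering above, not real nixpkgs data, and `parallel_groups` is a hypothetical name:

```python
from graphlib import TopologicalSorter

# Toy graph: package -> set of direct dependencies (illustrative only).
DEPS = {
    "hypothesis": set(),
    "numpy": {"hypothesis"},
    "torch": {"numpy"},
    "pyyaml": set(),
}


def parallel_groups(deps):
    """Split packages into groups whose upstreams are all in earlier groups."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    groups = []
    while sorter.is_active():
        # Everything returned here has all of its upstreams already done,
        # so the members of one group can be built and tested in parallel.
        ready = sorted(sorter.get_ready())
        groups.append(ready)
        sorter.done(*ready)
    return groups


groups = parallel_groups(DEPS)
```

This places each package in the earliest group it qualifies for. Moving a package to a later group never violates the rule, which is how torch and pyyaml could end up on the same minor wave.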
What do we do with packages that cause mass rebuilds and release often?
Torch represents this kind of package (or at least I meant for it to)

We know rebuilding/testing on every torch bump is entirely impractical
So, looking 3 pictures up (the Torch vs Nixpkgs one)
Take a look at the middle of that orange line in the diagram: the torch 1.10.2 release
It’s not used by any nix-wave; it was never built on nixpkgs or integration tested.

That was made to represent torch updating too fast; nix had to skip torch versions.
Replace torch with llvm, gcc, glibc, openssl and it’s the same story. The world can’t be rebuilt every time they’re bumped, so (to keep an all-green status) versions are inevitably skipped.

The current Nixpkgs is either similar, or must not be all-green.
When do major waves stop?
It’s basically irrelevant, but it would stop by simply running out of downstream packages to update: when every package was updated exactly once.
When do minor waves stop?
It’s when all tests for downstream packages are resolved as pass/wont-fix/cant-fix
I’m sure this can be fudged/bent a bit, similar to how replace-dependency gets around the rules
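
That stop condition can be written down as a small check. This is only a sketch: the three resolved statuses come from the answer above, while the "failing" status, the sample data, and the function names are assumptions made for the example:

```python
# Statuses that count as "resolved" for the purpose of closing a minor wave.
RESOLVED = {"pass", "wont-fix", "cant-fix"}


def minor_wave_finished(results):
    """results: mapping of downstream package -> status string."""
    return all(status in RESOLVED for status in results.values())


def unresolved(results):
    """Packages still holding the minor wave open."""
    return sorted(pkg for pkg, status in results.items() if status not in RESOLVED)


# Toy data: torch is still failing, so the wave cannot be finalized yet.
results = {"numpy": "pass", "gimp": "wont-fix", "torch": "failing"}
finished = minor_wave_finished(results)
blockers = unresolved(results)
```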

Once a package is updated+rebuilt, it’s not touched again for the rest of the major wave
So, to minimize build time, we would want to lock in super heavyweights first
This is why the first 10 minor waves could take longer to test than the next 100 or 1,000

=> wave-X-1 intentionally rebuilds the world
(if we’re updating llvm, and we guarantee all-green packages, world-rebuilding seems necessary)
=> wave-X-2 is (approximately) the-world minus packages upgraded in wave-X-1
=> etc
How many combinations of waves and package versions are going to be built?
I’m not exactly sure what this is asking. If this is related to what @milahu brought up, the tl;dr is probably to use whatever current method nixpkgs uses (multiversion), but I’m still working on a full response about that.
How should we handle security patches to foundational packages?
Let’s say nixpkgs was working on wave-2-14, meaning openssl was already upgraded and locked in.
But then a major vulnerability was found in that version of openssl.

There’s at least a few options:
1: Use an existing method like replace-dependency and make all later waves just have the fix without fully testing.
(probably good for small patches)

2: It could be patched on wave-2-14, and the world could be rebuilt
(probably good for medium/big patches that could break stuff)

3: The current major wave is duplicated (including minor waves), but minor waves are marked as unfinished. This way people can use the patch while waiting on tests to catch up.
(probably good for security fixes that require major version updates)

4: Individual environments/users use overrides and pull from the bleeding-edge of the openssl repo
(probably good for 0-day vulns)

amjoseph · May 15, 2022, 6:18pm

Perhaps you meant to say that it defeats the point of NixOS’s shared binary cache.

Overlays are the reason why I use nixpkgs.

PS, this thread could also be titled “Python’s ecosystem is not sustainable”. I am very grateful to the nixpkgs maintainers who are willing to fight with it so I don’t have to. You guys are heroes.

drupol · May 24, 2022, 9:29pm

3blue1brown ! I love those videos, especially the last one!

nixinator · May 25, 2022, 10:25am

‘do not ask what python can do for you, ask what you can do for python’

‘when dependencies don’t work that’s when the fun begins’.

samuela · July 28, 2022, 10:26pm

I just ran into this all over again with others’ updates causing breakages to packages I maintain:

`python39Packages.apache-beam` build failure on x86_64-linux as of `7554374d` · Issue #183172 · NixOS/nixpkgs · GitHub
`python310Packages.jax` build failure on x86_64-linux as of `7554374d` · Issue #183173 · NixOS/nixpkgs · GitHub

What are we going to do to resolve this?

bjornfor · July 29, 2022, 9:00pm

It’s far from ideal, but how about adding some/more passthru.tests = { inherit dependee_1 dependee_2 ...; } in the dependency expressions?

ryantm · July 30, 2022, 1:01am

This tests attribute is the subject of a recently accepted RFC https://github.com/NixOS/rfcs/blob/8bb86f8bddd98deb3c03c5698d5eff0b9072d0a7/rfcs/0119-testing-conventions.md

Sandro · August 3, 2022, 12:06pm

What you can do:

Have an eye on python-updates and participate in the runs
Have an eye on staging and staging-next and fix issues that come up there
use nixos stable
use one of the many python tools outside of nixpkgs to manage your python dependencies at your own pace
don’t create an issue for every failed build if the root cause is identical or just a pinning problem
maintain all the slow moving AI/ML stuff in a separate repository

samuela · August 3, 2022, 11:25pm

If this is the expectation then it is simply not sustainable, hence the OP.