> RHEL/CentOS, Fedora, SLES, and OpenSUSE also redistribute CUDA
We would be doing a slightly different thing, which is probably fine, but someone needs to make the call. Note that for mainline Hydra, CUDA is nothing special, so the maintainers need to be able to make such calls uniformly and reliably. I think for a few years we only built non-branded Firefox on Hydra because of the minor patches for the NixOS layout of things, patches that were declared definitely fine as soon as we actually started discussing them with Mozilla. I am extrapolating from that precedent.
Speaking of nontrivial things: changing the layout during redistribution might make it verbatim redistribution of constituent parts, not verbatim redistribution of the original work in its entirety. I have no idea, and I have no desire to bear responsibility for calling it either way.
A separate collaboration of CUDA and MKL users does single out CUDA, can delegate the calls to people who have read the CUDA and MKL licenses in detail, etc.
> the CI/CD problem that hydra solves so beautifully for an entire OS
I think CUDA is large enough that you should stop thinking in terms of an entire OS. Nix means that you can have a separate branch that stabilises some set of the related heavy things, and install the development environment for your data pipeline from there, without breaking your system. A bit suboptimal, but any other solution means holding your ground against the huge churn.
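As a sketch of that branch-pinning approach (the tarball URL points at a real channel branch, but any stabilised branch would do; the package set is just an example), a `shell.nix` can import nixpkgs from a fixed source, so the heavy CUDA closure stays put while the rest of the system follows whatever channel it likes:

```nix
# shell.nix — pin the data-pipeline environment to a separate nixpkgs branch,
# independent of the system's channel.
let
  pinned = import (fetchTarball
    "https://github.com/NixOS/nixpkgs/archive/nixos-unstable.tar.gz") {
    config.allowUnfree = true;  # CUDA is unfree, so it must be allowed explicitly
  };
in
pinned.mkShell {
  buildInputs = [ pinned.cudatoolkit pinned.python3 ];
}
```

Entering the environment with `nix-shell` then gives the pipeline its own toolchain without touching the system profile; pinning a specific commit archive instead of a branch tarball makes it fully reproducible.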
I think this is largely because the maintainer test burden on reverse dependencies is high due to extremely long builds.
Note that this means that ofBorg would also time out, nobody would be surprised, and not everyone would care.
I know RFC46 is not yet accepted, and it is about platforms and not individual packages, but I like the terminology we worked on there. Trying to keep everything heavy around Python data crunching green on unstable would mean asking for Tier-2 impact on merges for things that are not really served at the Tier-2 level by existing PR testing tooling. Actually, @timokau achieves this for Sage by doing a huge amount of high-quality work and being always there to help whenever any question about Sage impact arises. In the case of Sage, though, there is a lot of actual investigation to be done.
If there are enough people interested in doing the reviews, investigations, and fixes for master PRs to the things related to the data-crunching packages, just having dedicated build capacity with unprivileged accounts for everyone in the response team could be enough. Note that you need people to iterate quickly on proposed fixes, which probably means plain `nix build` is quite useful regardless of a Hydra instance.
Of course, once you have a reputation for keeping up with debugging the relevant PRs, an explicit OK from Nvidia and a binary cache (possibly lagging a few weeks) with a significant user base, you might have an easier time convincing people to make an exception. On the other hand, at that point this exception will be a smaller improvement than it would be right now.
Either way, I guess the first steps are the same regardless of the final goal: refactor to increase the chance that redistribution is legal, try to get Nvidia's comments, and organise a debugging team.