Don’t get me wrong, I’m not a huge fan of NVIDIA either, but I don’t think telling people to fuck off is a practical solution. Whether we like it or not, a lot of people are forced to work with CUDA.
Long-term, these people don’t do much to push open-source upstreams toward OpenCL adoption.
Short-term, these are the people who throw quite a lot of compute at problems anyway; they should be able to cooperate on a CUDA-specific buildfarm.
ML people addressing either of those points would be preferable, from the point of view of non-ML Nix users, to Hydra building CUDA stuff (one because of general free-software availability, the other because it would give us a large, cooperatively operated example of an independent Hydra and binary cache, presumably with things to learn from it)…
The only solution is for the NixOS Foundation to buy NVIDIA outright, open source all their stuff, get rid of the registration and paywalls, and then just put it on Hydra… job done.
OpenCL vendor implementations are generally not open source. OpenCL is slower than CUDA. Even OpenCL’s own author, Apple, has now abandoned it in favor of Metal. OpenCL/OpenGL/<other OSS toolchain> just don’t cut it these days, especially in ML.
I feel that there’s this misconception that everyone in ML must be rich and have money/compute to blow left and right. As an ML researcher in academia, I can assure you this is far, far from the truth. I cannot afford to blow 80 CPU-hours recompiling tensorflowWithCuda every time some tiny package in its dependency tree changes. It’s just not sustainable.
I’d recommend getting tensorflowWithCuda from a pinned nixpkgs version and caching it in a local cache or Cachix. Then your group can choose if a particular update is worth paying the cost.
Yes, I do in fact pin the version in my shell.nix files but the ecosystem (jax in particular) moves so fast that I find myself frequently needing to update.
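For anyone setting this kind of pin up, here is a minimal sketch (the commit hash and sha256 are placeholders you’d fill in yourself, and I’m assuming tensorflowWithCuda is exposed under python3Packages in the pinned revision):

```nix
# shell.nix — pin nixpkgs to one exact commit so everyone rebuilds
# (or substitutes) exactly the same tensorflowWithCuda.
# <commit-hash> and the sha256 below are placeholders.
let
  pkgs = import (builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/<commit-hash>.tar.gz";
    sha256 = "0000000000000000000000000000000000000000000000000000";
  }) {
    config.allowUnfree = true; # CUDA packages are unfree
  };
in
pkgs.mkShell {
  buildInputs = [ pkgs.python3Packages.tensorflowWithCuda ];
}
```

Building the environment once and pushing the resulting store paths to a team cache (e.g. with `cachix push`) then lets everyone else in the group download instead of rebuild.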
Maybe some of the people in the CUDA thread are still building tensorflow often enough and would be willing to share?
I agree not everyone has enough compute to rebuild it every time, but I hope that with some sharing it could be solved well enough among those who need it.
Certainly, but one of the major benefits of NixOS for me is that I can release my code with very precise instructions on how to reproduce results. “Run exactly these dependencies from this commit of nixpkgs” etc etc.
The pip install route doesn’t offer me those same benefits unfortunately.
Do they actually come with CUDA, or just with libraries that are prebuilt and link to the CUDA libs?
Anyway, it has already been said that it is not a legal problem to do this with CUDA and other non-free but redistributable libraries and programs. Now it remains a problem of infrastructure and ideology, as far as I understood the GH issue when I skimmed it.
Because the ML/DL community is a pretty big market where Nix would have a lot to offer, even hindered by things like NVIDIA’s licenses. The lack of a smooth UX is the blocker that keeps people from even considering switching from pip/conda to Nix.
It’s impossible to smooth out UX or DX for build systems that are essentially broken in ways that are architectural design errors, rather than ‘a few bugs’ or ‘bad edge cases’. But I concur everything could be better… wanna help out? No time, huh… I understand. I had a dream where NVIDIA gave the NixOS Foundation almost infinite resources to make this stuff work, and to enforce reproducibility, especially in critical AI for robots… with no strings attached; they didn’t even ask for their little green and black logo to be plastered all over the nixos.org web site… and then I woke up and realised it was all a far-fetched dream…
Do you think conda exists because the Python build system is good, or do you think it exists because the Python build systems are about as close to what I would term ‘technical insanity’ as you can get?.. But for some ‘strange’ reason it works well in a monorepo implementation… strange, that. At least it’s not npm; just when things get bad, they can always get worse… npm says ‘hold my beer’.
I’m really hoping that dream2nix can at least tame some of these build systems, because a year from now the situation of these build systems on traditional distros will only have worsened…
sorry to be the harbinger of doom…
You may find this interesting; I know I did.
Build-system meltdown and irreproducibility are endemic in the industry, for those brave enough to admit it. Luckily I’m less grumbly, because Nix does it differently; it may do it right or wrong, but it’s worth a shot… only time will tell. Even if the existence of Nix just makes other open-source ecosystems on other open-source operating systems up their game and get better at the entire software development lifecycle stuff, then it has served its purpose and we have not wasted our time.
When I put my security hat on, I can only say ‘no wonder information security is so, so bad’ at the moment… we’ve actually engineered it that way.
Keep nixing, and keep dream2nix’ing. And remember: if you can still read this, then you are part of the (nix) resistance.
I haven’t read the whole thread but please get in touch with me if you want to make this happen. Not in the main NixOS project, to avoid putting it at risk, but I think it’s easy enough to automate.
Today I got the CI up and running and it started pushing packages to https://nixpkgs-unfree.cachix.org/. Only Linux is supported for now. I’m starting to look at other use cases like CUDA.
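For anyone who wants to consume the cache, a sketch of the nix.conf fragment is below (the public key here is a placeholder; take the real one from the cache’s Cachix page, or just run `cachix use nixpkgs-unfree` to have it written for you):

```
# ~/.config/nix/nix.conf — add the unfree cache alongside the default one.
# nixpkgs-unfree.cachix.org-1:<public-key> is a placeholder.
substituters = https://cache.nixos.org https://nixpkgs-unfree.cachix.org
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= nixpkgs-unfree.cachix.org-1:<public-key>
```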