Don’t get me wrong, I’m not a huge fan of NVIDIA either, but I don’t think telling people to fuck off is a practical solution. Whether we like it or not, a lot of people are forced to work with CUDA.
Long-term, these people don’t do much to push open-source upstreams toward OpenCL adoption.
Short-term, these are the people who throw quite a lot of compute at problems anyway; they should be able to cooperate on a CUDA-specific buildfarm.
ML people addressing either of those points would be preferable, from the point of view of non-ML Nix users, to Hydra building CUDA stuff (one because of general free-software availability, the other because it would give us a large, cooperatively operated example of an independent Hydra and binary cache, presumably with things to learn from it)…
The only solution is for the NixOS Foundation to buy NVIDIA outright, open source all their stuff, get rid of the registration and paywalls, and then just put it on Hydra… job done.
OpenCL vendor implementations are generally not open source. OpenCL is slower than CUDA. Even OpenCL’s own author, Apple, has now abandoned it in favor of Metal. OpenCL/OpenGL/<other OSS toolchain> just don’t cut it these days, especially in ML.
I feel that there’s this misconception that everyone in ML must be rich and have money/compute to blow left and right. As an ML researcher in academia, I can assure you this is far, far from the truth. I cannot afford to blow 80 CPU-hours recompiling tensorflowWithCuda every time some tiny package in its dependency tree changes. It’s just not sustainable.
I’d recommend getting tensorflowWithCuda from a pinned nixpkgs version and caching it in a local cache or Cachix. Then your group can choose if a particular update is worth paying the cost.
Yes, I do in fact pin the version in my shell.nix files but the ecosystem (jax in particular) moves so fast that I find myself frequently needing to update.
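For anyone setting this kind of pin up, here is a minimal sketch (the commit hash and sha256 are placeholders you’d fill in yourself, and I’m assuming tensorflowWithCuda is exposed under python3Packages in the pinned revision):

```nix
# shell.nix — pin nixpkgs to one exact commit so everyone rebuilds
# (or substitutes) exactly the same tensorflowWithCuda.
# <commit-hash> and the sha256 below are placeholders.
let
  pkgs = import (builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/<commit-hash>.tar.gz";
    sha256 = "0000000000000000000000000000000000000000000000000000";
  }) {
    config.allowUnfree = true; # CUDA packages are unfree
  };
in
pkgs.mkShell {
  buildInputs = [ pkgs.python3Packages.tensorflowWithCuda ];
}
```

Building the environment once and pushing the resulting store paths to a team cache (e.g. with `cachix push`) then lets everyone else in the group download instead of rebuild.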
Maybe some of the people in the CUDA thread are still building tensorflow often enough and would be willing to share?
I agree not everyone has enough compute to rebuild it every time, but I hope that with some sharing it could be solved well enough among those who need it.
Certainly, but one of the major benefits of NixOS for me is that I can release my code with very precise instructions on how to reproduce results. “Run exactly these dependencies from this commit of nixpkgs” etc etc.
The pip install route doesn’t offer me those same benefits unfortunately.
Do they actually come with CUDA, or just with libraries that are prebuilt and link to the CUDA libs?
Anyway, it has already been said that it is not a legal problem to do this with CUDA and other non-free but redistributable libraries and programs. Now it remains a problem of infrastructure and ideology, as far as I understood the GH issue when I skimmed it.
Because the ML/DL community is a pretty big market where Nix would have a lot to offer, even hindered by things like NVIDIA’s licenses. The lack of a smooth UX is the blocker that keeps people from even considering switching from pip/conda to Nix.
It’s impossible to smooth out UX or DX for build systems that are essentially broken in ways that are architectural design errors, rather than ‘a few bugs’ or ‘bad edge cases’. But I concur everything could be better… wanna help out? No time, huh… I understand. I had a dream where NVIDIA gave the NixOS Foundation almost infinite resources to make this stuff work, and to enforce reproducibility, especially in critical AI for robots… with no strings attached; they didn’t even ask for their little green and black logo to be plastered all over the nixos.org web site… and then I woke up and realised it was all a far-fetched dream…
Do you think conda exists because the Python build system is good, or do you think it exists because the Python build systems are about as close to what I would term ‘technical insanity’ as you can get?.. But for some ‘strange’ reason it works well in a monorepo implementation… strange, that. At least it’s not npm; just when things get bad, they can always get worse… npm says ‘hold my beer’.
I’m really hoping that dream2nix can at least tame some of these build systems, because a year from now the situation of these build systems on traditional distros will only have worsened…
sorry to be the harbinger of doom…
You may find this interesting; I know I did.
Build-system meltdown and irreproducibility are endemic in the industry, for those brave enough to admit it. Luckily I’m less grumbly, because Nix does it differently; it may do it right or wrong, but it’s worth a shot… only time will tell. Even if the existence of Nix just makes other open-source ecosystems on other open-source operating systems up their game and get better at the entire software development lifecycle stuff, then it has served its purpose and we have not wasted our time.
When I put my security hat on, I can only say ‘no wonder information security is so, so bad’ at the moment… we’ve actually engineered it that way.
Keep nixing, and keep dream2nix’ing. And remember: if you can still read this, then you are part of the (nix) resistance.
I haven’t read the whole thread but please get in touch with me if you want to make this happen. Not in the main NixOS project, to avoid putting it at risk, but I think it’s easy enough to automate.
Today I got the CI up and running and it started pushing packages to https://nixpkgs-unfree.cachix.org/. Only Linux is supported for now. I’m starting to look at other use cases like CUDA.
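For anyone who wants to consume the cache, a sketch of the nix.conf fragment is below (the public key here is a placeholder; take the real one from the cache’s Cachix page, or just run `cachix use nixpkgs-unfree` to have it written for you):

```
# ~/.config/nix/nix.conf — add the unfree cache alongside the default one.
# nixpkgs-unfree.cachix.org-1:<public-key> is a placeholder.
substituters = https://cache.nixos.org https://nixpkgs-unfree.cachix.org
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= nixpkgs-unfree.cachix.org-1:<public-key>
```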