They are now including CUDA in the prebuilt binaries, which makes it even easier to package. The only downside is that libtorch_cuda.so is now a 709 MiB binary ;).
We now have a derivation python3Packages.pytorch-bin with CUDA support:
https://github.com/NixOS/nixpkgs/pull/96669
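For anyone who wants to try it, here is a minimal shell.nix sketch (assuming a nixpkgs checkout that already contains the PR; the CUDA bits are unfree, so allowUnfree is needed):

```nix
# shell.nix — a sketch; attribute name taken from the PR above
with import <nixpkgs> { config.allowUnfree = true; };

mkShell {
  buildInputs = [ python3Packages.pytorch-bin ];
}
```

Then `nix-shell --run 'python -c "import torch; print(torch.cuda.is_available())"'` should tell you whether CUDA is usable.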
It should help those who want to avoid the heavy build of python3Packages.pytorch. I also did a PR for libtorch-bin for the C++ API (which is also used by e.g. the Rust tch crate), so hopefully we’ll have that soon as well.
Forgot to add: the upstream builds use MKL as their BLAS library. This should generally give better performance than multi-threaded OpenBLAS, which we use by default as the system-wide BLAS and which is therefore what python3Packages.pytorch links by default. Multi-threaded OpenBLAS also does not work correctly if your application uses any kind of threading.
Unfortunately, on AMD Ryzen CPUs, MKL will use slower SSE kernels. With the MKL version that libtorch/PyTorch use, you can force the AVX2 kernels with export MKL_DEBUG_CPU_TYPE=5.
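On NixOS this can be set system-wide instead of per shell; a sketch (note the variable is undocumented by Intel and may stop working in newer MKL releases):

```nix
# configuration.nix fragment: force MKL's AVX2 code paths on AMD Zen
environment.variables.MKL_DEBUG_CPU_TYPE = "5";
```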
I’d assume that’s where most of the affected people are.
If you’re on 5.9, you’ll need to circumvent the GPL-condom to use nvidia-uvm for CUDA. I spent more time than I’d like debugging the wrong places.
Thanks for the heads-up! I was wondering why I was getting CUDA error: unknown error errors. Some strace-ing revealed that /dev/nvidia-uvm could not be opened. Manually modprobe-ing showed an error that reminded me of your comment.
It’s annoying to run into these Linux ↔ NVIDIA licensing shenanigans when you are just trying to get work done.
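On NixOS, one way to sidestep the failed on-demand load is to load the module at boot, so /dev/nvidia-uvm already exists before the first CUDA call — a sketch, assuming the NVIDIA driver itself is already enabled:

```nix
# configuration.nix fragment: load the UVM module at boot rather than
# relying on the CUDA runtime's on-demand modprobe
boot.kernelModules = [ "nvidia-uvm" ];
```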
I haven’t tested it, but there is this patch: https://github.com/Frogging-Family/nvidia-all/blob/f1d3c6cf024945e7a477ed306bd173fa6b81d72d/patches/kernel-5.9.patch
Officially, we need to wait a month for NVIDIA to fix it: NVIDIA Doesn't Expect To Have Linux 5.9 Driver Support For Another Month - Phoronix
Luckily, NixOS makes it so easy to switch kernels, so it’s not a real problem to stick to a slightly older kernel.
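For reference, pinning to the last kernel series the driver supports is a one-liner (the attribute name is an assumption and may differ per channel):

```nix
# configuration.nix fragment: stay on the 5.8 series until the
# NVIDIA driver catches up with 5.9
boot.kernelPackages = pkgs.linuxPackages_5_8;
```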
I have tried to use the new BLAS/LAPACK infrastructure to build R with MKL, and the resulting R produces incorrect results for matrix multiplication (see my comment on Add BLAS/LAPACK switching mechanism by matthewbauer · Pull Request #83888 · NixOS/nixpkgs · GitHub). I have opened an issue, R built with MKL computes incorrect results · Issue #104026 · NixOS/nixpkgs · GitHub, which has the code I used to build R.
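For context, the switching mechanism from that PR boils down to an overlay that repoints the virtual blas/lapack packages at a concrete provider; a sketch with MKL:

```nix
# Overlay selecting MKL as the system-wide BLAS/LAPACK implementation
self: super: {
  blas = super.blas.override { blasProvider = super.mkl; };
  lapack = super.lapack.override { lapackProvider = super.mkl; };
}
```

Any package built against the virtual blas/lapack (R included) then links MKL, which is how the miscomputation above surfaces.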
It is. I switched to 5.9 a while ago (maybe two weeks?) and have been using CUDA a lot with Torch.