Using CUDA-enabled packages on non-NixOS systems?

How does one use NVIDIA/CUDA-enabled packages on non-NixOS systems? Because of the problem described in "How to use NVIDIA V100/A100 GPUs?", I’ve had to give up on NixOS and have instead resorted to an Ubuntu system with NVIDIA drivers. However, packages like tensorflowWithCuda/jaxlibWithCuda look for cudatoolkit and driver libraries in /var/run/opengl-driver/lib rather than wherever the host OS keeps them (on Ubuntu it appears to be /usr/lib/x86_64-linux-gnu/libcuda.so.1). I tried linking libcuda.so.1 and friends into /var/run/opengl-driver/lib, but that only gets me new problems: https://github.com/google/jax/issues/9644.

So how does this work? How does one use CUDA-enabled software in nixpkgs on non-NixOS machines?


I was successful with NixGL on CentOS and A100 GPUs.


Rather random questions, but:

  • Do you only symlink libraries that come from cudatoolkit, or do you also link things from the NVIDIA X11 driver (like libnvidia-ml.so or something)?
  • Have you tried running with LD_DEBUG=libs and observing which libraries jax is trying to search for at runtime?
  • Is there ptxas in PATH?
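
To make the last two checks concrete, here is a minimal Python sketch (not something jax ships). It assumes a glibc-based Linux system, where setting LD_DEBUG=libs makes the dynamic loader print its library search to stderr. The helper names `filter_loader_lines`, `trace_library_search` and `ptxas_location` are made up for illustration.

```python
import os
import shutil
import subprocess


def filter_loader_lines(loader_output, needle):
    """Return the lines of LD_DEBUG output that mention `needle`."""
    return [line for line in loader_output.splitlines() if needle in line]


def trace_library_search(command, needle):
    """Run `command` with LD_DEBUG=libs and keep the loader lines mentioning `needle`."""
    env = dict(os.environ, LD_DEBUG="libs")
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, errors="replace"
    )
    # Assumption: the glibc loader writes LD_DEBUG output to stderr by default.
    return filter_loader_lines(result.stderr, needle)


def ptxas_location():
    """Return the full path of ptxas if it is on PATH, else None."""
    return shutil.which("ptxas")
```

Calling `trace_library_search(["python", "your_script.py"], "libcublas")` would then show only the loader lines about libcublas, which is usually easier to read than the full LD_DEBUG dump.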

I haven’t found success with nixGL:

[nix-shell:~/dev/research/lottery]$ nixGLNvidia-510.47.03 python cifar10_convnet_run.py --test
tests took 1.25754 seconds
Could not load library libcudnn_ops_infer.so.8. Error: libcublas.so.11: cannot open shared object file: No such file or directory
Please make sure libcudnn_ops_infer.so.8 is in your library path!
/home/ubuntu/.nix-profile/bin/nixGLNvidia-510.47.03: line 6: 11823 Aborted                 (core dumped) "$@"

Am I doing something wrong?

These are good questions!

  • I was symlinking in everything, including libnvidia-*.so. I believe everything was present, but of course anything’s possible since it wasn’t working.
  • I wasn’t aware of LD_DEBUG=libs, that’s very handy! It looks like it’s trying and failing to find libcublas.so.11 in a few places. (full logs here) IIRC libcublas.so.11 is in cudatoolkit, so I’m not sure why that’s failing… EDIT: yes, it lives at /nix/store/9lv0wxqkbqw2438wrhllcyf3sx644i5z-cudatoolkit-11.5.0/lib/libcublas.so.11
  • Yes, ptxas is in PATH.
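
For anyone debugging the same libcublas.so.11 failure, a rough Python sketch like the one below can show which entries of LD_LIBRARY_PATH actually contain a given library. It only inspects the search path string you pass in; the real loader also consults RPATH/RUNPATH and the ld.so cache, so an empty result here is a hint rather than proof. The function name `find_in_search_path` is hypothetical.

```python
import os


def find_in_search_path(soname, search_path):
    """Return every directory in an os.pathsep-separated path that contains `soname`."""
    hits = []
    for directory in search_path.split(os.pathsep):
        # Empty entries mean "current directory" to the loader; they are skipped here.
        if directory and os.path.exists(os.path.join(directory, soname)):
            hits.append(directory)
    return hits
```

For example, `find_in_search_path("libcublas.so.11", os.environ.get("LD_LIBRARY_PATH", ""))` returns an empty list when no LD_LIBRARY_PATH entry provides libcublas.so.11.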

For future reference, my solution is to add the following in my shell.nix:

  shellHook = ''
    export LD_LIBRARY_PATH=${pkgs.cudatoolkit_11_5}/lib
  '';

and then run things with nixGL:

$ nixGLNvidia-510.47.03 python myscript.py
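
A quick way to confirm the setup before running a real workload is to try opening the libraries directly. This is only a sketch, not an official check: it assumes Python's `ctypes` on a Linux system, and the file name `check_cuda_libs.py` used below is hypothetical. The sonames are the ones mentioned earlier in this thread.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open `soname` in this process."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


# Sonames taken from the error messages and paths quoted in this thread.
for name in ("libcuda.so.1", "libcublas.so.11", "libcudnn_ops_infer.so.8"):
    print(name, "OK" if can_load(name) else "NOT FOUND")
```

Saved as `check_cuda_libs.py`, it would be run the same way as any other script, i.e. `nixGLNvidia-510.47.03 python check_cuda_libs.py` from inside the nix-shell. LD_LIBRARY_PATH is read when the process starts, so it has to be exported before Python is launched, which is what the shellHook takes care of.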

This should be fixed (no need to export LD_LIBRARY_PATH) now that nixpkgs pull request #158218 ("cudnn: init cudnn_8_3 at 8.3.0", by samuela) has landed.
