Pytorch cuda on WSL

tom-huntington · March 19, 2022, 10:44pm

Without nix, I can get pytorch to use cuda if I install using pip. But if I use nix and the pytorch-bin package:

{ pkgs ? import <nixpkgs>
    {
        config =
        {
            allowUnfree = true;
            cudaSupport = true;
        };
    }
}:

pkgs.mkShell {
  buildInputs = 
  [
    pkgs.python38
    #pkgs.python38Packages.pytorch
    pkgs.python38Packages.pytorch-bin
  ];
}

I get the following error:

torch.cuda.current_device()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nix/store/az7dskqa2whs3dwihr1g9n9zmafs4dd8-python3.8-pytorch-1.8.1/lib/python3.8/site-packages/torch/cuda/__init__.py", line 388, in current_device
    _lazy_init()
  File "/nix/store/az7dskqa2whs3dwihr1g9n9zmafs4dd8-python3.8-pytorch-1.8.1/lib/python3.8/site-packages/torch/cuda/__init__.py", line 170, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

Since this is a runtime error, I suspected that an environment variable was being lost in nix-shell. But comparing the outputs of printenv inside and outside the shell, nix-shell seems only to add to the existing environment variables, not remove any.
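The printenv comparison can be made less error-prone with a small script. This is only a sketch: the function names `parse_env` and `diff_env` and the file names in the usage comment are invented for illustration, and it assumes each variable fits on one line.

```python
# Hypothetical helper for comparing two `printenv` dumps, e.g.
#   printenv > outside.txt
#   nix-shell --run printenv > inside.txt


def parse_env(text):
    """Parse printenv output (one KEY=VALUE per line) into a dict."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '=' at all
            env[key] = value
    return env


def diff_env(before, after):
    """Return variables that were added, removed, or changed."""
    old, new = parse_env(before), parse_env(after)
    return {
        "added": {k: new[k] for k in new.keys() - old.keys()},
        "removed": {k: old[k] for k in old.keys() - new.keys()},
        "changed": {
            k: (old[k], new[k])
            for k in old.keys() & new.keys()
            if old[k] != new[k]
        },
    }
```

An empty "removed" entry would back up the observation that nix-shell does not drop variables. A variable that is absent both inside and outside the shell (here, LD_LIBRARY_PATH) does not show up in any of the three groups.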

I also tried the plain pytorch nix package, but the build becomes unresponsive while building magma-2.5.4:

[54/3311] Building Fortran object CMakeFiles/lapacktest.dir/testing/lin/chet21.f.o
…/testing/lin/chet21.f:311:33:

309 | DO 20 J = 1, N - 1
| 2
310 | CALL CHER2( CUPLO, N, -CMPLX( E( J ) ), U( 1, J ), 1,
311 | $ U( 1, J-1 ), 1, WORK, N )
| 1
Warning: Array reference at (1) out of bounds (0 < 1) in loop beginning at (2)

tom-huntington · March 19, 2022, 11:40pm

I found the solution in hasktorch/hasktorch issue #535, "CUDA support running on WSL", on GitHub.

You can add the following to the mkShell call in shell.nix:

  shellHook = ''
    export LD_LIBRARY_PATH=/usr/lib/wsl/lib
  '';
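To confirm the hook took effect without importing torch, one option is to ask the dynamic loader directly from Python inside the nix-shell. A minimal sketch, assuming the driver library is exposed under the name libcuda.so.1 (the helper names `can_load` and `report` are made up for this example):

```python
import ctypes
import os


def can_load(library_name):
    """Return True if the dynamic loader can open the named library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


def report(library_name="libcuda.so.1"):
    """Summarize whether the library loads and which search path was set."""
    status = "loadable" if can_load(library_name) else "NOT loadable"
    search_path = os.environ.get("LD_LIBRARY_PATH", "<unset>")
    return f"{library_name}: {status} (LD_LIBRARY_PATH={search_path})"
```

Calling `print(report())` before and after adding the shellHook should show the status change. The loader reads LD_LIBRARY_PATH when the process starts, so the variable has to be set before Python is launched, which is what the shellHook does.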

samuela · March 20, 2022, 6:04am

Huh, what OS are you running in WSL?

tom-huntington · March 25, 2022, 2:48am

The default Ubuntu distro

samuela · March 25, 2022, 6:37am

Huh, I’m not familiar with WSL-specific stuff but I can say that if you’re able to reproduce this issue on a native install then this is a bug. As long as nix packages rely on CUDA, they should do so by adding cudatoolkit to their RUNPATH via patchelf. Users should not have to manually set LD_LIBRARY_PATH.

SergeK · March 25, 2022, 12:53pm

I think it’s a bit more complicated than that. The actual libcuda.so (not the stub) resides in the hardware-specific nvidia_x11 package and is accessed through the impure path /run/opengl-driver/lib. That means that on other Linux distributions we’d also have to either use LD_LIBRARY_PATH or manually create /run/opengl-driver/lib.

And I have no clue what the state of GL or hardware access on WSL is.

@tom-huntington do I understand you right that you’ve actually got pytorch running with CUDA in WSL? If so, this sounds rather promising.

samuela · March 25, 2022, 6:16pm

nixGL ought to do the job then I think? If that doesn’t work to populate /run/opengl-driver/lib/, then I believe that ought to be a bug.

tom-huntington · March 26, 2022, 7:45pm

Yes, I got it running. libcuda.so is in /usr/lib/wsl/lib and the problem is that nix doesn’t find it (pip must do special handling for wsl). No /run/opengl-driver/ directory exists for me.
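For anyone checking their own machine, the two locations mentioned in this thread (/usr/lib/wsl/lib and /run/opengl-driver/lib) can be scanned together with whatever is already on LD_LIBRARY_PATH. This is a rough sketch, not how nix or PyTorch resolve the library. The name `find_libcuda` and the `CANDIDATE_DIRS` list are assumptions for illustration.

```python
import os

# Directories named in this thread; other setups may use different ones.
CANDIDATE_DIRS = ["/usr/lib/wsl/lib", "/run/opengl-driver/lib"]


def find_libcuda(extra_dirs=CANDIDATE_DIRS, env=None):
    """Map each searched directory to the libcuda.so* files found in it."""
    env = os.environ if env is None else env
    # Directories already on LD_LIBRARY_PATH come first, then the candidates.
    dirs = [d for d in env.get("LD_LIBRARY_PATH", "").split(":") if d]
    dirs += [d for d in extra_dirs if d not in dirs]

    found = {}
    for directory in dirs:
        if not os.path.isdir(directory):
            continue  # e.g. /run/opengl-driver/lib does not exist on WSL Ubuntu
        matches = [
            os.path.join(directory, name)
            for name in sorted(os.listdir(directory))
            if name.startswith("libcuda.so")
        ]
        if matches:
            found[directory] = matches
    return found
```

If the result only lists /usr/lib/wsl/lib and that directory is not on LD_LIBRARY_PATH, that matches the situation described here: the library exists, but nothing tells the loader where to look.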

I’m not sure how to add nixGL to the shell.nix file in the original post. I thought a shell.nix shouldn’t modify files outside /nix/store/, so maybe nixGL needs to be added through nix-env, which I thought was only for NixOS.

Atry · February 11, 2023, 9:27am

This approach only works in nix-shell or nix develop. To enable CUDA in nix build, the following PRs are required:

Support CUDA in WSL2 by Atry · Pull Request #44 · nix-community/nixos-vscode-server · GitHub
Support CUDA in sandboxes by Atry · Pull Request #221 · nix-community/NixOS-WSL · GitHub