Installing PyTorch into a virtual Python environment

I’m trying to get PyTorch to work in a virtual environment on NixOS, but it doesn’t pick up the GPU:

$ python3 -m venv .venv
$ .venv/bin/pip install numpy torch
[...]
$ .venv/bin/python -c "import torch; print(torch.cuda.is_available())"
False

It does, however, work when using the globally installed torch (via `pkgs.python3.withPackages (p: p.torch)`):

$ python -c "import torch; print(torch.cuda.is_available())"
True
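
For context, that global interpreter can come from an expression along these lines (a sketch only; note that withPackages expects a function returning a list, and CUDA support additionally requires nixpkgs to be configured with cudaSupport = true):

$ nix-shell -p 'python3.withPackages (p: [ p.torch ])'
[nix-shell]$ python -c "import torch; print(torch.cuda.is_available())"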

I tried digging into https://github.com/NixOS/nixpkgs/blob/2353abf8a5324c41b1646737625c542ab7e82a16/pkgs/development/python-modules/torch/default.nix to see what the difference is, but didn’t make it very far…

Some more info on both installations:

# from .venv:
$ .venv/bin/pip show torch
Name: torch
Version: 2.1.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /home/roland/tmp/test/.venv/lib/python3.10/site-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-nccl-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by:

# the global one from `pkgs.python3.withPackages (p: p.torch)`:
$ pip show torch
Name: torch
Version: 2.0.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /nix/store/wzz04sdcldbd61vgrqgdi0h7p31lww6v-python3-3.10.13-env/lib/python3.10/site-packages
Requires: filelock, jinja2, networkx, sympy, typing-extensions
Required-by:

Interestingly, the global one (the one that does work) doesn’t have any CUDA packages listed in its Requires.

> .venv/bin/pip install numpy torch
> .venv/bin/python -c "import torch; print(torch.cuda.is_available())"
> False

PyTorch from PyPI doesn’t know where to find the CUDA driver; you need to tell it explicitly, e.g. like so: `LD_LIBRARY_PATH=/run/opengl-driver/lib python -c "import torch; ..."`
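
Applied to the venv from the question, that would look something like this (assuming the driver lives under /run/opengl-driver/lib, as on a stock NixOS system with graphics drivers enabled), after which torch.cuda.is_available() should report True:

$ LD_LIBRARY_PATH=/run/opengl-driver/lib .venv/bin/python -c "import torch; print(torch.cuda.is_available())"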

This is because torch from PyPI is built specifically for FHS operating systems, i.e. systems that drop their libraries into the global /usr/lib and expose a global /etc/ld.so.conf file that tells them about extra library search paths, including the location of the CUDA driver (libcuda.so). If you want a better experience running prebuilt programs built with this assumption, you might also want to look into programs.nix-ld.enable.
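
Enabling that is a one-line NixOS option; here is a minimal configuration.nix sketch (the option name comes from the answer above, the surrounding module boilerplate is just illustrative):

{ config, pkgs, ... }:
{
  # Provide a loader shim so prebuilt FHS binaries can resolve their libraries
  programs.nix-ld.enable = true;
}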

> Interestingly, the global one (the one that does work) doesn’t have any CUDA packages listed in its Requires.

This is because pytorch on PyPI only recently started declaring its CUDA libraries as Python dependencies at the dist-info level; they used to just copy the CUDA native libraries into their own wheels instead. For our source build this is irrelevant, because we don’t even want to be using the Python-packaged CUDA: we want to link against our own cudaPackages, which has a bunch of patchelf hardening applied and is (or at least should be in the long run; we’re still not doing too great) split into independent outputs to reduce runtime closure sizes.
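
You can see those dist-info-level dependencies materialize as separate packages under the nvidia namespace in the venv (an illustration only; the exact subdirectories depend on the wheel version):

$ ls .venv/lib/python3.10/site-packages/nvidia/
cublas  cuda_cupti  cuda_nvrtc  cuda_runtime  cudnn  cufft  curand  cusolver  cusparse  nccl  nvtx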


Thanks for the helpful answer!