CUDA not available in pytorch

Using this flake:

{
  description = "Pytorch with CUDA";

  inputs = {
    nixpkgs.url     = "github:nixos/nixpkgs/nixos-24.11";
    flake-utils.url = "github:numtide/flake-utils";
  };

  nixConfig = {
    extra-substituters = [ "https://nix-community.cachix.org" ];
    extra-trusted-public-keys = [ "nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs=" ];
  };

  outputs = { self, nixpkgs, flake-utils, ... }:

    flake-utils.lib.eachDefaultSystem
      (system:
        let
          pkgs = import nixpkgs {
            inherit system;
            config = {
              cudaSupport = true;
              allowUnfree = true;
            };
          };

          # ----- A Python interpreter with the packages that interest us -------
          python-with-all-my-packages = (python:
            (python.withPackages (ps: [
              ps.torch-bin
              ps.pytest
            ])));

        in
          {
            devShell = pkgs.mkShell {
              name = "pytorch-with-cuda";
              buildInputs = [
                (python-with-all-my-packages pkgs.python312)
              ];
            };
          }
      );
}

and this pytest:

def test_torch_cuda_available():
    import torch
    print(f"torch.version.cuda: {torch.version.cuda}")
    assert torch.cuda.is_available()

I get this result:

cuda_test.py:4: AssertionError
---------------------------------------------------- Captured stdout call -----------------------------------------------------
torch.version.cuda: 12.4
====================================================== warnings summary =======================================================
cuda_test.py::test_torch_cuda_available
  /nix/store/fiykpvx4ly0rwyhmf89qck9xf4fqzrl2-python3-3.12.8-env/lib/python3.12/site-packages/torch/cuda/__init__.py:129: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
    return torch._C._cuda_getDeviceCount() > 0

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=================================================== short test summary info ===================================================
FAILED cuda_test.py::test_torch_cuda_available - AssertionError: assert False
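The "CUDA unknown error" in the warning above is often a kernel-side problem rather than a torch one: missing `/dev/nvidia*` device nodes or an unloaded `nvidia_uvm` module. A minimal probe for those two causes (the paths are the usual Linux defaults — an assumption on my part, adjust for your system):

```python
# Probe the kernel-side prerequisites that CUDA initialization needs.
# Assumes a standard Linux layout of /dev and /proc.
import os

EXPECTED_NODES = ["/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/nvidia0"]

def missing_device_nodes(nodes=EXPECTED_NODES):
    """Return the expected NVIDIA device nodes that are absent."""
    return [n for n in nodes if not os.path.exists(n)]

def uvm_module_loaded():
    """Check /proc/modules for the nvidia_uvm kernel module (Linux only)."""
    try:
        with open("/proc/modules") as f:
            return any(line.startswith("nvidia_uvm ") for line in f)
    except FileNotFoundError:
        return False

if __name__ == "__main__":
    print("missing device nodes:", missing_device_nodes())
    print("nvidia_uvm loaded  :", uvm_module_loaded())
```

If `nvidia_uvm` is not loaded, `modprobe nvidia_uvm` (or a reboot after a driver update) is worth trying before digging into the Nix side.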

nvidia-smi says:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.77                 Driver Version: 565.77         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060 ...    Off |   00000000:01:00.0  On |                  N/A |
| N/A   59C    P8             13W /  115W |    4187MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

How can I get to the bottom of the missing CUDA functionality in PyTorch?

nvidia-smi reports CUDA 12.7 (the maximum version the driver supports), while torch was built against 12.4. IIUC, a driver newer than the runtime shouldn’t be a problem.
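As a first sanity check, this sketch compares what the kernel driver exposes with what the torch in the dev shell was built against (assumes a Linux host and that `python` is the flake's interpreter, i.e. it should be run inside `nix develop`):

```shell
# Which driver does the kernel actually expose? (standard Linux path)
cat /proc/driver/nvidia/version 2>/dev/null || echo "no NVIDIA kernel driver exposed"

# Which CUDA runtime is this environment's torch built against?
python - <<'EOF' || echo "torch not importable in this environment"
import torch
print("torch built against CUDA:", torch.version.cuda)
print("visible CUDA devices   :", torch.cuda.device_count())
EOF
```

If the first command prints nothing, the userspace driver and the kernel module are out of sync, which also produces the "unknown error" at initialization.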


I don’t have an answer to your question. Just wanted to relay that I am encountering the same issue with torch-bin; it also seems to end up without CUDA support.
This is the thread I am referring to: Overlays: remove a package in a list (using torch-bin instead of torch) - #14 by gaidenko