Seeking help on building `pytorch` with CUDA 11.8

Hi all!

I am trying to build pytorch with CUDA 11.8 (just released) off the current nixpkgs master, by creating an overlay like this:

final: prev: let
  cuda11 = final.cudaPackages_11_8;
in rec {
  python3 = prev.python3.override {
    packageOverrides = pyFinal: pyPrev: rec {
      pytorchWithCuda11 = pyPrev.pytorchWithCuda.override {
        cudaPackages = cuda11;
        magma = final.magmaWithCuda11;
      };
    };
  };
  python3Packages = python3.pkgs;
}
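
For context, here is a minimal sketch of how an overlay like this can be wired into a flake. It assumes the overlay is saved as `./overlay.nix` (a hypothetical file name), that the target system is `x86_64-linux`, and that unfree packages are allowed, since the CUDA packages are unfree. It is not necessarily the exact setup used in this post.

```nix
{
  # Assumption: tracking nixpkgs master, as described above.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/master";

  outputs = { self, nixpkgs }:
    let
      # Assumption: building for x86_64-linux.
      system = "x86_64-linux";
      pkgs = import nixpkgs {
        inherit system;
        # CUDA packages are unfree, so they must be allowed explicitly.
        config.allowUnfree = true;
        # ./overlay.nix is a hypothetical path holding the overlay shown above.
        overlays = [ (import ./overlay.nix) ];
      };
    in {
      # Exposes the overridden package so that `nix build` picks it up.
      packages.${system}.default = pkgs.python3Packages.pytorchWithCuda11;
    };
}
```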

When I tried to build the package in my flake, evaluation failed on the following assertion:

error: assertion '(useCudatoolkitRunfile || (libcublas != null))' failed

       at /nix/store/171cjmpyl45dz6dy818i3kf7x3nijkpg-source/pkgs/development/libraries/science/math/cudnn/generic.nix:28:1:

           27|
           28| assert useCudatoolkitRunfile || (libcublas != null);
             | ^
           29|
(use '--show-trace' to show detailed location information)

After checking the source code of the cudnn derivation, my best guess is that I need to override cudaPackages_11_8.cudnn so that useCudatoolkitRunfile = true is passed in. My updated overlay now looks like this:

final: prev: let
  cuda11 = final.cudaPackages_11_8;
in rec {
  python3 = prev.python3.override {
    packageOverrides = pyFinal: pyPrev: rec {
      pytorchWithCuda11 = pyPrev.pytorchWithCuda.override {
        cudaPackages = cuda11;
        magma = final.magmaWithCuda11;
      };
    };
  };
  python3Packages = python3.pkgs;
  # Override cudnn in cudaPackages_11_8
  cudaPackages_11_8 = prev.cudaPackages_11_8.overrideScope' (gfinal: gprev: {
    cudnn = gprev.cudnn.override {
      useCudatoolkitRunfile = true;
    };
  });
}

This produces the same error as above. I thought the overrideScope' call should at least have fixed this particular error. Can someone point out what went wrong?
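
One way to narrow this down is to check whether the 11.8 package set exposes `libcublas` at all. The sketch below assumes that `<nixpkgs>` points at the same checkout the flake uses, and the file name `debug.nix` is hypothetical. If `hasLibcublas` comes out false, callPackage presumably has nothing to pass for that argument, which would leave it null and match the failing assertion.

```nix
# debug.nix (hypothetical file name)
# Evaluate with: nix-instantiate --eval --strict debug.nix
let
  # Assumption: <nixpkgs> resolves to the nixpkgs revision under test.
  pkgs = import <nixpkgs> { };
  cuda = pkgs.cudaPackages_11_8;
in {
  # `?` only checks for the attribute name; it does not build or
  # evaluate the package itself.
  hasLibcublas = cuda ? libcublas;
  hasCudnn = cuda ? cudnn;
}
```

This only inspects attribute names, so it should evaluate even when the cudnn derivation itself fails its assertion.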

Also, I think not being able to build cudnn in cudaPackages_11_8 is probably not by design. If I can find a fix with your help, should we patch master as well?

Thanks for your time!

Update: thanks to @SergeK’s fix, the problem is now resolved by the pull request "cudaPackages_11_8: fix missing manifest" by SomeoneSerge (Pull Request #200426, NixOS/nixpkgs on GitHub).
