Setting up nvcc on Nix

I am trying to set up nvcc for compiling CUDA code on NixOS, but seem to be hitting known incompatibilities between specific pairs of nvcc and gcc versions when compiling .cu C++ code.

For nvcc 12.4, this is the specific version listing:

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_

I have tried various gcc 13.x releases (through the devbox tool), only to find that various x86-64 intrinsic builtins, which the headers expect to exist, do not exist:

  • gcc 13.1.0 and gcc 13.2.0: __builtin_ia32_cmpccxadd, __builtin_ia32_cmpccxadd64
  • gcc 13.3.0: same as before, and additionally __builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg
  • gcc 14.2.0 (with the flag -allow-unsupported-compiler): many more; not expected to work, as nvcc 12.4 does not support gcc versions past 13

To be specific, the compilation errors look like this:

/nix/store/4krab2h0hd4wvxxmscxrw21pl77j4i7j-gcc-13.3.0/lib/gcc/x86_64-unknown-linux-gnu/13.3.0/include/amxtileintrin.h(49): error: identifier "__builtin_ia32_sttilecfg" is undefined
    __builtin_ia32_sttilecfg (__config);

As the cudaPackages_12.4.cudatoolkit-provided nvcc is not able to find its own headers, I explicitly include (via the -I flag) the headers from cudart-dev. Here are the paths, so you can verify I did not mess up this step:

nvcc binary:
/nix/store/1vrn7wj5hscv90xx0diinq8xc7z9c0zk-cuda_nvcc-12.4.99/bin/nvcc

corresponding include directories:
/nix/store/58vlmd39635x86y2x387si8b27sjlvjv-cuda_cudart-12.4.99-dev/include

I have tried the same for CUDA 11, with gcc versions 9 through 12.

Is anyone aware of how to set up a sane CUDA development environment on Nix?
Additionally, why does nvcc seem to depend on AVX512/AMX headers? I certainly don’t need to generate such code.

Thanks!

The compatible compiler is exposed as cudaPackages.backendStdenv.cc, you can use it e.g. as

mkShell.override { stdenv = cudaPackages.backendStdenv; } {
  packages = [ cudaPackages.cuda_nvcc ];
}

(I didn’t test this snippet, treat it as pseudo-code)


I totally missed that cudaPackages provides a suitable stdenv; thanks for the info. Unfortunately, this still doesn't work: cudaPackages.backendStdenv does indeed use gcc 13.3.0, which I had already been testing with. (Unless I messed up my shell.nix file; I have provided it for reference at the bottom of this reply, as I am certainly not an experienced Nix user.) I will boot up a non-NixOS machine soon to see what configuration I was using previously, which did not have this issue.

I should note that I missed that my own test code was doing a #include &lt;immintrin.h&gt;, which at least explains why that header was being pulled in. However, it seems silly to me that including it would cause issues, considering I have never had this problem outside of this NixOS machine. The workaround in my actual application, which does compile AVX2 code inside a CUDA .cu file, is the following:

#if defined(__NVCC__)
extern void __nvcc_dummy_that_is_not_defined_anywhere(); // If you try to use this, it won't link. That's the point.
#define __builtin_ia32_cmpccxadd(...) __nvcc_dummy_that_is_not_defined_anywhere(), 0
#define __builtin_ia32_cmpccxadd64(...) __nvcc_dummy_that_is_not_defined_anywhere(), 0
#define __builtin_ia32_ldtilecfg(...) __nvcc_dummy_that_is_not_defined_anywhere(), 0
#define __builtin_ia32_sttilecfg(...) __nvcc_dummy_that_is_not_defined_anywhere(), 0
#endif
#include <immintrin.h>

Does the Nix cudaPackages set do something special with nvcc? Do the CUDA headers, provided by a different Nix package, match? What is going on?

As nvcc basically just calls into gcc, which is where the error occurs, I will probably just follow the macro definitions and investigate what specifically goes wrong with the defines provided by nvcc, or perhaps by the cuda_runtime.h header. I have already strace-d out the fork+exec into gcc (actually into sh, then into gcc, for some calls, for reasons known only to NVIDIA).

Regarding the headers: why does the nvcc package NOT provide all the headers and libs? This forces me to pass them explicitly on the command line to nvcc, which is just silly and unnecessary, and means that I need to hook into my build system to build on Nix.

Here is the shell.nix file, for reference, as promised above:

# shell.nix
let
  pkgs = import <nixpkgs> {};
in
pkgs.mkShell.override { stdenv = pkgs.cudaPackages.backendStdenv; } {
  packages = with pkgs; [
    cudaPackages.cudatoolkit
  ];
}

Not at that level. You can look up overrides.nix under cuda-modules in Nixpkgs, and setup-cuda-hook.sh.

We do apply a patch to gcc to unbreak intrinsics-related stuff on aarch64-linux in the __NVCC__ code branches.

Can’t tell much more just from memory.

cudaPackages.cuda_nvcc includes the headers that NVIDIA distributes as part of its official cuda_nvcc package (see “Index of /compute/cuda/redist/cuda_nvcc”). These headers are propagated by its dev output. The package does not try to include headers or libraries from other CUDA Toolkit packages (such as cudart, libcublas, etc.), because that would (1) inflate closure sizes by tens of gigabytes and (2) be surprising behaviour.

Normally, constructing the correct flags is handled by FindCUDAToolkit.cmake (widely tested in nixpkgs) or by meson (currently underexplored in nixpkgs). You just use find_package(CUDAToolkit ... COMPONENTS ...) and put the respective derivations into buildInputs.
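To make the FindCUDAToolkit flow above concrete, a derivation could be shaped roughly like this. This is an untested sketch: the pname, version, and project layout are hypothetical, and it assumes the project's CMakeLists.txt calls find_package(CUDAToolkit REQUIRED); the cudaPackages attributes are the ones discussed in this thread.

```nix
# default.nix: CMake project using find_package(CUDAToolkit) (untested sketch)
{ pkgs ? import <nixpkgs> { config.allowUnfree = true; } }:

pkgs.cudaPackages.backendStdenv.mkDerivation {
  pname = "my-cuda-app";  # hypothetical project name
  version = "0.1";
  src = ./.;              # assumes a CMakeLists.txt with find_package(CUDAToolkit REQUIRED)

  # nvcc runs at build time, so it goes into nativeBuildInputs
  nativeBuildInputs = [ pkgs.cmake pkgs.cudaPackages.cuda_nvcc ];
  # libraries you link against go into buildInputs
  buildInputs = [ pkgs.cudaPackages.cuda_cudart ];
}
```

FindCUDAToolkit should then resolve the include and library paths from the inputs, without passing -I flags to nvcc by hand.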

packages = [ cudaPackages.cudatoolkit ]

Nit: as mentioned in all related threads on GitHub and Discourse, it’s advisable to replace cudatoolkit with the individual packages, and to put them into the correct input sets: nativeBuildInputs for nvcc and buildInputs for the rest.
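Applied to the shell.nix quoted earlier, that advice would give something shaped like this (untested sketch; config.allowUnfree is assumed here because the CUDA packages are unfree):

```nix
# shell.nix: individual cudaPackages in their correct input sets (untested sketch)
let
  pkgs = import <nixpkgs> { config.allowUnfree = true; };
in
pkgs.mkShell.override { stdenv = pkgs.cudaPackages.backendStdenv; } {
  nativeBuildInputs = [ pkgs.cudaPackages.cuda_nvcc ];  # the compiler, used at build time
  buildInputs = [ pkgs.cudaPackages.cuda_cudart ];      # the runtime you link against
}
```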