I’m working on a tensor library using CUDA 13.2. I have a simple demo app that runs fine on its own. But when I run it under cuda-gdb, it always works on the first run immediately after a build and fails when I run it again in the same session:
(cuda-gdb) r
Starting program: /home/wahid/dev/breeze/build/demo
[New LWP 27227]
[New LWP 27237]
[New LWP 27238]
[New LWP 27239]
Tensor A:
Tensor(shape=[2, 3], values=[1, 2, 3, 4, 5, 6])
Tensor B (shallow copy of A):
Tensor(shape=[2, 3], values=[1, 2, 3, 4, 5, 6])
Tensor C (A + B):
Tensor(shape=[2, 3], values=[2, 4, 6, 8, 10, 12])
[LWP 27239 exited]
[LWP 27237 exited]
[LWP 27227 exited]
[LWP 27193 exited]
[LWP 27238 exited]
[New process 27193]
[Inferior 1 (process 27193) exited normally]
(cuda-gdb) r
Starting program: /home/wahid/dev/breeze/build/demo
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libthread_db.so.1".
[New Thread 0x7ffff1fff000 (LWP 27292)]
[libprotobuf ERROR /dvs/p4/build/sw/devtools/Agora/Rel/CUDA13.2/Imports/Source/ProtoBuf/protobuf-3_21_1/src/google/protobuf/descriptor_database.cc:560] Invalid file descriptor data passed to EncodedDescriptorDatabase::Add().
[libprotobuf FATAL /dvs/p4/build/sw/devtools/Agora/Rel/CUDA13.2/Imports/Source/ProtoBuf/protobuf-3_21_1/src/google/protobuf/descriptor.cc:1986] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
Thread 1 "demo" received signal SIGABRT, Aborted.
0x00007ffff789fdcc in __pthread_kill_implementation ()
from /nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libc.so.6
Here is my flake.nix:
{
  description = "Breeze flake";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
  };

  outputs = {nixpkgs, ...}: let
    system = "x86_64-linux";
    pkgs = import nixpkgs {
      inherit system;
      config = {
        allowUnfree = true;
        cudaSupport = true;
      };
    };
    cudaPackages = pkgs.cudaPackages_13_2;
  in {
    devShells.x86_64-linux.default = import ./shell.nix {inherit pkgs cudaPackages;};
  };
}
And shell.nix:
{
  pkgs,
  cudaPackages,
}:
pkgs.mkShell {
  packages = with pkgs; [
    clang-tools
    neocmakelsp
    cmake
    gcc
    gdb
    gtest
    stdenv.cc.cc.lib
    cudaPackages.cuda_nvcc
    cudaPackages.cuda_gdb
    cudaPackages.cuda_cudart
  ];

  shellHook = ''
    export CUDA_HOME=${cudaPackages.cuda_cudart}
    export CUDA_PATH=${cudaPackages.cuda_cudart}
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/run/opengl-driver/lib:/run/opengl-driver-32/lib
  '';
}
The nvidia-smi output reports Driver Version: 595.71.05 and CUDA Version: 13.2.