cuda-gdb only works the first time after building the binary

I’m working on a tensor library using CUDA 13.2. I have a simple demo app that runs fine on its own. When I run it under cuda-gdb, it always works the first time immediately after a build, but it fails when I run it again in the same session:

(cuda-gdb) r
Starting program: /home/wahid/dev/breeze/build/demo
[New LWP 27227]
[New LWP 27237]
[New LWP 27238]
[New LWP 27239]
Tensor A:
Tensor(shape=[2, 3], values=[1, 2, 3, 4, 5, 6])

Tensor B (shallow copy of A):
Tensor(shape=[2, 3], values=[1, 2, 3, 4, 5, 6])

Tensor C (A + B):
Tensor(shape=[2, 3], values=[2, 4, 6, 8, 10, 12])
[LWP 27239 exited]
[LWP 27237 exited]
[LWP 27227 exited]
[LWP 27193 exited]
[LWP 27238 exited]
[New process 27193]
[Inferior 1 (process 27193) exited normally]
(cuda-gdb) r
Starting program: /home/wahid/dev/breeze/build/demo
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libthread_db.so.1".
[New Thread 0x7ffff1fff000 (LWP 27292)]
[libprotobuf ERROR /dvs/p4/build/sw/devtools/Agora/Rel/CUDA13.2/Imports/Source/ProtoBuf/protobuf-3_21_1/src/google/protobuf/descriptor_database.cc:560] Invalid file descriptor data passed to EncodedDescriptorDatabase::Add().
[libprotobuf FATAL /dvs/p4/build/sw/devtools/Agora/Rel/CUDA13.2/Imports/Source/ProtoBuf/protobuf-3_21_1/src/google/protobuf/descriptor.cc:1986] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what():  CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):

Thread 1 "demo" received signal SIGABRT, Aborted.
0x00007ffff789fdcc in __pthread_kill_implementation ()
from /nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libc.so.6

Here is my flake.nix:

{
  description = "Breeze flake";
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
  };
  outputs = {nixpkgs, ...}: let
    system = "x86_64-linux";
    pkgs = import nixpkgs {
      inherit system;
      config = {
        allowUnfree = true;
        cudaSupport = true;
      };
    };
    cudaPackages = pkgs.cudaPackages_13_2;
  in {
    devShells.x86_64-linux.default = import ./shell.nix {inherit pkgs cudaPackages;};
  };
}

And shell.nix:

{
  pkgs,
  cudaPackages,
}:
pkgs.mkShell {
  packages = with pkgs; [
    clang-tools
    neocmakelsp

    cmake
    gcc
    gdb
    gtest
    stdenv.cc.cc.lib

    cudaPackages.cuda_nvcc
    cudaPackages.cuda_gdb
    cudaPackages.cuda_cudart
  ];

  shellHook = ''
    export CUDA_HOME=${cudaPackages.cuda_cudart}
    export CUDA_PATH=${cudaPackages.cuda_cudart}
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/run/opengl-driver/lib:/run/opengl-driver-32/lib
  '';
}

nvidia-smi reports Driver Version: 595.71.05 and CUDA Version: 13.2.

I spent some time digging and found that cuda-gdb loads libcudadebugger, which is statically linked against libprotobuf. What I don’t understand is why it’s failing: libcudadebugger.so comes from driver 595 (it’s at /run/opengl-driver/lib/), which supports CUDA 13.2, and my project flake explicitly uses the cudaPackages_13_2 package set. There’s no version mismatch, and as far as I can see there aren’t multiple instances of libprotobuf being loaded at once.
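
One way to double-check the "no multiple instances" claim is to look at which files are mapped into the process, using /proc/<pid>/maps on Linux. The following is a minimal sketch, not a definitive diagnostic: the helper name mapped_objects and the default search substrings are assumptions made for illustration, and the pid has to be taken from the debugger session (for example from "info inferiors").

```python
import sys
from pathlib import Path


def mapped_objects(maps_text, needles=("protobuf", "cudadebugger")):
    """Return the sorted, unique file paths found in /proc/<pid>/maps
    content whose path contains any of the given substrings.

    `needles` is a hypothetical default chosen for this post."""
    found = set()
    for line in maps_text.splitlines():
        # Each line reads: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5].strip()
        if not path.startswith("/"):
            continue  # pseudo-entries such as [heap], [stack], [vdso]
        if any(needle in path for needle in needles):
            found.add(path)
    return sorted(found)


if __name__ == "__main__":
    # Usage: python3 mapped_objects.py <pid>
    if len(sys.argv) == 2:
        maps = Path(f"/proc/{sys.argv[1]}/maps").read_text()
        for path in mapped_objects(maps):
            print(path)
```

Running it against both the demo process and the cuda-gdb process would show whether more than one libcudadebugger or a separate libprotobuf shared object is mapped. Note that a statically linked copy of protobuf does not appear as its own mapping, so this can only rule out an additional shared libprotobuf, not a second static copy inside another library.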