Attempting to get CUDA working in Elixir Livebook

Hey y’all - I’ve been running into some issues getting CUDA to actually work as a backend for Livebook. Unfortunately, there’s not a lot of documentation out there so here we are. Here’s the simplest repro:

Starting from this gist: Livebook + CUDA? · GitHub

  1. Run nix develop against the flake.nix.
  2. Run livebook start.
  3. Open the web page, then go to Open → From source (or From URL) and evaluate test_cuda.livemd.

It will fail at the last step with:

There was an error before creating cudnn handle (302): Error loading CUDA libraries. GPU will not be used. : Error loading CUDA libraries. GPU will not be used.

My guess here is that the EXLA library downloads a precompiled XLA binary as part of installation, which presumably means it wasn't built against the CUDA libraries from nixpkgs: nx/exla at main · elixir-nx/nx · GitHub

I don’t really know much about how the various CUDA libraries interact with each other, so I’m not sure what is going on. It clearly detects that CUDA is available from the call. Tracing the error, it comes from XLA’s libcudart stub: xla/xla/tsl/cuda/cudart_stub.cc at e4ee0fec3dbfb53c61a11d5afac7f0dd6c559e5c · openxla/xla · GitHub

I’ve tried adding all the various cudaPackages into buildInputs and even setting LD_LIBRARY_PATH. I don’t know if I also need to do some rpath patching?
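
For anyone following along, the kind of dev shell described above might look roughly like the sketch below. This is not the flake from the gist and it is not a confirmed fix. The nixpkgs branch, the package list, the XLA_TARGET value, and the /run/opengl-driver/lib entry (where NixOS normally exposes the driver's libcuda.so) are all assumptions.

```nix
{
  # Assumption: any recent nixpkgs revision; pin whatever you actually use.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs {
        inherit system;
        config.allowUnfree = true; # the CUDA packages are unfree
      };
      # Assumption: these two cover what the precompiled XLA binary loads at runtime.
      cudaLibs = with pkgs.cudaPackages; [ cudatoolkit cudnn ];
    in {
      devShells.${system}.default = pkgs.mkShell {
        packages = [ pkgs.elixir pkgs.livebook ] ++ cudaLibs;

        shellHook = ''
          # Tells the xla package which precompiled archive to fetch.
          # The accepted value depends on the xla version in use.
          export XLA_TARGET=cuda12

          # Make the CUDA libraries and the NVIDIA driver libraries visible
          # to the dynamic loader, keeping any existing entries.
          export LD_LIBRARY_PATH=${pkgs.lib.makeLibraryPath cudaLibs}:/run/opengl-driver/lib''${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
        '';
      };
    };
}
```

Using LD_LIBRARY_PATH rather than rpath patching keeps the downloaded XLA binary untouched, at the cost of only working inside this shell.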

Appreciate any input - thanks!

I ended up solving it a different way. I used the Livebook Docker image, which already contains all the correct libraries, and ran it on NixOS with oci-containers, passing the GPU through with the --device flag:

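  # Generates the CDI spec that exposes the NVIDIA GPUs to containers.
  hardware.nvidia-container-toolkit.enable = true;

  virtualisation.oci-containers.containers.livebook = {
    # CUDA-enabled Livebook image.
    image = "ghcr.io/livebook-dev/livebook:0.13.3-cuda12.1";
    # 8080 is the Livebook web UI, 8081 serves iframe content.
    ports = [ "8080:8080" "8081:8081" ];
    # Pass every GPU into the container via its CDI device name.
    extraOptions = [ "--device=nvidia.com/gpu=all" ];
    # Keep notebooks on the host across container restarts.
    volumes = [ "/var/lib/livebook:/data" ];
    environment = {
      # Disables token authentication.
      LIVEBOOK_TOKEN_ENABLED = "false";
    };
  };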