EDIT: Issue #412324 on GitHub already covers this exact problem.
There are already some posts and GitHub issues about this topic, but none of them resolved my problem. In particular:
- Distrobox nVidia
- distrobox: Using the GPU inside the container with `--nvidia` does not work · Issue #241316 · NixOS/nixpkgs · GitHub
- nvidia-container-toolkit: broken for podman start [container] and distrobox with both docker and podman backends · Issue #412324 · NixOS/nixpkgs · GitHub
Here’s my relevant config bits:
virtualisation = {
  podman = {
    enable = true;
    defaultNetwork.settings.dns_enabled = true;
  };
};
services.xserver.videoDrivers = [ "nvidia" ];
hardware = {
  graphics.enable = true;
  nvidia = {
    modesetting.enable = true;
    powerManagement.enable = false;
    powerManagement.finegrained = false;
    open = false;
    nvidiaSettings = true;
    package = config.boot.kernelPackages.nvidiaPackages.stable;
  };
  nvidia-container-toolkit.enable = true;
};
environment.systemPackages = with pkgs; [ distrobox nvidia-container-toolkit ];
When running podman directly, the GPU is detected:
podman run --rm -it --device=nvidia.com/gpu=all ubuntu:latest nvidia-smi -L
GPU 0: NVIDIA GeForce RTX 3060 Laptop GPU (UUID: GPU-f8b02fb5-c3b8-2793-cba1-f6277d2f3989)
podman run --rm -it --gpus all ubuntu:latest nvidia-smi -L
GPU 0: NVIDIA GeForce RTX 3060 Laptop GPU (UUID: GPU-f8b02fb5-c3b8-2793-cba1-f6277d2f3989)
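Since `--device=nvidia.com/gpu=all` resolves here, a CDI spec must be visible to plain `podman run`. Below is a minimal sketch for confirming where that spec lives, assuming it sits in one of the two standard CDI directories (`/etc/cdi` or `/var/run/cdi`). `find_cdi_specs` is a helper name made up for this post, not part of any tool.

```shell
#!/bin/sh
# Print every CDI spec file found in the given directories.
# With no arguments, fall back to the two standard CDI locations.
find_cdi_specs() {
  if [ "$#" -eq 0 ]; then
    set -- /etc/cdi /var/run/cdi
  fi
  for cdi_dir in "$@"; do
    [ -d "$cdi_dir" ] || continue
    for cdi_file in "$cdi_dir"/*.json "$cdi_dir"/*.yaml "$cdi_dir"/*.yml; do
      # Unmatched globs stay literal, so only print paths that exist.
      if [ -e "$cdi_file" ]; then
        printf '%s\n' "$cdi_file"
      fi
    done
  done
  return 0
}

find_cdi_specs

# If nvidia-ctk is on PATH, it can also list the device names it resolves.
if command -v nvidia-ctk >/dev/null 2>&1; then
  nvidia-ctk cdi list || true
fi
```

If this prints a spec file and `nvidia-ctk cdi list` shows `nvidia.com/gpu=all`, the CDI side is probably fine and the problem is more likely in how the container gets started.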
However, creating a distrobox container and trying to enter it with:
distrobox create -i ubuntu:latest --name test --additional-flags "--device=nvidia.com/gpu=all"
distrobox enter test
results in
Error: unable to start container "3604390c85aa38a6659c3ebb92d83f6648d771d9b400af83dd6569cedbcc8aae": crun: open `/home/<user>/.local/share/containers/storage/overlay/da94f067de184714c7cd8803f2fbd2fbc5f45b6cfa96a101d4dbaae7f45f1675/merged`: Permission denied: OCI permission denied
sudo ls -l /home/<user>/.local/share/containers/storage/overlay/da94f067de184714c7cd8803f2fbd2fbc5f45b6cfa96a101d4dbaae7f45f1675
total 20
drwxr-xr-x 3 100000 100000 4096 Jul 30 10:49 diff
-rw-r--r-- 1 user user 26 Jul 30 10:49 link
-rw-r--r-- 1 user user 57 Jul 30 10:49 lower
drwx------ 2 user user 4096 Jul 30 10:56 merged
drwx------ 3 user user 4096 Jul 30 10:56 work
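The working commands above use `podman run`, while distrobox creates the container first and starts it later, which is the `podman start` path named in the title of issue #412324. Below is a rough sketch for reproducing that path without distrobox. I have not confirmed that it triggers the same error, and both `repro_create_start` and the container name `gpu-start-test` are made up for illustration.

```shell
#!/bin/sh
# Create a GPU container and start it in a separate step,
# instead of doing both at once with `podman run`.
repro_create_start() {
  repro_name=$1
  repro_rc=0
  podman create --name "$repro_name" --device=nvidia.com/gpu=all \
    ubuntu:latest nvidia-smi -L &&
    podman start --attach "$repro_name" || repro_rc=$?
  # Clean up the test container regardless of the outcome.
  podman rm --force "$repro_name" >/dev/null 2>&1 || true
  return "$repro_rc"
}

if command -v podman >/dev/null 2>&1; then
  repro_create_start gpu-start-test || echo "create/start path failed"
fi
```

If this fails while `podman run` succeeds, the problem would be independent of distrobox itself.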
Trying the distrobox workaround results in
distrobox create --name example-nvidia-toolkit --additional-flags "--runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all" --image nvidia/cuda:12.9.1-base-ubuntu22.04
Creating 'example-nvidia-toolkit' using image nvidia/cuda:12.9.1-base-ubuntu22.04
Error: default OCI runtime "nvidia" not found: invalid argument
When switching to docker, the same behavior persists. Setting up the docker daemon config with:
sudo nvidia-ctk runtime configure --runtime=docker --enable-cdi
results in the following file:
{
  "features": {
    "cdi": true
  },
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  }
}
However, I’m not sure the docker daemon will use this file at all, since the daemon is configured through the docker config options. I haven’t found the resulting config file in the store, though.
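One way to find the file the daemon actually reads is to pull the `--config-file=` argument out of the service's command line. This sketch assumes the daemon runs as `docker.service` and is started with `--config-file=<path>`, which I believe is how the NixOS module does it but have not verified. `extract_config_file` is a made-up helper and only handles the `--config-file=<path>` form.

```shell
#!/bin/sh
# Print the value of every --config-file=<path> argument in a command line.
extract_config_file() {
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^--config-file=//p'
}

if command -v systemctl >/dev/null 2>&1; then
  # ExecStart holds the full dockerd command line of the unit.
  docker_cmdline=$(systemctl show docker.service --property=ExecStart --value 2>/dev/null || true)
  extract_config_file "$docker_cmdline"
fi

# The running daemon can also report which runtimes it knows about.
if command -v docker >/dev/null 2>&1; then
  docker info --format '{{json .Runtimes}}' || true
fi
```

If the printed path points into the store rather than at the file written by `nvidia-ctk`, the generated config is not the one in use, and the `nvidia` runtime will not show up in the `docker info` output.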