K3s + nvidia: failed to create shim task: OCI runtime create failed

I am attempting to get nvidia working inside of k3s.

I am following this wiki page.

I have added

{{ template "base" . }}

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
  privileged_without_host_devices = false
  runtime_engine = ""
  runtime_root = ""
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
  BinaryName = "/run/current-system/sw/bin/nvidia-container-runtime"

to /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl

but after rebooting I get this from nerdctl:

sudo nerdctl run -it --gpus=all ubuntu nvidia-smi
FATA[0004] failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 3, stdout: , stderr: No help topic for 'oci-hook': unknown 

The --gpus=all flag appears to be the problem, since without it nerdctl runs just fine:

sudo nerdctl run -it ubuntu /bin/bash -c "echo good"
good

I have also tried without these two lines:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
  BinaryName = "/run/current-system/sw/bin/nvidia-container-runtime"

as the note on the wiki page suggests, but without any luck.
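
For anyone trying to reproduce this, one thing that may be worth ruling out is whether the template actually made it into the live containerd config. The following is only a minimal sketch: it assumes k3s renders config.toml.tmpl into config.toml in the same directory, and has_nvidia_runtime is a hypothetical helper name, not part of k3s or containerd.

```shell
# Hypothetical helper: succeed if the given containerd config
# contains the table header for the nvidia runtime.
has_nvidia_runtime() {
  # -F matches the header as a fixed string, -q suppresses output
  grep -qF 'containerd.runtimes.nvidia]' "$1"
}

# Assumed location of the rendered config (same directory as the template).
cfg=/var/lib/rancher/k3s/agent/etc/containerd/config.toml

if [ -r "$cfg" ] && has_nvidia_runtime "$cfg"; then
  echo "nvidia runtime present in $cfg"
else
  echo "nvidia runtime missing from $cfg (or file not readable)"
fi
```

The file is usually only readable by root, so the check may need to be run with sudo.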

Does anyone else have this working?

Even getting docker to run with nvidia fails on my setup:

sudo docker run --rm --runtime=nvidia --gpus all ubuntu /bin/bash -c "echo one"
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: /nix/store/va74ykggqzmamwh2aj39fxlwzf8csw6s-nvidia-docker/bin/nvidia-container-runtime did not terminate successfully: exit status 125: unknown.
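
Both failures involve the nvidia container tooling rather than the driver, so it may help to see which of those binaries resolve on PATH at all. This is a rough sketch, not a fix: report_bins is a hypothetical helper, and the three binary names are the ones normally shipped by the NVIDIA container toolkit, which I am assuming is what this setup uses.

```shell
# Hypothetical helper: print where each named binary resolves on PATH, if at all.
report_bins() {
  for bin in "$@"; do
    if path=$(command -v "$bin"); then
      echo "$bin: $path"
    else
      echo "$bin: not found on PATH"
    fi
  done
}

# Assumed binary names from the NVIDIA container toolkit.
report_bins nvidia-container-cli nvidia-container-runtime nvidia-container-runtime-hook
```

The failing commands above ran under sudo, which may use a different PATH, so the result can differ between a normal shell and a root shell.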

Just to be certain, nvidia does work outside of the container:

nvidia-smi
Fri Jun  7 14:34:04 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.78                 Driver Version: 550.78         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   42C    P8              6W /   35W |      10MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2844      G   ...c90hy96r4-xorg-server-21.1.13/bin/X          4MiB |
+-----------------------------------------------------------------------------------------+

I can also run video games and LLMs from ollama and watch usage spike in nvtop.