GPU-enabled Docker containers in NixOS

Hi, I’d like to run a CUDA-enabled Docker container for deep learning experiments on a NixOS host, but the containers don’t seem to see the GPU that the host itself can see.

I have enabled all the options for virtualisation, Docker and docker-nvidia on the NixOS host, and running nvidia-smi on the host shows that the graphics card is present:

$ nvidia-smi                               
Fri Dec  9 06:47:14 2022       
| NVIDIA-SMI 510.60.02    Driver Version: 510.60.02    CUDA Version: 11.6     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   33C    P8    20W / 270W |    272MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|    0   N/A  N/A      1391      G   ...xorg-server-1.20.13/bin/X      122MiB |
|    0   N/A  N/A     12415      G   ...02.0/bin/.firefox-wrapped      147MiB |

The gpu-jupyter repo suggests the following command to check whether the GPU can be used from a Docker container:

docker run --gpus all nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04 nvidia-smi

In my case, the output of the command is:

docker: Error response from daemon: failed to create shim: OCI runtime create failed:
container_linux.go:380: starting container process caused: process_linux.go:545: 
container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: 
nvidia-container-cli: container error: cgroup subsystem devices not found: unknown.
ERRO[0000] error waiting for container: context canceled

After googling the error, I tried adding a number of cgroup-related options to the kernel, but none of them worked. I’d like to take a step back and properly understand what this error means and, ideally, how to fix it. My overarching goal is to get a GPU-accelerated Docker container for deep learning.

It seems I can bypass the problem by setting:

systemd.enableUnifiedCgroupHierarchy = false;

in my configuration. I consider this problem solved, but I’d still like to know if someone has a more comprehensive explanation of why this is a problem in the first place.
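
In case it helps anyone hitting the same error, here is a small shell sketch for checking which cgroup hierarchy the host is running, since that is what the option above toggles. It assumes the usual /sys/fs/cgroup mount point and GNU stat; the helper name cgroup_mode is made up for illustration. As far as I understand, the unified (v2) hierarchy has no separate "devices" controller, which would match the error message, but I have not verified this against the nvidia-container-cli source.

```shell
#!/bin/sh
# Map the filesystem type mounted at /sys/fs/cgroup to a cgroup mode.
# cgroup_mode is a hypothetical helper name, not part of any tool.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2 (unified)" ;;
        tmpfs)     echo "v1 (legacy or hybrid)" ;;
        *)         echo "unknown" ;;
    esac
}

# Assumption: cgroups are mounted at the standard location.
# Falls back to "none" if the path is missing or stat fails.
fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo none)
echo "cgroup hierarchy: $(cgroup_mode "$fstype")"
```

If this reports v2 before the change and v1 after rebuilding and rebooting, the option took effect. Recent Docker versions should also show a "Cgroup Version" line in the output of docker info, which is another way to confirm what the daemon sees.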