I have a few machines running NixOS, deployed with deploy-rs, with a Kubernetes cluster running on top of them. Recently, I added a new machine with different hardware from the previous ones. A Kubernetes DaemonSet created a pod on the new machine that tries to set up some disks, but it failed while loading the nbd kernel module:
W0227 10:13:14.925727 13522 rbd_attach.go:238] nbd modprobe failed (an error (exit status 1) occurred while running modprobe args: [nbd]): "modprobe: ERROR: could not insert 'nbd': Unknown symbol in module, or unknown parameter (see dmesg)\n"
The module path in the container is mounted from /run/current-system/kernel-modules/lib/modules/ on the host, and this works fine on the previous machines. Running a shell inside the container, I found that the mounted path contains symbolic links into the Nix store:
# ls -al /lib/modules/6.6.79/kernel
total 56
dr-xr-xr-x 3 root root 4096 Jan 1 1970 .
dr-xr-xr-x 3 root root 4096 Jan 1 1970 ..
lrwxrwxrwx 1 root root 87 Jan 1 1970 arch -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/arch
lrwxrwxrwx 1 root root 88 Jan 1 1970 block -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/block
lrwxrwxrwx 1 root root 89 Jan 1 1970 crypto -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/crypto
dr-xr-xr-x 3 root root 4096 Jan 1 1970 drivers
lrwxrwxrwx 1 root root 85 Jan 1 1970 fs -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/fs
lrwxrwxrwx 1 root root 89 Jan 1 1970 kernel -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/kernel
lrwxrwxrwx 1 root root 86 Jan 1 1970 lib -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/lib
lrwxrwxrwx 1 root root 85 Jan 1 1970 mm -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/mm
lrwxrwxrwx 1 root root 86 Jan 1 1970 net -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/net
lrwxrwxrwx 1 root root 91 Jan 1 1970 security -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/security
lrwxrwxrwx 1 root root 88 Jan 1 1970 sound -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/sound
lrwxrwxrwx 1 root root 87 Jan 1 1970 virt -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/virt
modprobe fails because the Nix store paths are not mounted inside the container, so the links cannot be resolved there. I checked from the host, and the kernel directory there indeed contains the same symbolic links into the Nix store:
ls -al /run/current-system/kernel-modules/lib/modules/6.6.79/kernel
total 56
dr-xr-xr-x 3 root root 4096 Dec 31 1969 .
dr-xr-xr-x 3 root root 4096 Dec 31 1969 ..
lrwxrwxrwx 1 root root 87 Dec 31 1969 arch -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/arch
lrwxrwxrwx 1 root root 88 Dec 31 1969 block -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/block
lrwxrwxrwx 1 root root 89 Dec 31 1969 crypto -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/crypto
dr-xr-xr-x 3 root root 4096 Dec 31 1969 drivers
lrwxrwxrwx 1 root root 85 Dec 31 1969 fs -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/fs
lrwxrwxrwx 1 root root 89 Dec 31 1969 kernel -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/kernel
lrwxrwxrwx 1 root root 86 Dec 31 1969 lib -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/lib
lrwxrwxrwx 1 root root 85 Dec 31 1969 mm -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/mm
lrwxrwxrwx 1 root root 86 Dec 31 1969 net -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/net
lrwxrwxrwx 1 root root 91 Dec 31 1969 security -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/security
lrwxrwxrwx 1 root root 88 Dec 31 1969 sound -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/sound
lrwxrwxrwx 1 root root 87 Dec 31 1969 virt -> /nix/store/fdgd264g9902y3v38dvw24vgxw589bl6-linux-6.6.79/lib/modules/6.6.79/kernel/virt
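For anyone who wants to reproduce the check without reading through ls output, here is a minimal sketch that lists every symbolic link under a directory whose target resolves outside that directory. Such links break when only the directory itself is mounted into a container. The function name escaping_symlinks is made up for illustration, and the default path is simply the host path mentioned above.

```python
import os


def escaping_symlinks(root):
    """Return sorted (link, resolved_target) pairs for symlinks under
    `root` whose target lies outside `root`."""
    root = os.path.realpath(root)
    found = []
    # followlinks=False: symlinked directories are reported but not entered.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            # A target outside `root` does not share `root` as common prefix.
            if os.path.commonpath([root, target]) != root:
                found.append((path, target))
    return sorted(found)


# Assumed default: the host path from this post; yields nothing if absent.
for link, target in escaping_symlinks(
    "/run/current-system/kernel-modules/lib/modules"
):
    print(f"{link} -> {target}")
```

On the new machine I would expect this to print the arch, block, crypto, ... links shown above, and on the old machines to print nothing for the kernel directory.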
But when I checked the old machines (kernel version 6.6.78 instead of 6.6.79), the same directory contained real directories rather than symbolic links:
ls -al /run/current-system/kernel-modules/lib/modules/6.6.78/kernel
total 56
dr-xr-xr-x 14 root root 4096 Dec 31 1969 .
dr-xr-xr-x 3 root root 4096 Dec 31 1969 ..
dr-xr-xr-x 3 root root 4096 Dec 31 1969 arch
dr-xr-xr-x 2 root root 4096 Dec 31 1969 block
dr-xr-xr-x 4 root root 4096 Dec 31 1969 crypto
dr-xr-xr-x 113 root root 4096 Dec 31 1969 drivers
dr-xr-xr-x 64 root root 4096 Dec 31 1969 fs
dr-xr-xr-x 4 root root 4096 Dec 31 1969 kernel
dr-xr-xr-x 8 root root 4096 Dec 31 1969 lib
dr-xr-xr-x 2 root root 4096 Dec 31 1969 mm
dr-xr-xr-x 55 root root 4096 Dec 31 1969 net
dr-xr-xr-x 3 root root 4096 Dec 31 1969 security
dr-xr-xr-x 16 root root 4096 Dec 31 1969 sound
dr-xr-xr-x 3 root root 4096 Dec 31 1969 virt
What is causing the difference? The NixOS config of the old machines and the new one is mostly the same, although the new one has an NVIDIA GPU, so I have CUDA enabled on it. Was there a recent change in NixOS that turned the kernel module directories into symbolic links? Or was my earlier deployment the abnormal one, and these were supposed to be symbolic links all along?
version: nixos-24.11
ref: 11415c7ae8539d6292f2928317ee7a8410b28bb9