/dev/zfs has the wrong permissions after rebooting

System Info


output of inxi -Faz

System:
  Kernel: 6.8.12 arch: x86_64 bits: 64 compiler: gcc v: 13.2.0 clocksource: tsc
    avail: hpet,acpi_pm
    parameters: initrd=\EFI\nixos\qb1iz4b7lgdqyzji71h6n2kil7h9zp3b-initrd-linux-6.8.12-initrd.efi
    init=/nix/store/6zf5i8lqpkr68v7snb7x6dldbrckk6pg-nixos-system-revivajxo-24.05.2580.194846768975/init
    loglevel=4
  Console: pty pts/0 DM: SDDM Distro: NixOS 24.05 (Uakari)
Machine:
  Type: Desktop System: Hewlett-Packard product: HP Z440 Workstation v: N/A
    serial: <superuser required> Chassis: type: 6 serial: <superuser required>
  Mobo: Hewlett-Packard model: 212B v: 1.01 serial: <superuser required> part-nu: X2D67UT#ABA
    uuid: <superuser required> UEFI: Hewlett-Packard v: M60 v02.59 date: 03/31/2022
CPU:
  Info: model: Intel Xeon E5-1620 v4 bits: 64 type: MT MCP arch: Broadwell level: v3 note: check
    built: 2015-18 process: Intel 14nm family: 6 model-id: 0x4F (79) stepping: 1
    microcode: 0xB000040
  Topology: cpus: 1x cores: 4 tpc: 2 threads: 8 smt: enabled cache: L1: 256 KiB
    desc: d-4x32 KiB; i-4x32 KiB L2: 1024 KiB desc: 4x256 KiB L3: 10 MiB desc: 1x10 MiB
  Speed (MHz): avg: 1553 high: 3800 min/max: 1200/3800 scaling: driver: intel_cpufreq
    governor: schedutil cores: 1: 1430 2: 1200 3: 1200 4: 1200 5: 1200 6: 1200 7: 3800 8: 1197
    bogomips: 55873
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: KVM: VMX disabled
  Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
  Type: mds mitigation: Clear CPU buffers; SMT vulnerable
  Type: meltdown mitigation: PTI
  Type: mmio_stale_data mitigation: Clear CPU buffers; SMT vulnerable
  Type: reg_file_data_sampling status: Not affected
  Type: retbleed status: Not affected
  Type: spec_rstack_overflow status: Not affected
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: Retpolines; IBPB: conditional; IBRS_FW; STIBP: conditional; RSB
    filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort mitigation: Clear CPU buffers; SMT vulnerable
Graphics:
  Message: Required tool lspci not installed. Check --recommends
  Display: server: X.org v: 1.21.1.13 with: Xwayland v: 24.1.0 driver: X: loaded: modesetting
    unloaded: fbdev,vesa dri: i965 gpu: N/A tty: 115x21
  API: OpenGL Message: GL data unavailable in console, glxinfo missing.
Audio:
  Message: No device data found.
  API: ALSA v: k6.8.12 status: kernel-api tools: N/A
  Server-1: PipeWire v: 1.0.7 status: off with: 1: pipewire-pulse status: off 2: wireplumber
    status: off 3: pipewire-alsa type: plugin tools: pw-cat,pw-cli,wpctl
  Server-2: PulseAudio v: 17.0 status: off tools: pacat,pactl
Network:
  Message: Required tool lspci not installed. Check --recommends
  IF-ID-1: br-8f55a10f1801 state: down mac: <filter>
  IF-ID-2: docker0 state: down mac: <filter>
  IF-ID-3: eno1 state: up speed: 1000 Mbps duplex: full mac: <filter>
  IF-ID-4: virbr0 state: down mac: <filter>
  IF-ID-5: wg0 state: unknown speed: N/A duplex: N/A mac: N/A
  Info: services: NetworkManager, nfsd, smbd, sshd, systemd-timesyncd, xinetd
RAID:
  Device-1: revivajxo_srv type: zfs status: ONLINE level: mirror-0 raw: size: 5.45 TiB
    free: 1.43 TiB allocated: 4.02 TiB zfs-fs: size: 5.33 TiB free: 1.31 TiB
  Components: Online:
  1: sda maj-min: 8:0 size: 5.46 TiB
  2: sdc maj-min: 8:32 size: 5.46 TiB
Drives:
  Local Storage: total: raw: 11.37 TiB usable: 5.78 TiB used: 3.9 TiB (67.5%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/sda maj-min: 8:0 vendor: Seagate model: ST6000VN001-2BB186 size: 5.46 TiB
    block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s tech: HDD rpm: 5425 serial: <filter>
    fw-rev: SC60 scheme: GPT
  ID-2: /dev/sdb maj-min: 8:16 vendor: Samsung model: SSD 860 EVO 500GB size: 465.76 GiB
    block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s tech: SSD serial: <filter>
    fw-rev: 3B6Q scheme: GPT
  ID-3: /dev/sdc maj-min: 8:32 vendor: Seagate model: ST6000VN001-2BB186 size: 5.46 TiB
    block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s tech: HDD rpm: 5425 serial: <filter>
    fw-rev: SC60 scheme: GPT
Partition:
  ID-1: / raw-size: 465.26 GiB size: 456.89 GiB (98.20%) used: 62.08 GiB (13.6%) fs: ext4
    dev: /dev/sdb1 maj-min: 8:17
  ID-2: /boot raw-size: 511 MiB size: 510 MiB (99.80%) used: 190.6 MiB (37.4%) fs: vfat
    dev: /dev/sdb2 maj-min: 8:18
Swap:
  Alert: No swap data was found.
Sensors:
  Src: /sys System Temperatures: cpu: 34.0 C mobo: N/A gpu: nouveau temp: 43.0 C
  Fan Speeds (rpm): N/A
Info:
  Memory: total: 16 GiB available: 15.53 GiB used: 1.11 GiB (7.2%)
  Processes: 277 Power: uptime: 0h 46m states: freeze,mem,disk suspend: deep avail: s2idle
    wakeups: 0 hibernate: platform avail: shutdown, reboot, suspend, test_resume image: 6.2 GiB
    services: upowerd Init: systemd v: 255 default: graphical tool: systemctl
  Packages: 1366 pm: nix-default pkgs: 0 pm: nix-sys pkgs: 1331 libs: 245 pm: nix-usr pkgs: 35
    libs: 6 Compilers: gcc: 13.2.0 Shell: Zsh v: 5.9 default: Bash v: 5.2.26
    running-in: pty pts/0 (SSH) inxi: 3.3.34

Symptoms

$ zfs list
Permission denied the ZFS utilities must be run as root.
$ ls -la /dev/zfs
crw------- 1 root root 10, 249 Jul  9 06:49 /dev/zfs
$ sudo chmod 666 /dev/zfs
$ zfs list
NAME                                                   USED  AVAIL  REFER  MOUNTPOINT
revivajxo_srv                                         4.02T  1.31T   104K  /revivajxo_srv
...
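
Before digging into why, it's worth confirming that a udev rule for /dev/zfs is installed at all. A rough check (the rules directories are a guess and may differ on NixOS, where rules can live in the system closure rather than /etc):

$ grep -rs zfs /etc/udev/rules.d /lib/udev/rules.d 2>/dev/null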

Background

This appears to be a known issue: a new systemd feature interferes with the OpenZFS packaging, and there's an OpenZFS issue open on GitHub tracking it. I'll summarize a couple of relevant quotes:

The udev rule coming with zfs is not working anymore:
KERNEL=="zfs", MODE="0666", OPTIONS+="static_node=zfs"

The permissions are overwritten by kmod-static-nodes.service

mabod
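
The file that kmod-static-nodes.service generates at boot can be inspected directly; as far as I can tell it ends up under /run/tmpfiles.d/ (the exact filename is an assumption on my part):

$ grep zfs /run/tmpfiles.d/*.conf    # the static-node entry that wins over the udev rule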

To be clear: the udev rule is not obsolete. This is a bug, just a particularly annoying one.

systemd/systemd@b42482a was introduced in systemd 254-1 to help get /dev nodes up and running before udev is running. Alas, it has a bug such that it doesn’t always connect up properly with a udev rule for the node. A potential fix for that bug is in systemd/systemd#28681 and hopefully will land and make its way into distros soon.

In the meantime, the best workaround is (99%) the one described by @mabod upthread: create a file /etc/tmpfiles.d/zfs.conf with contents:

z /dev/zfs          0666 - -     -

(I’ve said /etc/tmpfiles.d as /etc is the right place for local modifications, but either will work).

The standard OpenZFS udev file is almost certainly on your system; I use Debian’s packaging of OpenZFS, and it drops it in /lib/udev/rules.d/90-zfs.rules. Without it you likely wouldn’t even have a /dev/zfs (unless your distro is doing something different, but they’ll definitely have something to create it).

robn
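
Once that tmpfiles.d file exists (on NixOS, after a rebuild with the snippet in the Workaround section below), the rule can also be applied immediately instead of waiting for the next boot. A sketch, using the stock systemd-tmpfiles:

$ sudo systemd-tmpfiles --create /etc/tmpfiles.d/zfs.conf
$ ls -la /dev/zfs    # the mode should now read crw-rw-rw-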

Following the linked systemd bug leads through a few failed attempts at fixing the issue and, finally, to a resolution supposedly merged into systemd 254.2, which should therefore also be present in v255 (the version my system is running).
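
The running systemd version is easy to confirm directly, which matters for deciding whether that fix should already be in place:

$ systemctl --version | head -n1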

Workaround

I can confirm that, as recommended in the comment referenced above, setting…

  environment.etc."tmpfiles.d/zfs.conf".text = ''
    z /dev/zfs          0666 - -     -
    '';

… in your NixOS configuration provides a workaround for this issue. That said, this still seems like a bug. Should I be taking further steps to resolve it in a less hacky way?
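
For reference, NixOS also exposes a first-class option for tmpfiles entries, which might be a slightly tidier way to express the same workaround (a sketch; I'm still using the environment.etc form above):

  systemd.tmpfiles.rules = [
    "z /dev/zfs 0666 - - -"    # same effect as the tmpfiles.d snippet above
  ];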