Hello, I’ve been running NixOS 21.05 + ZFS with no problems since this past spring, using kernelPackages = pkgs.linuxPackages_latest; in my config (e.g. kernel version 5.12).
Recently I’ve started getting the build error below, where the ZFS package is now marked as broken. The error looks like it’s trying to pull kernel 5.14 and run it with ZFS module 2.0.5, and doesn’t like that combo.
Anyone know what the problem is here, and how to fix it?
bgibson@z11pa-d8:~ (*)
> sudo nixos-rebuild -v --show-trace dry-build
...
error: while evaluating the attribute 'buildCommand' of the derivation 'nixos-system-z11pa-d8-21.05.3248.6120ac5cd20' at /nix/store/xkjafds79w6s17nf0jkzdcfw09b1s2zj-nixos-21.05.3248.6120ac5cd20/nixos/pkgs/stdenv/generic/make-derivation.nix:201:11:
while evaluating 'optionalString' at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/strings.nix:202:5, called from /nix/var/nix/profiles/per-user/root/channels/nixos/nixos/modules/system/activation/top-level.nix:36:9:
while evaluating the attribute 'passAsFile' of the derivation 'kernel-modules' at /nix/store/xkjafds79w6s17nf0jkzdcfw09b1s2zj-nixos-21.05.3248.6120ac5cd20/nixos/pkgs/stdenv/generic/make-derivation.nix:201:11:
while evaluating the attribute 'passAsFile' at /nix/store/xkjafds79w6s17nf0jkzdcfw09b1s2zj-nixos-21.05.3248.6120ac5cd20/nixos/pkgs/build-support/buildenv/default.nix:77:5:
while evaluating anonymous function at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/types.nix:358:14, called from undefined position:
while evaluating the attribute 'value' at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:570:27:
while evaluating anonymous function at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:559:17, called from /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:559:12:
while evaluating 'check' at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/types.nix:349:15, called from /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:559:22:
while evaluating the attribute 'handled' at /nix/store/xkjafds79w6s17nf0jkzdcfw09b1s2zj-nixos-21.05.3248.6120ac5cd20/nixos/pkgs/stdenv/generic/check-meta.nix:302:7:
while evaluating 'handleEvalIssue' at /nix/store/xkjafds79w6s17nf0jkzdcfw09b1s2zj-nixos-21.05.3248.6120ac5cd20/nixos/pkgs/stdenv/generic/check-meta.nix:188:38, called from /nix/store/xkjafds79w6s17nf0jkzdcfw09b1s2zj-nixos-21.05.3248.6120ac5cd20/nixos/pkgs/stdenv/generic/check-meta.nix:303:14:
Package ‘zfs-kernel-2.0.5-5.14.5’ in /nix/store/xkjafds79w6s17nf0jkzdcfw09b1s2zj-nixos-21.05.3248.6120ac5cd20/nixos/pkgs/os-specific/linux/zfs/default.nix:175 is marked as broken, refusing to evaluate.
a) To temporarily allow broken packages, you can use an environment variable
for a single invocation of the nix tools.
$ export NIXPKGS_ALLOW_BROKEN=1
b) For `nixos-rebuild` you can set
{ nixpkgs.config.allowBroken = true; }
in configuration.nix to override this.
c) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
{ allowBroken = true; }
to ~/.config/nixpkgs/config.nix.
bgibson@z11pa-d8:~ (*)
> zfs --version
zfs-2.0.5-1
zfs-kmod-2.0.5-1
bgibson@z11pa-d8:~ (*)
> uname -a
Linux z11pa-d8 5.12.15 #1-NixOS SMP Wed Jul 7 12:26:52 UTC 2021 x86_64 GNU/Linux
bgibson@z11pa-d8:~ (*)
> nix-shell -p nix-info --run "nix-info -m"
...
- system: `"x86_64-linux"`
- host os: `Linux 5.12.15, NixOS, 21.05.1408.9376bf7b342 (Okapi)`
- multi-user?: `yes`
- sandbox: `yes`
- version: `nix-env (Nix) 2.3.12`
- channels(root): `"nixos-21.05.3248.6120ac5cd20"`
- nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`
You can cherry-pick that PR patch to a local nixpkgs checkout, and use it via the -I option to nixos-rebuild. Since only the modules have changed it shouldn’t cause a kernel rebuild.
Since I am the one partly responsible for this situation I will try to shed some light.
It’s probably a bit too obvious but nevertheless important: The ZFS driver is not part of the mainline Linux kernel but instead provided as a kernel module. This is necessary because ZFS has to perform low-level filesystem operations which are not part of the syscall interface.
ZFS has to make use of quite a number of subsystems in the Linux kernel, which change every once in a while. When such a change occurs the ZFS developers have to sit down and adapt their module code. This usually takes a while, so when a new stable Linux kernel is released it takes a few days or weeks for ZFS to catch up.
This is why, in the pull request “zfs: 2.0.2 -> 2.0.3” (NixOS/nixpkgs #112910), I decided to put in a safeguard that just marks ZFS as broken when the kernel version is not officially supported. Experienced users can circumvent this by simply setting nixpkgs.config.allowBroken = true; and compiling it anyway.
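To make the idea of the safeguard concrete, here is a minimal sketch of a kernel version check written as a plain Nix expression. It is not the actual nixpkgs code: the names minKernel, firstUnsupported, isSupported and isBroken are hypothetical, and the version bounds are assumed values chosen only to match the situation in this thread.

```nix
# Illustrative only: NOT the actual nixpkgs implementation.
let
  # Assumed supported kernel range for this sketch (hypothetical values).
  minKernel = "3.10";
  firstUnsupported = "5.14";

  # A kernel counts as supported if minKernel <= version < firstUnsupported.
  isSupported = version:
    builtins.compareVersions version minKernel >= 0
    && builtins.compareVersions version firstUnsupported < 0;

  # Mirrors the idea of the safeguard: broken when outside the range.
  isBroken = version: !(isSupported version);
in
{
  running = isBroken "5.12.15"; # the kernel from `uname -a` above
  latest = isBroken "5.14.5";   # the kernel named in the error message
}
```

Saved to a file, this can be evaluated with nix-instantiate --eval --strict to see which of the two kernels falls outside the assumed range.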
Unfortunately, marking it as broken has a downside in terms of usability. When ZFS is marked as broken, evaluation terminates, and ofborg correctly picks up the package as broken and skips the check. The problem is that ofborg only does this if there is no other output on stdout, so the only message that can be displayed to the user is the generic “Package … is marked as broken, refusing to evaluate.” message. It would be more useful to tell the user to try zfsUnstable first, but that’s not possible. See the pull request “Fix ofborg evaluation failure for zfs with unsupported kernels” (NixOS/nixpkgs #122478).
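For anyone who wants to try the zfsUnstable route mentioned above, a configuration.nix fragment along these lines may be a starting point. It assumes the boot.zfs.enableUnstable option that NixOS releases of this era provide. Whether zfsUnstable actually supports a given kernel is not guaranteed, so running a dry-build first is a sensible precaution.

```nix
{ config, pkgs, ... }:
{
  # Keep following the newest kernel in the channel.
  boot.kernelPackages = pkgs.linuxPackages_latest;

  # Assumption: switch to the zfsUnstable package, which may (or may not)
  # already support the newer kernel. Unstable filesystem code carries risk.
  boot.zfs.enableUnstable = true;
}
```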
Thanks for the explanation, and I fully support you guys erring on the side of caution, especially with filesystem-related stuff.
Is there any way to keep the stable channel and get my build’s kernel version back to 5.12? That was working fine with ZFS 2.0.5-1, but now all the boot.kernelPackages = pkgs.linuxPackages variants pull some other version:
boot.kernelPackages = pkgs.linuxPackages_latest; now pulls 5.14
boot.kernelPackages = pkgs.linuxPackages; pulls 5.10
boot.kernelPackages = pkgs.linuxPackages_5_12; is broken (attribute 'linuxPackages_5_12' missing, at /etc/nixos/configuration.nix:141:22)
boot.kernelPackages = config.zfs.package.latestCompatibleLinuxPackages; doesn’t work in stable
That issue is closed, but are there any other problems with kernel 5.10 and ZFS 2.0.x (2.0.5 in my case), or are they fully compatible now?
For the time being, I can either keep running my prior 5.12 + ZFS 2.0.5 build, which is working fine despite 5.12 being EOL’d,
or I can run a new build that regresses the kernel back to 5.10 but pulls in the other application updates I want. That would be fine, unless there are still problems with 5.10 and ZFS 2.0.
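The second option (falling back to 5.10) could also be written as an explicit pin rather than relying on whatever pkgs.linuxPackages currently points to. This is only a sketch: it assumes the linuxPackages_5_10 attribute is present in the channel in use, and it says nothing about whether 5.10 and ZFS 2.0.x have any remaining issues.

```nix
{ config, pkgs, ... }:
{
  # Assumption: pin the 5.10 series explicitly so that a channel update
  # cannot silently move the system to a kernel ZFS does not support yet.
  boot.kernelPackages = pkgs.linuxPackages_5_10;
}
```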
Generally we just follow what ZFS upstream says it supports. If it failed to evaluate, I wouldn’t push it (as it will likely fail to build anyway), but if it’s successful, then you should be good.
What about the option of keeping recently EOL’d kernels like 5.12 and 5.13 until the newest kernel fully builds with all of nixpkgs, but making the EOL kernels explicitly opt-in by requiring boot.kernelPackages = pkgs.linuxPackages_5_12; or _5_13, or perhaps boot.kernelPackages = pkgs.linuxPackages_5_12_eol to really force an explicit opt-in?
I ask b/c 5.12 + ZFS 2.0.5 was working fine for me, and still is in fact. I wouldn’t mind continuing to use 5.12 or 5.13 for a few more months until 5.14 builds with ZFS. I know they’re EOL’d and no longer receiving security updates, so I keep an eye on vulnerability announcements. If something serious comes out, I can roll back to pkgs.linuxPackages if needed.
Basically, preserve user agency but make it explicit, and assume that anyone who explicitly opts into an EOL’d kernel has a good reason for it and knows what they’re doing.