ZFS rollbacks suddenly stopped working

I have noticed that my ZFS rollbacks suddenly stopped succeeding. I have not changed these settings in a long time.

dmesg gives

[    6.390496] stage-1-init: [Fri Nov  1 09:48:18 UTC 2024] cannot open 'zroot/volatile@empty': dataset does not exist
[    6.595862] stage-1-init: [Fri Nov  1 09:48:18 UTC 2024] importing root ZFS pool "zroot"...

However,

$ zfs list -t snapshot
NAME                   USED  AVAIL  REFER  MOUNTPOINT
zroot/volatile@empty    80K      -    96K  -

My config settings:

boot.kernelPackages = config.boot.zfs.package.latestCompatibleLinuxPackages;
boot.initrd.kernelModules = [ "kvm-intel" ];
boot.initrd.luks.devices."luks".device = "/dev/disk/by-label/luks";
boot.initrd.postDeviceCommands = lib.mkAfter "zfs rollback -r zroot/volatile@empty";
boot.initrd.availableKernelModules = [
  "ahci"
  "ata_piix"
  "ehci_pci"
  "nvme"
  "ohci_pci"
  "rtsx_pci_sdmmc"
  "sd_mod"
  "sr_mod"
  "usb_storage"
  "usbhid"
  "xhci_pci"
];
fileSystems = let device = d: { device = "zroot/${d}"; fsType = "zfs"; neededForBoot = true; }; in {
  "/boot".device = "/dev/disk/by-label/boot";
  "/" = device "volatile";
  "/keep" = device "keep";
  "/nix" = device "nix";
};

Kernel version:

Linux 6.6.58 #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024 x86_64 GNU/Linux

Also I have noticed some weird jumps in kernels:

$ journalctl -x  | rg 'Linux version'
Sep 07 02:46:38 kernel: Linux version 6.6.48 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42) #1-NixOS SMP PREEMPT_DYNAMIC Thu Aug 29 15:33:59 UTC 2024
Sep 11 16:43:43 kernel: Linux version 6.10.9 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42) #1-NixOS SMP PREEMPT_DYNAMIC Sun Sep  8 05:56:41 UTC 2024
Sep 14 13:01:47 kernel: Linux version 6.10.9 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42) #1-NixOS SMP PREEMPT_DYNAMIC Sun Sep  8 05:56:41 UTC 2024
Sep 20 00:04:41 kernel: Linux version 6.10.10 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42) #1-NixOS SMP PREEMPT_DYNAMIC Thu Sep 12 09:13:13 UTC 2024
Sep 25 07:11:10 kernel: Linux version 6.6.52 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42) #1-NixOS SMP PREEMPT_DYNAMIC Wed Sep 18 17:24:10 UTC 2024
Sep 30 10:01:04 kernel: Linux version 6.6.52 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42) #1-NixOS SMP PREEMPT_DYNAMIC Wed Sep 18 17:24:10 UTC 2024
Okt 02 14:51:08 kernel: Linux version 6.6.52 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42) #1-NixOS SMP PREEMPT_DYNAMIC Wed Sep 18 17:24:10 UTC 2024
Okt 11 08:59:18 kernel: Linux version 6.6.54 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42) #1-NixOS SMP PREEMPT_DYNAMIC Fri Oct  4 14:30:05 UTC 2024
Okt 14 15:01:57 kernel: Linux version 6.6.54 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42) #1-NixOS SMP PREEMPT_DYNAMIC Fri Oct  4 14:30:05 UTC 2024
Okt 23 18:59:55 kernel: Linux version 6.6.57 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Thu Oct 17 13:24:38 UTC 2024
Okt 24 21:15:17 kernel: Linux version 6.6.57 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Thu Oct 17 13:24:38 UTC 2024
Okt 24 22:16:36 kernel: Linux version 6.6.57 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Thu Oct 17 13:24:38 UTC 2024
Okt 25 10:09:39 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 27 02:40:41 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 27 15:04:37 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 27 15:36:10 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 27 15:41:38 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 28 12:50:37 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 28 22:22:24 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 29 01:59:40 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 29 02:27:14 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 29 14:59:12 kernel: Linux version 6.6.52 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42) #1-NixOS SMP PREEMPT_DYNAMIC Wed Sep 18 17:24:10 UTC 2024
Okt 29 15:08:23 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 29 15:31:00 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Okt 30 14:38:30 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024
Nov 01 10:48:20 kernel: Linux version 6.6.58 (nixbld@localhost) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 22 13:46:36 UTC 2024

I am using a bleeding-edge, flake-based unstable setup with almost daily updates.

It seems kernel 6.10, which was the latest version ZFS supported, was removed for being EOL upstream. That's why the kernel switched to 6.6 (LTS).

I changed

- boot.initrd.postDeviceCommands = lib.mkAfter "zfs rollback -r zroot/volatile@empty";
+ boot.initrd.postDeviceCommands = lib.mkAfter "zpool import zroot; zfs rollback -r zroot/volatile@empty";

which for now works again.

If anyone knows why that became necessary suddenly, that would be good to know.

Setting boot.kernelPackages to config.boot.zfs.package.latestCompatibleLinuxPackages is potentially problematic, because kernels reach the end of their support timeline, and when that happens the kernel in use is actually reverted to an earlier (most likely LTS) version.
What the consequences are depends on the changes to ZFS in the period between the releases of each kernel version, but in general what happens is undefined, which makes it a dangerous thing to do.

A better option is to just pin the kernel to a specific version, and manually update it when a new kernel version is released, after verifying that that release actually has ZFS support (not all kernel releases have openzfs support).
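Pinning as described above could look roughly like the following. This is a minimal sketch, assuming the attribute pkgs.linuxPackages_6_6 is still present in the nixpkgs revision you build against; the 6.6 series is chosen only because it is the kernel shown in the logs above, not because it is a recommendation.

```nix
{ pkgs, ... }:
{
  # Pin the kernel series explicitly instead of following
  # config.boot.zfs.package.latestCompatibleLinuxPackages.
  # Assumption: linuxPackages_6_6 exists in your nixpkgs revision.
  boot.kernelPackages = pkgs.linuxPackages_6_6;
}
```

With a pin like this, the kernel series changes only when you edit this line yourself, so a series being dropped from nixpkgs shows up as an evaluation error rather than a silent switch to another version.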

As-is, I’d say your ZFS array is in an undefined state.


See the threads "zfs.latestCompatibleLinuxPackages is deprecated" and "Zfs latestCompatibleLinuxPackages" for context.


This is because the boot phase in which ZFS pools are imported during stage 1 was moved slightly, so you should now use postResumeCommands instead of postDeviceCommands. TL;DR: This is an important change to minimize the risk of data loss when using hibernation with ZFS, but it remains a risk so hibernation is still disabled by default when ZFS is in use.
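Applied to the configuration from the original post, the change would presumably look like this. It is a sketch, assuming the scripted stage-1 initrd (not the systemd-based one) and the same dataset and snapshot names as in the question.

```nix
{ lib, ... }:
{
  # Run the rollback from postResumeCommands, which executes after the
  # pool has been imported in stage 1, instead of postDeviceCommands.
  # No manual "zpool import" is needed in this hook.
  boot.initrd.postResumeCommands = lib.mkAfter ''
    zfs rollback -r zroot/volatile@empty
  '';
}
```

This replaces the boot.initrd.postDeviceCommands line entirely, including the manual "zpool import zroot" workaround.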


Wow, thanks a lot! I don't know how I would ever have found out about this otherwise.


Thanks a lot, that is very important to know. Would pinning a kernel also mean having to rebuild it once it goes EOL upstream, since it will then also be dropped from nixpkgs? Does it make sense to use the LTS kernel, then?

I must admit I am still not sure what the best approach would be to stay on the newest ZFS-compatible kernel without breakage or likely future rebuilds.

The LTS kernel is the "safe" choice. I don't see a strong reason to choose a newer kernel unless you know exactly which feature you need from some x.y version.
