Pi4 broken on 23.11 -- UEFI / ZFS root

Just searching for ideas –

I have a Pi4 on ZFS root using UEFI that’s been doing great on 23.05. My boot partition is zfs with compatibility=grub2 as recommended by OpenZFS. I updated to 23.11 yesterday, the nixos-rebuild test went fine so I went ahead and switched.

Reboot never came up. The device is usually tucked away and headless, so I pulled it out, hooked up a keyboard and HDMI, and found this right after grub: error: symbol grub_is_shim_lock_enabled not found. Same error with any prior generation.

Found NixOS doesn’t boot: symbol grub_is_shim_lock_enabled not found · Issue #243026 · NixOS/nixpkgs · GitHub – makes sense that something borked grub so it doesn’t matter what generation I try to use.

Booted to a 23.11 rescue image, was able to nixos-enter and change my config to use efiInstallAsRemovable = true; and nixos-install --install-bootloader switch ran without errors.
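For anyone following along, the removable-install change looks roughly like the sketch below. Only efiInstallAsRemovable comes from the post; the other lines (device = "nodev", efiSupport, zfsSupport, the /boot/efi mount point) are assumptions about a typical GRUB-on-UEFI setup with a ZFS /boot, not the poster's actual configuration.

```nix
{
  boot.loader.grub = {
    enable = true;
    device = "nodev";              # assumed: EFI-only install, no BIOS boot device
    efiSupport = true;             # assumed
    zfsSupport = true;             # assumed: /boot lives on a ZFS pool
    # From the post: install GRUB to the removable-media fallback path
    # (EFI/BOOT/BOOTAA64.EFI on aarch64) instead of registering a boot entry.
    efiInstallAsRemovable = true;
  };
  boot.loader.efi = {
    efiSysMountPoint = "/boot/efi"; # assumed mount point for the ESP
    # Must stay false while efiInstallAsRemovable is set; NixOS asserts on this.
    canTouchEfiVariables = false;
  };
}
```

Installing to the fallback path means the firmware does not need a writable boot entry, which is why it helps when the installer was not booted in EFI mode.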

Now after reboot it gets a little further: I see a glimpse of Stage 1, then the HDMI screen goes blank and nothing else happens. It never comes up on the network (Ethernet).

Anyone else running a similar setup? Any suggestions?

Well, recommended…

Likely contributed by someone, but it seems opinionated and I don’t necessarily agree.

grub was updated to 2.12rc1 for 23.11, which has caused some fallout: grub2: 2.06 -> unstable-2023-07-03 by K900 · Pull Request #240887 · NixOS/nixpkgs · GitHub

Your setup uses a zfs /boot pool, which provides little benefit yet requires you to use GRUB’s broken zfs support for it to be able to find your kernel/initramfs.

I would suggest following the recommendations outlined in the NixOS manual, that is mounting your (fat32) ESP at /boot and using systemd-boot instead of grub, i.e. set boot.loader.systemd-boot.enable = true
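A minimal sketch of what that suggestion might look like in configuration.nix. Only boot.loader.systemd-boot.enable is taken from the text above; disabling GRUB explicitly, the configurationLimit value, and leaving EFI variables untouched are assumptions chosen for a small ESP on a Pi.

```nix
{
  # Switch bootloaders: GRUB off, systemd-boot on.
  boot.loader.grub.enable = false;
  boot.loader.systemd-boot.enable = true;

  # Assumed value: keep only a few generations on the ESP to save space.
  boot.loader.systemd-boot.configurationLimit = 5;

  # The FAT32 ESP is mounted directly at /boot, as the manual recommends.
  boot.loader.efi.efiSysMountPoint = "/boot";

  # Assumption: do not try to write EFI boot entries from the OS.
  boot.loader.efi.canTouchEfiVariables = false;
}
```

With this layout the kernel and initrd are copied onto the ESP itself, so the bootloader never has to read ZFS.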

EDIT: maybe-related Boot broken on Raspberry Pi CM4 when using linux_rpi4 kernel package · Issue #270343 · NixOS/nixpkgs · GitHub

Thanks for such a thorough and thoughtful response!

Interestingly, I was following the advice from the NixOS Wiki, which at the time of installation seemed like an unambiguous endorsement of following the OpenZFS instructions:

It has since been updated to be far less charitable about the OpenZFS guide:

Warning: This guide is not endorsed by NixOS and some features like immutable root do not have upstream support and could break on updates. If an issue arises while following this guide, please consult the guides support channels.

Based on Stage 1 starting, I assume it’s at least finding the kernel and initramfs okay. The advantages I was hoping to leverage include ZFS snapshots, ZFS send / receive for backing up as much as possible, and compression (this is a raspberry pi – please see the dozens of posts about “boot partition out of space” issues before deciding this is completely silly)…

$ zfs get compressratio bpool
NAME   PROPERTY       VALUE  SOURCE
bpool  compressratio  1.27x  -

Personally I don’t consider these to be “little” benefit, although I suppose one might be assuming that NixOS lets me regenerate the boot partition with little trouble. Unfortunately I’ve not found this to be the case with Pi4 on UEFI, which refuses to boot from the standard aarch64 ISO. That leaves me booting from the sd-image, which is not in EFI mode, so grub can’t install the EFI bootloader stuff unless I change my configuration to efiInstallAsRemovable = true; and then change it back and reinstall once things are booting. NixOS on an RPi with ZFS root + UEFI has been fine when it works, but when it doesn’t it’s been a handful, so I was hoping a ZFS boot partition would help make sure I could recover with minimal headache.

Clearly my hopes were in vain :laughing:

For full context, my other motivating factor was that I have other NixOS on ZFS root devices that are booting from ZFS mirrored boot partitions; my understanding is that grub is the only option for booting from a ZFS mirror. Sticking to grub for all my devices seemed like it would minimize what I’d need to learn (as opposed to learning intricacies and failure modes of multiple bootloaders).

Maybe I’ll give a shot to systemd-boot on this machine and see how it goes. I’m particularly interested in the potential of [WIP] nixos/systemd-boot: boot counting and automatic fallback by danielfullmer · Pull Request #84204 · NixOS/nixpkgs · GitHub

I guess in the end it’s a matter of preference, but using systemd-boot is much more aligned with NixOS defaults and less prone to breakage whereas zfs support in grub is fragile in general.

Yeah that would exactly be my reasoning - /boot is disposable.

Used to work fine for me on RPi3 when I last installed it - the installer ISO uses grub as well, so that would explain why it doesn’t work if grub is the issue here.

Would be nice if we can move nixos/iso-image: extract GRUB EFI boot logic and offer systemd-boot as an alternative by RaitoBezarius · Pull Request #246441 · NixOS/nixpkgs · GitHub forward

Regarding /boot space, I feel you:

# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/mmcblk0p1  253M  208M   45M  83% /boot

Main issue here is that AArch64 kernels are currently not compressed, there’s another PR that should help: linux: enable EFI_ZBOOT for generic compression support by RaitoBezarius · Pull Request #239721 · NixOS/nixpkgs · GitHub

You could try to rebuild the iso with systemd-boot to see if that works - if it does then it would likely fix your install as well.

Short update (short because my kids are dying for some play time and I’ve been fixing ZFS grub UEFI issues all morning) :laughing:

Thank you again for your time and attention!


Update: gave a shot to switching to systemd-boot today.

  1. sudo cp -a /boot/. /boot.bak
  2. sudo umount -R /boot
  3. sudo rm -r /boot/* (though preserving my raspberry pi UEFI stuff)
  4. Minor config changes such as boot.loader.systemd-boot.enable, deleting the mountpoint for my ZFS bpool, changing /boot/efi to /boot
  5. sudo nixos-rebuild --install-bootloader switch
  6. reboot
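The mount changes in step 4 could look something like the sketch below. The device path is a placeholder and the exact attribute layout is an assumption, since the actual hardware configuration was not posted; the point is only that the ZFS bpool mount goes away and the FAT32 ESP moves from /boot/efi to /boot.

```nix
{
  # Removed (assumed previous layout): the ZFS boot pool mounted at /boot
  # and the ESP mounted at /boot/efi.

  # The ESP is now mounted directly at /boot.
  fileSystems."/boot" = {
    device = "/dev/disk/by-uuid/XXXX-XXXX"; # placeholder, use your ESP's UUID
    fsType = "vfat";
  };
}
```

If the ESP was referenced elsewhere (for example in boot.loader.efi.efiSysMountPoint), that path would need the same /boot/efi to /boot change.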

One new warning, otherwise so easy it was quite boring. Thanks for the kick in the pants, I didn’t realize it was the blessed approach in the nixos manual.

Barely. If you test it out, grub actually just drops the ball and fails when a vdev is missing a disk, even if there’s sufficient redundancy.

Oh really? So there is no option that really “uses” mirrored ZFS boot pools?
