Fstab has wrong disk UUID and fails boot after re-install

Having a bit of time on my hands, I decided to fix a mistake I made when originally setting up my workstation. I’ve been running a ZFS root (rpool) using a zvol for swap, which I learned afterwards is a no-no (see Swap deadlock in 0.7.9 · Issue #7734 · openzfs/zfs · GitHub). As a workaround, I’ve been running with swap disabled to prevent the lock-ups but now I want to fix it.

After backing up rpool with zfs send/recv to another system, I re-partitioned the disk to have a separate swap partition which is encrypted with LUKS (currently unused because I also activated zram swap). I also switched from a LUKS-encrypted rpool to native encryption. Other than the adjustments below, configuration.nix is the same as the successfully running 20.03 installation I had before.

The problem is that the stage-2 bootloader attempts to mount the old UUID for /boot (which doesn’t exist anymore) rather than the new UUID, and I don’t know how to force it to update. I expected that the configuration would change correctly when I installed 20.09 using nixos-install, but apparently not, because it hangs with the journal error message:

Dependency failed for File System Check on /dev/disk/by-uuid/768F-1DF8
Dependency failed for /boot
Dependency failed for Local File system

That is the old UUID not the new one A1E2-24A7. Suggestions?

Details:

  • NixOS: 20.09.2718.e4adbfbab8a (minimal installer downloaded this week)
  • Platform: amd64, UEFI, systemd-boot (not grub2)
  • Partitioning: (5 disks)
    • /dev/sda1: native encrypted ZFS (rpool)
    • /dev/sda3: LUKS-encrypted swap (currently unused)
    • /dev/sda2: EFI vfat for /boot
    • /dev/sd{b,c,d,e}1: LUKS-encrypted partitions for ZFS pool (dpool) # legacy

Relevant configuration:

  • configuration.nix
    • boot.initrd.supportedFilesystems = [ "zfs" ]; # unchanged
    • boot.initrd.luks.forceLuksSupportInInitrd = true; # new; provides cryptsetup, since we are no longer using LUKS encryption with ZFS for rpool but cryptsetup is still needed to decrypt the legacy dpool
    • boot.postMountCommands = ''
      cryptsetup open … /dev/disk/by-partuuid/… cdiskb
      cryptsetup open … /dev/disk/by-partuuid/… cdiskc
      cryptsetup open … /dev/disk/by-partuuid/… cdiskd
      cryptsetup open … /dev/disk/by-partuuid/… cdiske
      ''; # switched from /dev/sd{b,c,d,e}1 to using UUIDs but otherwise unchanged; legacy dpool
    • zramSwap = { enable = true; algorithm = "zstd"; priority = 32767; memoryPercent = 10; }; # new
  • hardware-configuration.nix
    • fileSystems."/" = { device = "rpool/root/nixos"; fsType = "zfs"; };
    • fileSystems."/boot" = { device = "/dev/disk/by-uuid/A1E2-24A7"; fsType = "vfat"; }; # used to be 768F-1DF8 with old config
    • fileSystems."/home" = { device = "dpool/home"; fsType = "zfs"; };

Incidentally, is there a better way to decrypt the dpool disks rather than using postMountCommands?

Solved. I’m documenting the solution in case others hit the same problem.

One fact I didn’t mention (because I didn’t think it relevant) is that in the working 20.03 installation /etc/nixos/{configuration,hardware-configuration}.nix are really symlinks into a /etc/nixos/cfg/hosts/box1 directory. In reinstalling with 20.09, I cloned the old config, ran nixos-generate-config, replaced /etc/nixos/configuration.nix with a symlink to /etc/nixos/cfg/hosts/box1/configuration.nix, and updated the config as outlined in the previous post. I used the generated /etc/nixos/hardware-configuration.nix without moving it into /etc/nixos/cfg/hosts/box1 and symlinking it into place. I thought that nixos-install would use /etc/nixos/hardware-configuration.nix and ignore the other, but I was wrong. For some reason it was getting the original version of the file and hence was creating an /etc/fstab with the wrong UUID for /boot. I don’t know if this is intended behavior or not. It sure caught me by surprise.
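A mismatch like this can probably be caught before rebooting by comparing the /boot entry in the generated fstab with the UUID the new partition actually has. The following is only a sketch under assumptions: the helper name boot_uuid_in_fstab is made up for illustration, the fstab is a small sample file rather than the real one, and it only handles entries written as /dev/disk/by-uuid/<UUID>.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the UUID of the /boot entry in an fstab file,
# assuming the device column looks like /dev/disk/by-uuid/<UUID>.
boot_uuid_in_fstab() {
    awk '$1 !~ /^#/ && $2 == "/boot" { sub(".*/by-uuid/", "", $1); print $1 }' "$1"
}

# Sample fstab standing in for the one nixos-install generated
# (on the installer that would presumably be /mnt/etc/fstab).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# sample only
rpool/root/nixos / zfs defaults 0 0
/dev/disk/by-uuid/768F-1DF8 /boot vfat defaults 0 2
EOF

expected="A1E2-24A7"   # UUID of the new EFI partition, from the post above
found=$(boot_uuid_in_fstab "$sample")

if [ "$found" = "$expected" ]; then
    status="ok"
else
    status="mismatch"
fi
echo "/boot in fstab: $found, expected: $expected -> $status"
rm -f "$sample"
```

On a real system the expected value could be read with something like blkid -s UUID -o value /dev/sda2 instead of being hard-coded.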

Just ran across this exact same bug. I’m on nixos-minimal-21.05.4361.43bbfd4994c-aarch64-linux.iso. This really seems like a bug to me. I did a completely fresh install and still hit this issue.

@mkg What was your solution/workaround to this?

I’m guessing that when configuration.nix does an import of hardware-configuration.nix, that path is evaluated from the location of configuration.nix after the symlinks have been followed.
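That guess can be illustrated without Nix at all. The sketch below rebuilds the described layout in a throwaway directory and shows which hardware-configuration.nix sits next to the real configuration.nix once the symlink is resolved. It assumes, as the guess above does and without confirming it here, that a relative import is looked up next to the symlink target; the file contents are placeholders.

```shell
#!/bin/sh
set -eu

# Throwaway stand-in for /etc/nixos; nothing here touches the real system.
root=$(mktemp -d)
mkdir -p "$root/cfg/hosts/box1"

# Old hardware config kept in the cloned repo, new one generated at the top level.
echo 'old: 768F-1DF8' > "$root/cfg/hosts/box1/hardware-configuration.nix"
echo 'new: A1E2-24A7' > "$root/hardware-configuration.nix"

# configuration.nix is a symlink into the repo, as in the reinstall.
touch "$root/cfg/hosts/box1/configuration.nix"
ln -s "$root/cfg/hosts/box1/configuration.nix" "$root/configuration.nix"

# Resolve the symlink, then look for the sibling file next to the real file.
real=$(readlink -f "$root/configuration.nix")
import_dir=$(dirname "$real")
picked=$(cat "$import_dir/hardware-configuration.nix")

echo "relative import would resolve in: $import_dir"
echo "picked up: $picked"
```

If that is what happens, the freshly generated file at the top level is never read, which would explain the stale UUID. Moving the generated hardware-configuration.nix next to the real configuration.nix (and symlinking it back, as in the old setup) should avoid it.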