tl;dr: ZFS is unable to import the pool at boot when it’s on a LUKS partition.
I would like to install NixOS on my laptop with the following requirements:
Full disk encryption covering both the root filesystem and swap
Use ZFS as root filesystem
Originally I was going to use a single encrypted ZFS pool to host both the filesystem and swap; however, due to a known issue with swap on ZFS (GH openzfs issue), I chose to take an alternative route.
The plan: set up a LUKS-encrypted partition with two sub-partitions, one for the root filesystem and one for swap. I found an article that describes such a setup on Arch Linux (Arch Linux on an encrypted ZFS root system - 2017).
I tried to reproduce it in a VM, but I get an error on boot: once the LUKS partition is decrypted, ZFS is unable to find the pool (screenshot below).
Someone had a similar problem (Cannot import ZFS pool at boot - 2019) but didn’t have the LUKS layer, which I suspect is where the problem lies, as I was able to set up a working system without LUKS.
Below is the script I use to set up the partitions and the system.
Note that it doesn’t use LVM on LUKS, as I think (without knowing any better) that it is an unnecessary layer.
#!/usr/bin/env bash
# NixOS install with encrypted root and swap
#
# sda
# ├─sda1             BOOT
# └─sda2             LINUX (LUKS CONTAINER)
#   └─cryptroot      LUKS MAPPER
#     ├─cryptroot1   ZFS
#     └─cryptroot2   SWAP (omitted for conciseness)
set -e
pprint () {
  local cyan="\e[96m"
  local default="\e[39m"
  # ISO8601 timestamp + ms
  local timestamp
  timestamp=$(date +%FT%T.%3NZ)
  echo -e "${cyan}${timestamp} $1${default}" 1>&2
}
# Set DISK
select ENTRY in $(ls /dev/disk/by-id/);
do
  DISK="/dev/disk/by-id/$ENTRY"
  echo "Installing system on $ENTRY."
  break
done
read -p "> Do you want to wipe all data on $ENTRY ?" -n 1 -r
echo # move to a new line
if [[ "$REPLY" =~ ^[Yy]$ ]]
then
  # Clear disk
  wipefs -af "$DISK"
  sgdisk -Zo "$DISK"
fi
pprint "Creating boot (EFI) partition"
sgdisk -n 0:1M:+513M -t 0:EF00 "$DISK"
BOOT="$DISK-part1"
pprint "Creating Linux partition"
sgdisk -n 0:0:+10Gib -t 0:BF01 "$DISK"
LINUX="$DISK-part2"
# Inform kernel
partprobe "$DISK"
sleep 1
pprint "Format BOOT partition $BOOT"
mkfs.vfat "$BOOT"
pprint "Creating LUKS container on $LINUX"
cryptsetup --type luks2 luksFormat "$LINUX"
LUKS_DEVICE_NAME=cryptroot
cryptsetup luksOpen "$LINUX" "$LUKS_DEVICE_NAME"
LUKS_DISK="/dev/mapper/$LUKS_DEVICE_NAME"
# ZFS partition (created inside the opened LUKS mapper device)
sgdisk -n 0:0:0 -t 0:BF01 "$LUKS_DISK"
ZFS="${LUKS_DISK}1"
# SWAP omitted for conciseness
echo "zfs disk $ZFS"
# Inform kernel
partprobe "$LUKS_DISK"
sleep 1
pprint "Create ZFS pool"
# -f force
# -m mountpoint
zpool create -f -m none -R /mnt rpool "$ZFS"
pprint "Create ZFS datasets"
zfs create -o mountpoint=legacy rpool/root
zfs create -o mountpoint=legacy rpool/root/nix
zfs create -o mountpoint=legacy rpool/home
zfs snapshot rpool/root@blank
pprint "Mount ZFS datasets"
mount -t zfs rpool/root /mnt
mkdir /mnt/nix
mount -t zfs rpool/root/nix /mnt/nix
mkdir /mnt/home
mount -t zfs rpool/home /mnt/home
mkdir /mnt/boot
mount "$BOOT" /mnt/boot
pprint "Generate NixOS configuration"
nixos-generate-config --root /mnt
# Add LUKS and ZFS configuration
HOSTID=$(head -c8 /etc/machine-id)
LINUX_DISK_UUID=$(blkid --match-tag UUID --output value "$LINUX")
HARDWARE_CONFIG=$(mktemp)
cat <<CONFIG > "$HARDWARE_CONFIG"
networking.hostId = "$HOSTID";
boot.initrd.luks.devices."$LUKS_DEVICE_NAME".device = "/dev/disk/by-uuid/$LINUX_DISK_UUID";
boot.zfs.devNodes = "$ZFS";
CONFIG
pprint "Append configuration to hardware-configuration.nix"
sed -i "\$e cat $HARDWARE_CONFIG" /mnt/etc/nixos/hardware-configuration.nix
You can then complete the install with nixos-install --root /mnt.
Thank you in advance for your help.
There are numerous resources online that describe how to set up ZFS as the root filesystem with an encrypted pool, but they either put swap on a separate, unencrypted partition or make no mention of swap altogether.
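Before running the install, it may be worth sanity-checking the value the script writes to networking.hostId, since NixOS expects eight hexadecimal characters there. The snippet below is only a sketch: the function name is_valid_hostid is made up for illustration and is not part of the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script):
# succeed only if the argument is exactly eight hexadecimal digits.
is_valid_hostid () {
  case "$1" in
    # Reject anything containing a non-hex character.
    *[!0-9a-fA-F]*) return 1 ;;
  esac
  [ "${#1}" -eq 8 ]
}

# Same derivation as in the script: first 8 characters of the machine ID.
if [ -r /etc/machine-id ]; then
  HOSTID=$(head -c8 /etc/machine-id)
  is_valid_hostid "$HOSTID" || echo "unexpected hostId: '$HOSTID'" 1>&2
fi
```

If the check prints a warning, the generated hardware-configuration.nix would likely be rejected at evaluation time, so it is cheaper to catch it here.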
Thanks @dalto. Adding LVM worked (see revised script below).
I did some additional investigation before adding LVM. On the reboot following the NixOS install, I was able to inspect the decrypted LUKS partition (via the rescue shell), which showed that no sub-partition had been created. That makes me think something is wrong with the command that creates the root partition on the LUKS device.
# In the original script of this post
# ZFS partition
sgdisk -n 0:0:0 -t 0:BF01 "$LUKS_DISK"
ZFS="${LUKS_DISK}1"
I still believe LVM is unnecessary here and hope to find a solution.
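One way to narrow this down during the install would be to confirm that the partition node actually appears after partprobe, before the pool is created on it. The sketch below polls for a path; wait_for_node is a hypothetical name, and it uses a plain existence test rather than a block-device test so it can be tried on any path. It only checks the install-time state and says nothing about whether the initrd creates the same node at boot.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script):
# poll until a device node exists, giving up after a number of tries.
# Usage: wait_for_node PATH [TRIES]   (TRIES defaults to 10, 1 s apart)
wait_for_node () {
  node="$1"
  tries="${2:-10}"
  while [ "$tries" -gt 0 ]; do
    if [ -e "$node" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Possible use in the original script, right after `partprobe "$LUKS_DISK"`:
#   wait_for_node "$ZFS" || { echo "missing $ZFS" 1>&2; exit 1; }
```

Running lsblk on the mapper device at the same point would show whether the kernel sees any child devices at all.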
Revised script with encrypted root filesystem (ZFS) and SWAP.
#!/usr/bin/env bash
# NixOS install with encrypted root and swap
#
# sda
# ├─sda1             BOOT
# └─sda2             LINUX (LUKS CONTAINER)
#   └─cryptroot      LUKS MAPPER
#     ├─lvmvg-swap   SWAP
#     └─lvmvg-root   ZFS
set -e
pprint () {
  local cyan="\e[96m"
  local default="\e[39m"
  # ISO8601 timestamp + ms
  local timestamp
  timestamp=$(date +%FT%T.%3NZ)
  echo -e "${cyan}${timestamp} $1${default}" 1>&2
}
# Set DISK
select ENTRY in $(ls /dev/disk/by-id/);
do
  DISK="/dev/disk/by-id/$ENTRY"
  echo "Installing system on $ENTRY."
  break
done
read -p "> Do you want to wipe all data on $ENTRY ?" -n 1 -r
echo # move to a new line
if [[ "$REPLY" =~ ^[Yy]$ ]]
then
  # Clear disk
  wipefs -af "$DISK"
  sgdisk -Zo "$DISK"
fi
pprint "Creating boot (EFI) partition"
sgdisk -n 0:1M:+513M -t 0:EF00 "$DISK"
BOOT="$DISK-part1"
pprint "Creating Linux partition"
sgdisk -n 0:0:+10Gib -t 0:BF01 "$DISK"
LINUX="$DISK-part2"
# Inform kernel
partprobe "$DISK"
sleep 1
pprint "Format BOOT partition $BOOT"
mkfs.vfat "$BOOT"
pprint "Creating LUKS container on $LINUX"
cryptsetup --type luks2 luksFormat "$LINUX"
LUKS_DEVICE_NAME=cryptroot
cryptsetup luksOpen "$LINUX" "$LUKS_DEVICE_NAME"
LUKS_DISK="/dev/mapper/$LUKS_DEVICE_NAME"
# Create LVM physical volume
pvcreate "$LUKS_DISK"
LVM_VOLUME_GROUP=lvmvg
vgcreate "$LVM_VOLUME_GROUP" "$LUKS_DISK"
lvcreate --name swap --size 1G "$LVM_VOLUME_GROUP"
SWAP="/dev/$LVM_VOLUME_GROUP/swap"
pprint "Enable SWAP on $SWAP"
mkswap "$SWAP"
swapon "$SWAP"
# ZFS partition
lvcreate --name root --extents 100%FREE "$LVM_VOLUME_GROUP"
ZFS="/dev/$LVM_VOLUME_GROUP/root"
pprint "Create ZFS pool on $ZFS"
# -f force
# -m mountpoint
zpool create -f -m none -R /mnt rpool "$ZFS"
pprint "Create ZFS datasets"
zfs create -o mountpoint=legacy rpool/root
zfs create -o mountpoint=legacy rpool/root/nix
zfs create -o mountpoint=legacy rpool/home
zfs snapshot rpool/root@blank
pprint "Mount ZFS datasets"
mount -t zfs rpool/root /mnt
mkdir /mnt/nix
mount -t zfs rpool/root/nix /mnt/nix
mkdir /mnt/home
mount -t zfs rpool/home /mnt/home
mkdir /mnt/boot
mount "$BOOT" /mnt/boot
pprint "Generate NixOS configuration"
nixos-generate-config --root /mnt
# Add LUKS and ZFS configuration
HOSTID=$(head -c8 /etc/machine-id)
LINUX_DISK_UUID=$(blkid --match-tag UUID --output value "$LINUX")
HARDWARE_CONFIG=$(mktemp)
cat <<CONFIG > "$HARDWARE_CONFIG"
networking.hostId = "$HOSTID";
boot.initrd.luks.devices."$LUKS_DEVICE_NAME".device = "/dev/disk/by-uuid/$LINUX_DISK_UUID";
boot.zfs.devNodes = "$ZFS";
CONFIG
pprint "Append configuration to hardware-configuration.nix"
sed -i "\$e cat $HARDWARE_CONFIG" /mnt/etc/nixos/hardware-configuration.nix
I would argue that using LVM is a much more standard and supported way to solve this problem than writing partitions into your crypt device. That being said, I have a couple of questions about your original config.
I can see where you are creating a partition for ZFS on your crypt device, but did you first write a GPT partition table to the crypt device?
Since you removed the setup of your swap partition, I can’t see what was done, but be sure that your partition config is valid.
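For the GPT question, a quick check is to look for the GPT header signature on the opened crypt device. The following is a rough sketch under two assumptions: the device uses 512-byte logical sectors (so the header at LBA 1 starts at byte 512), and has_gpt is just an illustrative name, not an existing tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a device or image file carries a GPT header.
# Assumes 512-byte logical sectors, so the header (LBA 1) starts at byte 512.
has_gpt () {
  # The first 8 bytes of the GPT header are the ASCII signature "EFI PART".
  # tr strips NUL bytes so an empty device compares as an empty string.
  [ "$(dd if="$1" bs=1 skip=512 count=8 2>/dev/null | tr -d '\0')" = "EFI PART" ]
}

# Possible use (run as root once the LUKS container is open):
#   has_gpt /dev/mapper/cryptroot && echo "GPT found" || echo "no GPT"
```

Printing the table with sgdisk -p on the same device would give a more complete picture, including whether the partition entries themselves look sane.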