ZFS takes one minute to import in stage 1

Hello!

I did a fresh install of NixOS using ZFS + impermanence.
Now booting takes 2-3 minutes in total. (I have two pools, so what I said in the title is still correct :slight_smile:)

I tried changing ZFS module settings, updating the kernel, disabling hardening-related tweaks, disabling init_on_alloc and init_on_free, and enabling systemd in stage 1, and I checked that crng initialization wasn’t what it was getting stuck on.
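
Roughly, those attempts correspond to tweaks like the following (just a sketch of the kind of options involved, not my exact config):

{
  # Sketch only: the kinds of knobs I toggled between attempts.
  boot.initrd.systemd.enable = true;                           # systemd in stage 1
  boot.kernelParams = [ "init_on_alloc=0" "init_on_free=0" ];  # disable init_on_alloc/init_on_free
}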

Again, it only happens in stage 1 and takes like 60-65 seconds per pool.

If someone could help me, I’d be grateful.

Here is my dmesg: https://termbin.com/wcpqa
And here is config of host on which I have this problem: nixos-config/nixos/hosts/blahaj at master · surfaceflinger/nixos-config · GitHub

So, I just learned that stage 1 is basically a long shell script.

I noticed this:

echo -n "importing root ZFS pool \"blahaj\"..."
# Loop across the import until it succeeds, because the devices needed may not be discovered yet.
if ! poolImported "blahaj"; then
  for trial in `seq 1 60`; do
    poolReady "blahaj" > /dev/null && msg="$(poolImport "blahaj" 2>&1)" && break
    sleep 1
    echo -n .
  done
  echo
  if [[ -n "$msg" ]]; then
    echo "$msg";
  fi
  poolImported "blahaj" || poolImport "blahaj"  # Try one last time, e.g. to import a degraded pool.
fi
zfs load-key -a


echo -n "importing root ZFS pool \"ikea\"..."
# Loop across the import until it succeeds, because the devices needed may not be discovered yet.
if ! poolImported "ikea"; then
  for trial in `seq 1 60`; do
    poolReady "ikea" > /dev/null && msg="$(poolImport "ikea" 2>&1)" && break
    sleep 1
    echo -n .
  done
  echo
  if [[ -n "$msg" ]]; then
    echo "$msg";
  fi
  poolImported "ikea" || poolImport "ikea"  # Try one last time, e.g. to import a degraded pool.
fi
zfs load-key -a

Isn’t that seq 1 60 loop causing this? I’m pretty sure there’s some issue here.
Also, wouldn’t having a single zfs load-key -a at the end make ZFS ask for the passphrase once and then reuse it for the second pool? :slight_smile:

EDIT:
I’ve found a workaround by doing

boot.initrd.postDeviceCommands = "zpool import -a -d ${config.boot.zfs.devNodes}";

Again, I have no idea why the normal import isn’t working, but my system doesn’t see any pools until I run zpool import -a -d <path to dir with devices>. That’s why it was waiting 60 seconds and then finally doing

poolImported "ikea" || poolImport "ikea"  # Try one last time, e.g. to import a degraded pool.

which does exactly what I’ve now set in boot.initrd.postDeviceCommands.
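
Spelled out as a full snippet in the host configuration, the workaround looks roughly like this (a sketch; note that config.boot.zfs.devNodes defaults to /dev/disk/by-id):

{ config, ... }:
{
  # Workaround: import all pools from the devNodes directory ourselves,
  # instead of letting the per-pool import loop time out first.
  boot.initrd.postDeviceCommands = "zpool import -a -d ${config.boot.zfs.devNodes}";
}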

No; seq itself should run basically instantly, and just create the values for the loop to iterate over. The loop has a sleep, but that’s waiting for the pool to show up.

It’s janky, I agree. The systemd stage 1 stuff changes this construction, but you mentioned you already tried that.

The question is why that’s not happening yet on your system.


You might try experimenting with boot.zfs.devNodes, perhaps giving it a path where the devices show up sooner, rather than waiting for udev to create the extra symlinks.
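
For example (a sketch; /dev is just one illustrative choice, any directory where the disks appear early enough would do):

{
  # Illustrative value only; the default is /dev/disk/by-id.
  # Plain /dev avoids waiting for udev to create symlinks.
  boot.zfs.devNodes = "/dev";
}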

You might try experimenting with boot.zfs.devNodes

Well, I already tried that. I don’t think this is an issue with NixOS, but rather with me not understanding something about ZFS; I probably messed something up when creating the pools.

Can you share your pool config (zpool status) and partitions/labels (blkid)?

We can at least take a look and maybe spot something unusual.

The pool layout is in a markdown file on GitHub.

zpool status:

  pool: blahaj
 state: ONLINE
  scan: scrub repaired 0B in 00:03:56 with 0 errors on Wed Dec 14 01:32:34 2022
config:

	NAME                            STATE     READ WRITE CKSUM
	blahaj                          ONLINE       0     0     0
	  wwn-0x500a0751e5d95f83-part2  ONLINE       0     0     0
	  wwn-0x500a0751e5d95f96        ONLINE       0     0     0

errors: No known data errors

  pool: ikea
 state: ONLINE
  scan: scrub canceled on Mon Dec 12 19:15:51 2022
config:

	NAME                      STATE     READ WRITE CKSUM
	ikea                      ONLINE       0     0     0
	  wwn-0x5000c500c7a2d75d  ONLINE       0     0     0

errors: No known data errors

And now, blkid gets interesting :slight_smile:

/dev/sdb9: PARTUUID="b0206aa3-8b35-5b44-bf9f-817094b8e3b3"
/dev/sdb1: UUID="1cdf5025-4e71-47be-86be-23d2daf75613" TYPE="crypto_LUKS" PARTLABEL="zfs-66448020d84bbe8d" PARTUUID="4be34d3f-a214-9140-930f-c434f6f6949a"
/dev/sdc9: PARTUUID="6f7c4057-b62e-e947-9169-0371026946b3"
/dev/sdc1: UUID="a2cda069-46d4-49c8-ae91-b9188a261760" TYPE="crypto_LUKS" PARTLABEL="zfs-0e4744847fd998f7" PARTUUID="feff6bec-e2d7-5c4c-9fa4-015daec66381"
/dev/sda2: UUID="d9907dfb-a0a3-4717-836e-da15d8e95eb7" TYPE="crypto_LUKS" PARTUUID="3034ed74-385f-b94a-b53f-d491079f96ed"
/dev/sda1: LABEL_FATBOOT="EFI" LABEL="EFI" UUID="4B56-046D" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="78d53244-bf06-ed41-a980-68038e6229e3"

Looks like the TYPE="crypto_LUKS" signature has somehow persisted from an older installation, even after I ran sgdisk --zap-all.

nat@blahaj [~] ✨ sudo wipefs --no-act /dev/sda2
DEVICE OFFSET       TYPE        UUID                                 LABEL
sda2   0x0          crypto_LUKS d9907dfb-a0a3-4717-836e-da15d8e95eb7 
sda2   0x3f000      zfs_member  9057078032123706440                  blahaj

How could I try to get rid of it safely?


Alright, thanks! So the whole issue was that I had stale crypto_LUKS signatures at the beginning of the devices, and ZFS probably doesn’t bother trying to detect pools once it sees those :slight_smile:

The correct fix is to get rid of the crypto_LUKS signatures by running

wipefs -o 0x0 /dev/sda2 -f

and of course replacing sda2 with whichever devices show the wrong TYPE in blkid.
