I did a fresh install of NixOS using ZFS + impermanence.
Now booting takes 2-3 minutes in total. (I have two pools, so what I said in the title is still correct.)
I tried changing ZFS module settings, updating the kernel, disabling hardening-related tweaks, disabling init_on_alloc and init_on_free, and enabling systemd in stage 1. I also checked that crng initialization isn't what makes it hang.
Again, it only happens in stage 1 and takes about 60-65 seconds per pool.
So, I just learned that stage 1 is basically a long shell script.
I noticed this:
echo -n "importing root ZFS pool \"blahaj\"..."
# Loop across the import until it succeeds, because the devices needed may not be discovered yet.
if ! poolImported "blahaj"; then
    for trial in `seq 1 60`; do
        poolReady "blahaj" > /dev/null && msg="$(poolImport "blahaj" 2>&1)" && break
        sleep 1
        echo -n .
    done
    echo
    if [[ -n "$msg" ]]; then
        echo "$msg";
    fi
    poolImported "blahaj" || poolImport "blahaj" # Try one last time, e.g. to import a degraded pool.
fi
zfs load-key -a
echo -n "importing root ZFS pool \"ikea\"..."
# Loop across the import until it succeeds, because the devices needed may not be discovered yet.
if ! poolImported "ikea"; then
    for trial in `seq 1 60`; do
        poolReady "ikea" > /dev/null && msg="$(poolImport "ikea" 2>&1)" && break
        sleep 1
        echo -n .
    done
    echo
    if [[ -n "$msg" ]]; then
        echo "$msg";
    fi
    poolImported "ikea" || poolImport "ikea" # Try one last time, e.g. to import a degraded pool.
fi
zfs load-key -a
Isn’t seq 1 60 causing this? I’m pretty sure there’s some issue here.
Also, wouldn’t having a single zfs load-key -a at the end make ZFS ask for the passphrase once and then reuse it for the second pool?
EDIT:
I’ve found a workaround by doing
boot.initrd.postDeviceCommands = "zpool import -a -d ${config.boot.zfs.devNodes}";
Again, I have no idea why it’s not working, but my system doesn’t see any pools until I do zpool import -a -d <path to dir with devices>. That’s why it was waiting 60 seconds and then finally doing
poolImported "ikea" || poolImport "ikea" # Try one last time, e.g. to import a degraded pool.
which does exactly what I have set in boot.initrd.postDeviceCommands.
No; seq itself should run basically instantly, and just create the values for the loop to iterate over. The loop has a sleep, but that’s waiting for the pool to show up.
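To make the timing concrete, here is a minimal sketch of the same retry pattern. The names retry_until and never_ready are made up for illustration (never_ready stands in for a poolReady that never succeeds); this is not the actual stage 1 code. The delay is set to 0 so the example finishes immediately. With 60 tries and a 1-second sleep, the same loop would take about 60 seconds before giving up, which would match the delay per pool described above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a check up to $1 times, sleeping $2 seconds after
# each failed try. The global `attempts` records how many tries failed.
retry_until() {
  local max_tries=$1 delay=$2
  shift 2
  attempts=0
  # seq only produces the list 1..max_tries; it returns almost instantly.
  for _ in $(seq 1 "$max_tries"); do
    "$@" && return 0              # check succeeded, stop waiting
    attempts=$((attempts + 1))
    sleep "$delay"                # this is where the time actually goes
  done
  return 1
}

# Stand-in for poolReady on a pool that never becomes visible.
never_ready() { return 1; }

retry_until 3 0 never_ready || echo "gave up after $attempts tries"
```

The time is spent in sleep, once per failed check, not in seq.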
It’s janky, I agree. The systemd stage 1 stuff changes this construction, but you mentioned you already tried that.
The question is why the pool isn’t showing up on your system.
You might try experimenting with boot.zfs.devNodes, perhaps giving a path where the devices show up sooner, rather than waiting for udev to create the extra symlinks.
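As a rough way to compare candidate directories, something like the sketch below could be run from a shell on the affected machine. dirs_containing is a made-up helper, the directories listed are just common udev paths, and the device name is one that appears in this thread; adjust all of them to your setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every candidate directory that currently holds an
# entry (symlink or device node) with the given name.
dirs_containing() {
  local name=$1 dir
  shift
  for dir in "$@"; do
    if [ -e "$dir/$name" ]; then
      printf '%s\n' "$dir"
    fi
  done
}

# Assumed example call: which of these directories has the disk's entry right now?
dirs_containing wwn-0x5000c500c7a2d75d /dev/disk/by-id /dev/disk/by-path /dev
```

A directory that gets printed is a candidate value for boot.zfs.devNodes. Whether it is populated early enough in stage 1 still has to be checked at boot.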
You might try experimenting with boot.zfs.devNodes
Well, I already tried it. I don’t think this is an issue with NixOS; more likely I don’t understand something about ZFS and messed something up when creating the pools:
  pool: blahaj
 state: ONLINE
  scan: scrub repaired 0B in 00:03:56 with 0 errors on Wed Dec 14 01:32:34 2022
config:

        NAME                            STATE     READ WRITE CKSUM
        blahaj                          ONLINE       0     0     0
          wwn-0x500a0751e5d95f83-part2  ONLINE       0     0     0
          wwn-0x500a0751e5d95f96        ONLINE       0     0     0

errors: No known data errors

  pool: ikea
 state: ONLINE
  scan: scrub canceled on Mon Dec 12 19:15:51 2022
config:

        NAME                      STATE     READ WRITE CKSUM
        ikea                      ONLINE       0     0     0
          wwn-0x5000c500c7a2d75d  ONLINE       0     0     0

errors: No known data errors
Alright! Thanks! So, the whole issue was that I had crypto_LUKS signatures at the beginning of the devices, and ZFS probably doesn’t bother trying to detect pools once it sees them.
The correct fix is to get rid of the crypto_LUKS signatures by running
wipefs -o 0x0 /dev/sda2 -f
and of course replacing sda2 with each device that shows the wrong TYPE in blkid.
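For finding those devices, a small filter over blkid output may help. This is only a sketch: luks_tagged_devices is a made-up helper name, and the sample lines are invented for illustration, so pipe in the real output of blkid on your own system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read blkid-style lines on stdin and print the device
# paths whose TYPE field is crypto_LUKS. The leading space in the pattern
# avoids matching other fields such as SEC_TYPE or PTTYPE.
luks_tagged_devices() {
  grep ' TYPE="crypto_LUKS"' | cut -d: -f1
}

# Invented sample input, shaped like blkid output.
sample='/dev/sda1: UUID="AAAA-BBBB" TYPE="vfat"
/dev/sda2: UUID="1111" TYPE="crypto_LUKS"
/dev/sdb: LABEL="ikea" TYPE="zfs_member"'

printf '%s\n' "$sample" | luks_tagged_devices
# prints /dev/sda2
```

On a real system this would be blkid | luks_tagged_devices. Before erasing anything, running wipefs on the device with no options only lists the signatures and their offsets, which is a cheap way to confirm that the crypto_LUKS signature really is the stale one.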