I’ve been running out of space on the root logical volume, so I decided to move some space over from /home to /. This is the rough sequence of events:
lvreduce -L 100G /dev/mapper/vg-home
lvextend -l+100%FREE /dev/mapper/vg-root
Ran resize2fs /dev/mapper/vg-home, which insisted I run e2fsck first.
Rebooted into NixOS 21.11 installer.
Tried running e2fsck, but it never finished and did not respond to Ctrl-C or SIGUSR1.
Rebooted into “normal” boot to see if that file system check would be more responsive.
Waited 17 hours before posting this.
Is my machine hosed? Is this a known issue?
The system is NixOS 21.11, running on a single ~250 GB SSD. The LVs are formatted as ext4. I’ve not had any storage issues so far, so it’s a bit of a mystery why fsck is so slow. To be clear, the machine is still responsive - there’s a [ *** ]-style waiting indicator going back and forth, and pressing Enter does move the text down. The machine is at room temperature, so it doesn’t seem to be doing anything power-intensive.
At this point the boot log says “Failed to start File System Check on /dev/disk/by-uuid/[…].”, “Dependency failed for /home.” and “Dependency failed for Local File Systems.”
systemctl status systemd-fsck@[…] says “Inodes that were part of a corrupted orphan linked list found.”
To recover:
Run fsck /dev/mapper/[…]
Answer “y” to all queries
After this the system starts up.
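For reference, the recovery above can probably be done without sitting there pressing “y”. This is a minimal sketch, not what I actually typed: it assumes the affected volume is /dev/mapper/vg-home (the name from the lvreduce step; the real name is elided above), and the run and check_volume helpers are made up for illustration. The flags are standard e2fsck ones: -f forces a check, -y answers “yes” to every question, and -C 0 draws a progress bar on the terminal. By default it only prints the command.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the command.
# Set DRY_RUN=0 to really execute it, as root, with the volume unmounted.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: check one ext4 volume non-interactively.
#   -f    force a check even if the file system looks clean
#   -y    answer "yes" to every repair question
#   -C 0  show a progress bar on the terminal
check_volume() {
    device="$1"
    run e2fsck -f -y -C 0 "$device"
}

# Assumed device name; substitute the volume that failed its check.
check_volume /dev/mapper/vg-home
```

Answering “yes” to everything is the same gamble as doing it by hand, so a backup beforehand is still a good idea.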
Lessons learned:
fsck does not print any useful status information by default.
Do not combine lvreduce and lvextend. They might work separately; future experiment coming Soon™.
At this point, I think my success rate with fsck is about 30% over 10+ years.
The fact that this is recoverable at all is fantastic.
This took all of a few minutes to run. The --resizefs option is absolutely key here: running resize2fs manually after lvreduce does not work. It asks you to run fsck, and fsck never finishes.
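As an illustration of the --resizefs point, here is a minimal sketch of shrinking a volume so that LVM resizes the file system itself. The 100G size and the vg-home name are taken from the first attempt above and may not match the command that finally worked; the run and shrink_lv helpers are hypothetical. By default the script only prints the command rather than touching any volumes.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the command.
# Set DRY_RUN=0 to really execute it, as root.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: shrink a logical volume to new_size.
# --resizefs tells lvreduce to shrink the file system first,
# so no separate resize2fs step is needed afterwards.
shrink_lv() {
    new_size="$1"
    device="$2"
    run lvreduce --resizefs -L "$new_size" "$device"
}

# Assumed size and device, copied from the original attempt.
shrink_lv 100G /dev/mapper/vg-home
```

Shrinking is the risky direction, so it is worth reading the printed command carefully before setting DRY_RUN=0.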