Cannot boot after 22.05 upgrade (LVM volumes not found/mounted?)

Hello, I’ve been facing a problem after updating to the latest 22.05 version of NixOS.

Upon booting the system, it hangs when starting a job for /dev/System/lv_home (which is my home LVM partition). Then it drops into an emergency shell. If I run vgchange -ay and continue the boot process, it works just fine. I tried to add this command to boot.initrd.preDeviceCommands or boot.initrd.preLVMCommands without success.

This LVM configuration worked fine on 21.11.

Do you have any ideas about what could cause this?
I don’t have any LUKS encryption, and my hardware-configuration.nix file contains:

{ config, lib, pkgs, modulesPath, ... }:

{
  imports =
    [ (modulesPath + "/installer/scan/not-detected.nix")
    ];

  boot.initrd.availableKernelModules = [ "xhci_pci" "ahci" "nvme" "usbhid" "usb_storage" "sd_mod" ];
  boot.initrd.kernelModules = [ "dm-snapshot" "dm-cache" ];
  boot.initrd.preDeviceCommands = "vgchange -ay";
  boot.kernelModules = [ "kvm-intel" ];
  boot.extraModulePackages = [ ];

  fileSystems."/" =
    { device = "/dev/disk/by-uuid/52ee5c9f-fc46-45fa-b284-859a8b37f6d0";
      fsType = "ext4";
    };

  fileSystems."/home" =
    { device = "/dev/System/lv_home";
      fsType = "ext4";
    };

  fileSystems."/efi" =
    { device = "/dev/disk/by-uuid/A11C-3DC1";
      fsType = "vfat";
    };

  swapDevices = [ ];

  powerManagement.cpuFreqGovernor = lib.mkDefault "powersave";
  hardware.cpu.intel.updateMicrocode = lib.mkDefault config.hardware.enableRedistributableFirmware;
}

Best regards,

sda

Do you use encryption? Someone else described the same fix recently for the emergency shell, and they used encryption.

I’ve seen the thread, yes. Unfortunately, I don’t use any kind of encryption.

Have you nevertheless checked whether the non-encryption-related settings proposed in that thread make sense for you, or differ from your own settings?

The only setting that isn’t encryption-related (I believe?) is canTouchEfiVariables, which I don’t think is relevant here, but I’ll try setting it to false anyway.

Update: Didn’t change anything :frowning:

/home would be mounted in stage 2, not in initrd, so it makes sense that boot.initrd.preDeviceCommands = "vgchange -ay"; wouldn’t do anything. Without anything other than LVM to speak of though, I’m not sure what could be going wrong…
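To illustrate the stage 1 / stage 2 split: NixOS only mounts a filesystem from the initrd when it is marked as needed for boot, which by default covers mount points such as / and /nix/store. A minimal sketch, assuming you wanted to experiment with having /home handled in stage 1 so that the initrd LVM hooks apply to it. This is a diagnostic experiment, not a confirmed fix for this issue:

```nix
{
  # Merges with the existing fileSystems."/home" entry from
  # hardware-configuration.nix; only the extra attribute is set here.
  # neededForBoot asks NixOS to mount this filesystem in the initrd
  # (stage 1) instead of leaving it to stage 2.
  fileSystems."/home".neededForBoot = true;
}
```

With that set, commands such as boot.initrd.preLVMCommands would at least run before /home is mounted, which is not the case in the configuration posted above.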

Can you share what lsblk looks like once you’re all booted up?

Hello !
Sure thing:

sda                                        8:0    0 298.1G  0 disk
├─sda1                                     8:1    0   100M  0 part
├─sda2                                     8:2    0    16M  0 part
├─sda3                                     8:3    0 297.5G  0 part
└─sda4                                     8:4    0   499M  0 part
sdb                                        8:16   0   1.8T  0 disk
└─sdb1                                     8:17   0   1.8T  0 part
  └─System-lv_home_corig                 254:2    0   1.8T  0 lvm
    └─System-lv_home                     254:3    0   1.8T  0 lvm  /home
nvme0n1                                  259:0    0 238.5G  0 disk
├─nvme0n1p1                              259:1    0     1G  0 part /efi
├─nvme0n1p2                              259:2    0   100G  0 part /nix/store
│                                                                  /
└─nvme0n1p3                              259:3    0 137.5G  0 part
  ├─System-lv_home_cachepool_cpool_cdata 254:0    0 137.4G  0 lvm
  │ └─System-lv_home                     254:3    0   1.8T  0 lvm  /home
  └─System-lv_home_cachepool_cpool_cmeta 254:1    0    28M  0 lvm
    └─System-lv_home                     254:3    0   1.8T  0 lvm  /home
nvme1n1                                  259:4    0 232.9G  0 disk
├─nvme1n1p1                              259:5    0    16M  0 part
└─nvme1n1p2                              259:6    0 232.9G  0 part

sdb is the disk that actually holds my /home data, and nvme0n1 is where the LVM cache pool and the system are installed.

Yea I honestly have no idea why that wouldn’t work. Though I’ve never messed with LVM caches.

I’m going to try to bisect nixpkgs tomorrow to find the commit that introduced this problem. I’ll let you know if I find anything interesting…

Hi,

I stumbled upon the exact same issue today with LUKS on LVM on 22.05.

One fix that works for me is adding boot.initrd.preLVMCommands = "lvm vgchange -ay";

Oh, that would make sense. I don’t know why I didn’t think of that earlier. Thank you, I’ll try it.

I found this lvm call in stage-1-init.sh, but I don’t know whether it’s broken or simply isn’t meant to be triggered while waiting for LVM disks. Judging by the script’s output, we never reach the code that runs lvm vgchange -ay.

Would you mind opening an issue on nixpkgs? If LVM is really broken in 22.05 that’s not fun :confused:

I don’t think that’s the same issue. @sda’s problem is with /home, which isn’t mounted until well after initrd is done with. So the issue lies in the LVM support in stage 2. So boot.initrd.preLVMCommands would have no effect. Plus, look here. By adding boot.initrd.preLVMCommands = "lvm vgchange -ay";, you’re literally just causing it to run lvm vgchange -ay twice in a row. So it doesn’t really make sense to me that it solved your issue.

EDIT: Actually @Solene, looking at what happens to preLVMCommands when a LUKS device is involved, it looks like you are actually causing it to run vgchange before the LUKS stuff happens. So you must have LUKS on an LVM device, while NixOS assumes by default that it’s the other way around. You’re expected to inform NixOS of this switcharoo with boot.initrd.luks.devices.FOO.preLVM = false;
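For reference, a minimal sketch of what that option looks like for a LUKS container living on top of an LVM logical volume. The mapping name cryptroot and the device path are hypothetical placeholders (the FOO above), not taken from anyone’s actual setup:

```nix
{
  boot.initrd.luks.devices.cryptroot = {
    # Hypothetical path: the logical volume that holds the LUKS container.
    device = "/dev/System/lv_root";
    # Unlock this device after LVM has been activated, since the
    # encrypted volume only appears once the volume group is up.
    preLVM = false;
  };
}
```

With preLVM = false, the extra boot.initrd.preLVMCommands = "lvm vgchange -ay"; workaround should no longer be needed.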

I see that you’re using cache. Does that require a special kernel module to work?

If so, make sure initrd has access to that module.

In the emergency shell, take a look at dmesg. What’s the actual error?

@Atemu again, their issue is not in initrd because /home is mounted in stage 2.

Thanks! That did the trick for me. That option wasn’t really easy to understand in the man page. :confused:

In the future hopefully the preLVM sorts of things will go away when we switch to a systemd-based initrd. It’s available as an experimental feature today, but the options are hidden from the manual. The way systemd handles devices and mountpoints is much more dynamic, so it’s able to infer how to order these things rather than require the user to write it in their nixos config.
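If you want to try it, a minimal sketch of opting in. This assumes your setup is among those the experimental systemd stage 1 already supports, which is not guaranteed on 22.05, so keep a known-good generation to boot back into:

```nix
{
  # Experimental on 22.05 and hidden from the manual:
  # replaces the scripted stage 1 with a systemd-based initrd.
  boot.initrd.systemd.enable = true;
}
```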
