`boot.initrd.luks.devices` disk fails to decrypt

Hi, I’m having some trouble with the upgrade to 22.11. I have a system with two nvme drives, both of which I encrypt with the same password:

NAME                         SIZE TYPE  MOUNTPOINTS        UUID
nvme1n1                      1.8T disk                     dd17e735-fac4-467f-b1ee-8bb214bc2b08
└─storage                    1.8T crypt                    e8dH6T-tvQT-mtMv-VwOb-cf8k-kDhj-XNxq5X
  ├─storage-steam--library  1000G lvm   /mnt/steam-library 2800a362-a988-42ed-a7b4-172b6326d39d
  ├─storage-music             50G lvm                      df349b4d-e387-470d-8335-32f58ddc5baa
  └─storage-flatpaks          20G lvm   /var/lib/flatpak   c08b4434-a464-4ff4-99fb-9724576bfe88
nvme0n1                    931.5G disk
├─nvme0n1p1                  512M part  /boot              F0D3-969E
└─nvme0n1p2                  931G part                     3c0d48f6-f051-4328-9919-677a7fcddae7
  └─root                     931G crypt                    Pt1seT-eTLc-0hzA-fscF-4Zue-GZk1-sNw88y
    ├─main-nix--store         60G lvm   /nix/store         38ba7f29-0828-452a-a85b-1e61ee416285
    │                                   /nix
    ├─main-root               20G lvm   /                  87fd2a8f-e3c2-4ea2-a170-c05cd68090c5
    ├─main-home              500G lvm   /home              b5ce8483-2630-4d55-8627-f61c80329bd6
    └─main-swap               32G lvm   [SWAP]             e622a760-a23b-403c-91bd-5280edeb69f5

I have them set up as follows:

    initrd.luks.devices = {
      root.device = "/dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7";
      storage = {
        allowDiscards = true;
        device = "/dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08";
      };
    };

Also using the hardware-configuration.nix that was autogenerated, which has been working fine until 22.11.

Normally, this results in the system booting, asking me for the password once, then reusing the password for the second drive, and booting.

Since the upgrade to 22.11, it:

  • Asks for the password
  • Successfully decrypts the first drive
  • Tries to reuse the password for the second drive
  • Fails to decrypt the second drive with “keyslot failed to connect”

I looked around a bit, but haven’t found anything useful. Both drives use the same cipher, so it’s not a missing kernel module, and I doubt it’s related to the switch to libxcrypt. I have no clue how to debug this, nothing ends up in the journal since the system hasn’t booted far enough yet.

If I mark the second drive as not required for boot, the system boots fine, and I can mount it by hand:

tlater ~ $ sudo cryptsetup luksOpen --allow-discards /dev/nvme1n1 storage
Enter passphrase for /dev/nvme1n1:
tlater ~ $ sudo mount /dev/disk/by-uuid/2800a362-a988-42ed-a7b4-172b6326d39d /mnt/steam-library
tlater ~ $ sudo mount /dev/mapper/storage-flatpaks /var/lib/flatpak

I guess this is some kind of regression in the init scripts?

A week and a bit later and I finally have holiday to try and debug this. I’ve been scratching my head over this for a further 3 hours now. I’ve since discovered that adding debug to the kernelParams will give better debug output, and using boot.initrd.systemd will fail exactly the same way, but is more robust to non-root drives not mounting, since it doesn’t ask for the password ad infinitum, and is therefore nicer for debugging.

So now I have some logs! The broken drive:

Dec 15 20:28:12 localhost systemd[1]: systemd-cryptsetup@storage.service: Will spawn child (service_enter_start): /nix/store/8ac6kybdmbla694ikm5bihbj9v116rv2-systemd-stage-1-251.7/lib/systemd/systemd-cryptsetup
Dec 15 20:28:12 localhost systemd[1]: systemd-cryptsetup@storage.service: Passing 0 fds to service
Dec 15 20:28:12 localhost systemd[1]: systemd-cryptsetup@storage.service: About to execute /nix/store/8ac6kybdmbla694ikm5bihbj9v116rv2-systemd-stage-1-251.7/lib/systemd/systemd-cryptsetup attach storage /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08 - cipher=serpent-xts-plain64,discard,no-read-workqueue,no-write-workqueue
Dec 15 20:28:12 localhost systemd[1]: systemd-cryptsetup@storage.service: Forked /nix/store/8ac6kybdmbla694ikm5bihbj9v116rv2-systemd-stage-1-251.7/lib/systemd/systemd-cryptsetup as 247
Dec 15 20:28:12 localhost systemd[1]: systemd-cryptsetup@storage.service: Changed dead -> start
Dec 15 20:28:12 localhost systemd[1]: Starting Cryptography Setup for storage...
Dec 15 20:28:12 localhost systemd[247]: systemd-cryptsetup@storage.service: Executing: /nix/store/8ac6kybdmbla694ikm5bihbj9v116rv2-systemd-stage-1-251.7/lib/systemd/systemd-cryptsetup attach storage /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08 - cipher=serpent-xts-plain64,discard,no-read-workqueue,no-write-workqueue
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: run storage ← /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08 type= cipher=serpent-xts-plain64
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Allocating context for crypt device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Trying to open and read device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08 with direct-io.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Initialising device-mapper backend library.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: dm version   [ opencount flush ]   [16384] (*1)
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: dm versions   [ opencount flush ]   [16384] (*1)
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Detected dm-ioctl version 4.47.0.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Device-mapper backend running with UDEV support enabled.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: dm status storage  [ opencount noflush ]   [16384] (*1)
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Trying to load any crypt type from device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Crypto backend (OpenSSL 3.0.7 1 Nov 2022 [default]) initialized in cryptsetup library version 2.5.0.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Detected kernel Linux 6.0.10 x86_64.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Loading LUKS2 header (repair disabled).
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Acquiring read lock for device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Locking directory /run/cryptsetup will be created with default compiled-in permissions.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Opening lock resource file /run/cryptsetup/L_259:0
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Verifying lock handle for /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08 READ lock taken.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Trying to read primary LUKS2 header at offset 0x0.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Opening locked device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Verifying locked device handle (bdev)
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: LUKS2 header version 2 of size 16384 bytes, checksum sha256.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Checksum:c20a6837e55bd871b7b73f776c0401fdeb2d9c831bd0c07bb47332e5ea529018 (on-disk)
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Checksum:c20a6837e55bd871b7b73f776c0401fdeb2d9c831bd0c07bb47332e5ea529018 (in-memory)
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Trying to read secondary LUKS2 header at offset 0x4000.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Reusing open ro fd on device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: LUKS2 header version 2 of size 16384 bytes, checksum sha256.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Checksum:f894490c5d825194d72c927fc87b394e0207599687db63b8ee6957a0406094a3 (on-disk)
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Checksum:f894490c5d825194d72c927fc87b394e0207599687db63b8ee6957a0406094a3 (in-memory)
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Device size 2000398934016, offset 16777216.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08 READ lock released.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: PBKDF argon2id, time_ms 2000 (iterations 0), max_memory_kb 1048576, parallel_threads 4.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Activating volume storage using token (any type) -1.
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: dm versions   [ opencount flush ]   [16384] (*1)
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: dm status storage  [ opencount noflush ]   [16384] (*1)
Dec 15 20:28:12 localhost systemd-cryptsetup[247]: Token activation unsuccessful for device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08: No such file or directory
Dec 15 20:28:25 localhost systemd-cryptsetup[247]: Added key to kernel keyring as 687670047.
Dec 15 20:28:25 localhost systemd-cryptsetup[247]: Set cipher serpent, mode xts-plain64, key size 512 bits for device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08.
Dec 15 20:28:25 localhost systemd-cryptsetup[247]: Activating volume storage [keyslot -1] using passphrase.
Dec 15 20:28:25 localhost systemd-cryptsetup[247]: dm versions   [ opencount flush ]   [16384] (*1)
Dec 15 20:28:25 localhost systemd-cryptsetup[247]: dm status storage  [ opencount noflush ]   [16384] (*1)
Dec 15 20:28:25 localhost systemd-cryptsetup[247]: Keyslot 0 priority 1 != 2 (required), skipped.
Dec 15 20:28:25 localhost systemd-cryptsetup[247]: Trying to open LUKS2 keyslot 0.
Dec 15 20:28:25 localhost systemd-cryptsetup[247]: Taking global memory-hard access serialization lock.
Dec 15 20:28:25 localhost systemd-cryptsetup[247]: Acquiring blocking write lock for resource memory-hard-access.
Dec 15 20:28:25 localhost systemd-cryptsetup[247]: Opening lock resource file /run/cryptsetup/LN_memory-hard-access
Dec 15 20:28:26 localhost systemd-cryptsetup[247]: Verifying lock handle for memory-hard-access.
Dec 15 20:28:26 localhost systemd-cryptsetup[247]: Lock handle verification failed.
Dec 15 20:28:26 localhost systemd-cryptsetup[247]: Opening lock resource file /run/cryptsetup/LN_memory-hard-access
Dec 15 20:28:26 localhost systemd-cryptsetup[247]: Verifying lock handle for memory-hard-access.
Dec 15 20:28:26 localhost systemd-cryptsetup[247]: WRITE lock for resource memory-hard-access taken.
Dec 15 20:28:26 localhost systemd-cryptsetup[247]: Running keyslot key derivation.
Dec 15 20:28:27 localhost systemd[1]: systemd-cryptsetup@storage.service: Reinstalled deserialized job systemd-cryptsetup@storage.service/start as 33
Dec 15 20:28:27 localhost systemd[1]: systemd-cryptsetup@storage.service: Changed dead -> start
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Unlocking WRITE lock for resource memory-hard-access.
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Reading keyslot area [0x8000].
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Acquiring read lock for device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08.
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Opening lock resource file /run/cryptsetup/L_259:0
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Verifying lock handle for /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08.
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08 READ lock taken.
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Reusing open ro fd on device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08 READ lock released.
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Keyslot 0 (luks2) open failed with -22.
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Keyslot open failed.
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Failed to activate with specified passphrase: Invalid argument
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Releasing crypt device /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08 context.
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Releasing device-mapper backend.
Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Closing read only fd for /dev/disk/by-uuid/dd17e735-fac4-467f-b1ee-8bb214bc2b08.
Dec 15 20:28:28 localhost systemd[1]: systemd-cryptsetup@storage.service: Child 247 belongs to systemd-cryptsetup@storage.service.
Dec 15 20:28:28 localhost systemd[1]: systemd-cryptsetup@storage.service: Main process exited, code=exited, status=1/FAILURE
Dec 15 20:28:28 localhost systemd[1]: systemd-cryptsetup@storage.service: Failed with result 'exit-code'.
Dec 15 20:28:28 localhost systemd[1]: systemd-cryptsetup@storage.service: Service will not restart (restart setting)
Dec 15 20:28:28 localhost systemd[1]: systemd-cryptsetup@storage.service: Changed start -> failed
Dec 15 20:28:28 localhost systemd[1]: systemd-cryptsetup@storage.service: Job 33 systemd-cryptsetup@storage.service/start finished, result=failed
Dec 15 20:28:28 localhost systemd[1]: Failed to start Cryptography Setup for storage.
Dec 15 20:28:28 localhost systemd[1]: systemd-cryptsetup@storage.service: Unit entered failed state.
Dec 15 20:28:28 localhost systemd[1]: systemd-cryptsetup@storage.service: Consumed 6.104s CPU time.
Dec 15 20:28:28 localhost systemd[1]: systemd-cryptsetup@storage.service: Control group is empty.

The working drive:

Dec 15 20:28:12 localhost systemd[1]: systemd-cryptsetup@root.service: Will spawn child (service_enter_start): /nix/store/8ac6kybdmbla694ikm5bihbj9v116rv2-systemd-stage-1-251.7/lib/systemd/systemd-cryptsetup
Dec 15 20:28:12 localhost systemd[1]: systemd-cryptsetup@root.service: Passing 0 fds to service
Dec 15 20:28:12 localhost systemd[1]: systemd-cryptsetup@root.service: About to execute /nix/store/8ac6kybdmbla694ikm5bihbj9v116rv2-systemd-stage-1-251.7/lib/systemd/systemd-cryptsetup attach root /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7 - discard,no-read-workqueue,no-write-workqueue
Dec 15 20:28:12 localhost systemd[1]: systemd-cryptsetup@root.service: Forked /nix/store/8ac6kybdmbla694ikm5bihbj9v116rv2-systemd-stage-1-251.7/lib/systemd/systemd-cryptsetup as 251
Dec 15 20:28:12 localhost systemd[1]: systemd-cryptsetup@root.service: Changed dead -> start
Dec 15 20:28:12 localhost systemd[1]: Starting Cryptography Setup for root...
Dec 15 20:28:12 localhost systemd[251]: systemd-cryptsetup@root.service: Executing: /nix/store/8ac6kybdmbla694ikm5bihbj9v116rv2-systemd-stage-1-251.7/lib/systemd/systemd-cryptsetup attach root /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7 - discard,no-read-workqueue,no-write-workqueue
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: run root ← /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7 type= cipher=
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Allocating context for crypt device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Trying to open and read device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7 with direct-io.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Initialising device-mapper backend library.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: dm version   [ opencount flush ]   [16384] (*1)
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: dm versions   [ opencount flush ]   [16384] (*1)
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Detected dm-ioctl version 4.47.0.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Device-mapper backend running with UDEV support enabled.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: dm status root  [ opencount noflush ]   [16384] (*1)
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Trying to load any crypt type from device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Crypto backend (OpenSSL 3.0.7 1 Nov 2022 [default]) initialized in cryptsetup library version 2.5.0.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Detected kernel Linux 6.0.10 x86_64.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Loading LUKS2 header (repair disabled).
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Acquiring read lock for device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Opening lock resource file /run/cryptsetup/L_259:3
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Verifying lock handle for /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7 READ lock taken.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Trying to read primary LUKS2 header at offset 0x0.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Opening locked device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Verifying locked device handle (bdev)
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: LUKS2 header version 2 of size 16384 bytes, checksum sha256.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Checksum:c3926ea265a6f61692f645cd07aff329faa16af969083f5dbd0b0f834da60286 (on-disk)
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Checksum:c3926ea265a6f61692f645cd07aff329faa16af969083f5dbd0b0f834da60286 (in-memory)
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Trying to read secondary LUKS2 header at offset 0x4000.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Reusing open ro fd on device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: LUKS2 header version 2 of size 16384 bytes, checksum sha256.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Checksum:21385b12052395b849cfb5fd84cf9ca7865baeecae3183365a6c7b30377ec3ed (on-disk)
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Checksum:21385b12052395b849cfb5fd84cf9ca7865baeecae3183365a6c7b30377ec3ed (in-memory)
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Device size 999665917952, offset 16777216.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7 READ lock released.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: PBKDF argon2id, time_ms 2000 (iterations 0), max_memory_kb 1048576, parallel_threads 4.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Activating volume root using token (any type) -1.
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: dm versions   [ opencount flush ]   [16384] (*1)
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: dm status root  [ opencount noflush ]   [16384] (*1)
Dec 15 20:28:12 localhost systemd-cryptsetup[251]: Token activation unsuccessful for device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7: No such file or directory
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: Set cipher serpent, mode xts-plain64, key size 512 bits for device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7.
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: Activating volume root [keyslot -1] using passphrase.
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: dm versions   [ opencount flush ]   [16384] (*1)
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: dm status root  [ opencount noflush ]   [16384] (*1)
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: Keyslot 0 priority 1 != 2 (required), skipped.
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: Trying to open LUKS2 keyslot 0.
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: Taking global memory-hard access serialization lock.
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: Acquiring blocking write lock for resource memory-hard-access.
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: Opening lock resource file /run/cryptsetup/LN_memory-hard-access
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: Verifying lock handle for memory-hard-access.
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: WRITE lock for resource memory-hard-access taken.
Dec 15 20:28:25 localhost systemd-cryptsetup[251]: Running keyslot key derivation.
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Unlocking WRITE lock for resource memory-hard-access.
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Reading keyslot area [0x8000].
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Acquiring read lock for device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7.
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Opening lock resource file /run/cryptsetup/L_259:3
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Verifying lock handle for /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7.
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7 READ lock taken.
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Reusing open ro fd on device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7 READ lock released.
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Verifying key from keyslot 0, digest 0.
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: dm target-version crypt  [ opencount flush ]   [16384] (*1)
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: dm versions   [ opencount flush ]   [16384] (*1)
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Detected dm-crypt version 1.24.0.
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Loading key (64 bytes, type logon) in thread keyring.
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: dm versions   [ opencount flush ]   [16384] (*1)
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: dm status root  [ opencount noflush ]   [16384] (*1)
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Calculated device size is 1952439728 sectors (RW), offset 32768.
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: DM-UUID is CRYPT-LUKS2-3c0d48f6f05143289919677a7fcddae7-root
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Udev cookie 0xd4d59de (semid 0) created
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Udev cookie 0xd4d59de (semid 0) incremented to 1
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Udev cookie 0xd4d59de (semid 0) incremented to 2
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Udev cookie 0xd4d59de (semid 0) assigned to CREATE task(0) with flags DISABLE_LIBRARY_FALLBACK         (0x20)
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: dm create root CRYPT-LUKS2-3c0d48f6f05143289919677a7fcddae7-root [ opencount flush ]   [16384] (*1)
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: dm reload   (254:0) [ opencount flush securedata ]   [16384] (*1)
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: dm resume root  [ opencount flush securedata ]   [16384] (*1)
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: root: Stacking NODE_ADD (254,0) 0:0 0600 [trust_udev]
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: root: Stacking NODE_READ_AHEAD 256 (flags=1)
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Udev cookie 0xd4d59de (semid 0) decremented to 1
Dec 15 20:28:26 localhost systemd-cryptsetup[251]: Udev cookie 0xd4d59de (semid 0) waiting for zero
Dec 15 20:28:27 localhost systemd-cryptsetup[251]: Udev cookie 0xd4d59de (semid 0) destroyed
Dec 15 20:28:27 localhost systemd-cryptsetup[251]: root: Skipping NODE_ADD (254,0) 0:0 0600 [trust_udev]
Dec 15 20:28:27 localhost systemd-cryptsetup[251]: root: Processing NODE_READ_AHEAD 256 (flags=1)
Dec 15 20:28:27 localhost systemd-cryptsetup[251]: root (254:0): read ahead is 256
Dec 15 20:28:27 localhost systemd-cryptsetup[251]: root: retaining kernel read ahead of 256 (requested 256)
Dec 15 20:28:27 localhost systemd-cryptsetup[251]: Releasing crypt device /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7 context.
Dec 15 20:28:27 localhost systemd-cryptsetup[251]: Releasing device-mapper backend.
Dec 15 20:28:27 localhost systemd-cryptsetup[251]: Closing read only fd for /dev/disk/by-uuid/3c0d48f6-f051-4328-9919-677a7fcddae7.
Dec 15 20:28:27 localhost systemd[1]: systemd-cryptsetup@root.service: Child 251 belongs to systemd-cryptsetup@root.service.
Dec 15 20:28:27 localhost systemd[1]: systemd-cryptsetup@root.service: Main process exited, code=exited, status=0/SUCCESS (success)
Dec 15 20:28:27 localhost systemd[1]: systemd-cryptsetup@root.service: Changed start -> exited
Dec 15 20:28:27 localhost systemd[1]: systemd-cryptsetup@root.service: Job 30 systemd-cryptsetup@root.service/start finished, result=done
Dec 15 20:28:27 localhost systemd[1]: Finished Cryptography Setup for root.
Dec 15 20:28:27 localhost systemd[1]: systemd-cryptsetup@root.service: Control group is empty.
Dec 15 20:28:27 localhost systemd[1]: systemd-cryptsetup@root.service: Changed dead -> exited
Dec 15 20:28:30 yui systemd[1]: systemd-cryptsetup@root.service: Installed new job systemd-cryptsetup@root.service/stop as 251

I have two take-aways from this:

  1. I believe the first drive that is attempted to be decrypted fails to decrypt, the second one succeeds. This is because the logs for storage mention opening temporary files, while the ones for root don’t.
  2. This is the actual error output, nothing else seems significantly different between the drives:

    Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Keyslot 0 (luks2) open failed with -22.
    Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Keyslot open failed.
    Dec 15 20:28:28 localhost systemd-cryptsetup[247]: Failed to activate with specified passphrase: Invalid argument

I’ve also since reproduced this on my second system, after I finally got around to trying to upgrade that. That one is royally borked since it only has one drive.

I’ve tried manually loading the modules required for serpent-xts-plain64 after stumbling across someone with a similar error message on a random Russian forum solving it that way, and since confirmed they are included in boot.initrd.luks.cryptoModules by default.

I’ve also manually force-set the crypttab cipher, with no success.

I remain completely stumped. Does anyone have any idea what changed? Both my desktop systems can suddenly no longer boot with 22.11.

I noticed that the broken one’s logs have cipher=serpent-xts-plain64 in the arguments for systemd-cryptsetup attach while the other one doesn’t have a cipher= argument at all. That’s the only difference I see in the logs until the failure occurs, other than the fact that the one that runs first adds the passphrase to the kernel keyring for the other one to reuse.

Yep, that was debugging on my end, I have a few runs where I didn’t explicitly set that.

@TLATER is there any noteworthy difference between cryptsetup luksDump on the two of them?

1 Like

Oh! Interesting, I thought they were exactly the same, but:

*** working     2022-12-15 21:51:10.174539992 +0000
--- broken      2022-12-15 21:51:16.728727574 +0000
***************
*** 1,9 ****
  LUKS header information
  Version:              2
! Epoch:                4
  Metadata area:        16384 [bytes]
  Keyslots area:        16744448 [bytes]
--- 1,9 ----
  LUKS header information
  Version:              2
! Epoch:                3
  Metadata area:        16384 [bytes]
  Keyslots area:        16744448 [bytes]
***************
*** 25,34 ****
!       AF hash:    sha256
--- 25,34 ----
!       AF hash:    whirlpool
**************

Confirmed on my other computer, too, it also uses whirlpool. Guess I accidentally set up sha256 for one of them, and that’s the problem. Now to figure out why whirlpool isn’t liked, guessing it is a missing module of some sort after all.

Sadly, adding wp512 to cryptoModules doesn’t seem to help. At least it’s something to base debugging off, thanks @ElvishJerricco

More debugging later, adding wp512 to either of boot.kernelModules or boot.initrd.luks.cryptoModules doesn’t change what modules are actually loaded in the kernel during stage1 (or at least in rescue mode).

Even after forcibly loading the wp512 module with modprobe in the rescue shell, and testing what algorithms cryptsetup benchmark successfully runs, I get:

PBKDF2-whirlpool  N/A

???

Is the cryptsetup that is put in the initrd somehow a different build from the one running on my desktop, and built without whirlpool support?

Many more debugging cycles later, I’ve tried blacklisting the wp512 module and unlocking the device in question using my normal desktop. Surprise surprise, it still decrypts and mounts fine. So, despite the whirlpool hash being used on the device, it’s not necessary to decrypt.

This is backed up by some cryptsetup mailing list posts as well, looks like that hash is only used for key finding, not for actual IO. I should one day learn more about how cryptsetup’s disk encryption actually works.

That leaves me back at square one, though; the only difference between the disks makes no difference in their decryption, and there are clearly no missing kernel modules in the initrd.

Unless someone here has bright ideas, it’s probably time to go ask the cryptsetup upstream for debugging help. Doubt they’ll be happy to assist with intird issues, but maybe they can at least point me at what might be causing this error message…

I’ve since extracted my initrd and verified that the cryptsetup from the initrd also fails to open the drive when my full system has booted. This has also made it a bit easier to get full output from it, and I managed to directly compare working vs non-working, detailed logs. I see this:

diff --git a/broken b/working
index af8d103..37c3c57 100644
--- a/broken
+++ b/working
@@ -1,5 +1,5 @@
-tlater ~/Scratch/temp $ sudo ./nix/store/xlqq2g3vzdwv48hb31kkh257jck1kzdz-extra-utils/bin/cryptsetup --debug -v luksOpen /dev/nvme1n1 storage
-# cryptsetup 2.5.0 processing "./nix/store/xlqq2g3vzdwv48hb31kkh257jck1kzdz-extra-utils/bin/cryptsetup --debug -v luksOpen /dev/nvme1n1 storage"
+tlater ~ $ sudo cryptsetup --debug -v luksOpen /dev/nvme1n1 storage
+# cryptsetup 2.5.0 processing "cryptsetup --debug -v luksOpen /dev/nvme1n1 storage"
 # Verifying parameters for command open.
 # Running command open.
 # Locking memory.
@@ -9,7 +9,7 @@ tlater ~/Scratch/temp $ sudo ./nix/store/xlqq2g3vzdwv48hb31kkh257jck1kzdz-extra-
 # Trying to open and read device /dev/nvme1n1 with direct-io.
 # Initialising device-mapper backend library.
 # Trying to load any crypt type from device /dev/nvme1n1.
-# Crypto backend (OpenSSL 3.0.7 1 Nov 2022 [default]) initialized in cryptsetup library version 2.5.0.
+# Crypto backend (OpenSSL 3.0.7 1 Nov 2022 [default][legacy]) initialized in cryptsetup library version 2.5.0.
 # Detected kernel Linux 6.1.1 x86_64.
 # Loading LUKS2 header (repair disabled).
 # Acquiring read lock for device /dev/nvme1n1.
@@ -53,10 +53,31 @@ Enter passphrase for /dev/nvme1n1:
 # Device /dev/nvme1n1 READ lock taken.
 # Reusing open ro fd on device /dev/nvme1n1
 # Device /dev/nvme1n1 READ lock released.
-# Keyslot 1 (luks2) open failed with -22.
-Keyslot open failed.
+# Verifying key from keyslot 1, digest 1.
+# Loading key (64 bytes, type logon) in thread keyring.
+# dm versions   [ opencount flush ]   [16384] (*1)
+# dm status storage  [ opencount noflush ]   [16384] (*1)
+# Calculated device size is 3906996400 sectors (RW), offset 32768.
+# DM-UUID is CRYPT-LUKS2-dd17e735fac4467fb1ee8bb214bc2b08-storage
+# Udev cookie 0xd4ddccb (semid 2) created
+# Udev cookie 0xd4ddccb (semid 2) incremented to 1
+# Udev cookie 0xd4ddccb (semid 2) incremented to 2
+# Udev cookie 0xd4ddccb (semid 2) assigned to CREATE task(0) with flags DISABLE_LIBRARY_FALLBACK         (0x20)
+# dm create storage CRYPT-LUKS2-dd17e735fac4467fb1ee8bb214bc2b08-storage [ opencount flush ]   [16384] (*1)
+# dm reload   (254:5) [ opencount flush securedata ]   [16384] (*1)
+# dm resume storage  [ opencount flush securedata ]   [16384] (*1)
+# storage: Stacking NODE_ADD (254,5) 0:0 0600 [trust_udev]
+# storage: Stacking NODE_READ_AHEAD 256 (flags=1)
+# Udev cookie 0xd4ddccb (semid 2) decremented to 1
+# Udev cookie 0xd4ddccb (semid 2) waiting for zero
+# Udev cookie 0xd4ddccb (semid 2) destroyed
+# storage: Skipping NODE_ADD (254,5) 0:0 0600 [trust_udev]
+# storage: Processing NODE_READ_AHEAD 256 (flags=1)
+# storage (254:5): read ahead is 256
+# storage: retaining kernel read ahead of 256 (requested 256)
+Key slot 1 unlocked.
 # Releasing crypt device /dev/nvme1n1 context.
 # Releasing device-mapper backend.
 # Closing read only fd for /dev/nvme1n1.
 # Unlocking memory.
-Command failed with code -1 (wrong or missing parameters).
+Command successful.

The only notable message is:

-# Crypto backend (OpenSSL 3.0.7 1 Nov 2022 [default]) initialized in cryptsetup library version 2.5.0.
+# Crypto backend (OpenSSL 3.0.7 1 Nov 2022 [default][legacy]) initialized in cryptsetup library version 2.5.0.

I suspect that [legacy] means that “legacy” algorithms are enabled once my system has actually booted. I thought at first that this was because of busybox ssl being used in the initrd (although I haven’t a clue why it would report itself as an openssl backend…), but setting initrd.yubikeySupport (which installs a full openssl) didn’t resolve that.

Nonetheless, this probably means that the whirlpool hash is now considered legacy by something involved in the initrd process, and this probably wasn’t noticed in the release notes because it’s a bit of a rare edge case.

I dug up and down the stack to find what something that is to report this, but finally have given up and cryptsetup reencrypt-ed my drive to use sha256. This is fine, since it’s clearly not actually security relevant; upstream considers sha1 fine for this, if not for silly regulatory requirements.

Still not very fun, given it’s 2TB and took over 6 hours, and probably burnt quite a few SSD write cycles, but at least my system boots properly now. I dread the day that I actually have to find out why this is broken for something commercial, not even strace could point me in the right direction.

On the up side, I learned a lot about the initrd, how NixOS generates it, and also about cryptsetup while debugging this.

If anyone ever figures out what exactly broke, I’d love to know.

2 Likes