How to unlock several LUKS devices with a keyfile stored on the first LUKS device

Hi all.

I’ve been looking for a while now, but seem unable to find a way to achieve what I’m looking to do.
I’m looking to unlock several LUKS devices, where one LUKS device holds the root filesystem and the keyfile for unlocking the remaining drives.

On Debian this was something I was able to achieve by creating a /etc/crypttab file, and it would manage the dependencies correctly.

My block devices look like this:

NAME                       FSTYPE      MOUNTPOINT
sda
└─sda1                     crypto_LUKS
  └─data_disk1_crypt       xfs         /data/data_disk1
sdb
├─sdb1                     vfat        /boot/efi
├─sdb2                     ext2
├─sdb3                     crypto_LUKS
│ └─root                   LVM2_member
│   ├─nexus--vg-root       ext4        /data/debian_root
│   ├─nexus--vg-swap_1     swap        [SWAP]
│   └─nexus--vg-nixos_root ext4        /
└─sdb4                     ext4        /boot
sdd
└─sdd1                     crypto_LUKS
  └─data_disk2_crypt       xfs         /data/data_disk2
sde
└─sde1                     crypto_LUKS
  └─data_disk3_crypt       xfs         /data/data_disk3
sdf
└─sdf1                     crypto_LUKS
  └─parity_disk1_crypt     xfs         /data/parity_disk

nixos-generate-config generated a hardware-configuration.nix which contains

  fileSystems."/data/data_disk1" = {
    device = "/dev/disk/by-uuid/eea269a4-81fb-49cd-883d-44d4070cba00";
    fsType = "xfs";
  };

  boot.initrd.luks.devices."data_disk1_crypt".device =
    "/dev/disk/by-uuid/2b7c47cb-a425-4493-8d7d-4227537a40d5";

I attempted to add the keyfile by extending this in my own configuration:

    boot.initrd.luks.devices."data_disk1_crypt" = {
      device = "/dev/disk/by-uuid/2b7c47cb-a425-4493-8d7d-4227537a40d5";
      preLVM = true;
      keyFile = "/root/keyfile";
    };

But this caused my machine to be unable to boot, since it was unable to locate the keyFile.

I found this post which seems to be asking the same thing, and the answer was to set preLVM to false. That doesn’t fix it for me: I made the change, and after providing my passphrase on reboot I saw this error (hand transcribed here, so forgive typos, I can’t copy-paste over IPMI):

Verifying passphrase for /dev/disk/by-uuid/2644f599-e320-4c60-bc1a-bc0d4cba7d46... - success
starting device mapper and LVM...
  3 logical volume(s) in volume group "nexus-vg" now active
Waiting 10 seconds for key file /root/keyfile to appear............. - failure
/root/keyfile is unavailable

What is the recommended way forward? Is there an easy way to achieve this via the available configuration options? My other idea is to define these mounts via systemd units, and see if I can figure out how to define systemd units that unlock the LUKS drives.

  1. Is that a reasonable approach?
  2. Will I need to patch future hardware-configuration.nix files since nixos-generate-config will keep generating entries for these filesystems when I rerun it?

Thank you for your time!

Try keyFile = "/mnt-root/root/keyfile";

The root filesystem is first mounted at /mnt-root, and later remounted at /.

Hmm, that didn’t seem to work.
I tried it with both preLVM set to false and true.

I see from journalctl logs (journalctl -xb from the emergency boot environment) that it’s failing to mount the filesystem. Logs are below.

May 02 22:18:37 nexus systemd[1]: Dependency failed for /data/parity_disk.
░░ Subject: A start job for unit data-parity_disk.mount has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit data-parity_disk.mount has finished with a failure.
░░
░░ The job identifier is 829 and the job result is dependency.
May 02 22:18:37 nexus systemd[1]: Dependency failed for Local File Systems.
░░ Subject: A start job for unit local-fs.target has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit local-fs.target has finished with a failure.
░░
░░ The job identifier is 822 and the job result is dependency.
May 02 22:18:37 nexus systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
May 02 22:18:37 nexus systemd[1]: local-fs.target: Triggering OnFailure= dependencies.
May 02 22:18:37 nexus systemd[1]: data-parity_disk.mount: Job data-parity_disk.mount/start failed with result 'dependency'.
May 02 22:18:37 nexus systemd[1]: dev-disk-by\x2duuid-4387ddd1\x2d3199\x2d4f05\x2dade8\x2dec4535ebb05f.device: Job dev-disk-by\x2duuid-4387ddd1\x2d3199\x2d4f05\x2dade8\x2dec4535ebb05f.device/start failed wit>
May 02 22:18:37 nexus systemd[1]: dev-disk-by\x2duuid-eea269a4\x2d81fb\x2d49cd\x2d883d\x2d44d4070cba00.device: Job dev-disk-by\x2duuid-eea269a4\x2d81fb\x2d49cd\x2d883d\x2d44d4070cba00.device/start timed out.
May 02 22:18:37 nexus systemd[1]: Timed out waiting for device /dev/disk/by-uuid/eea269a4-81fb-49cd-883d-44d4070cba00.
░░ Subject: A start job for unit dev-disk-by\x2duuid-eea269a4\x2d81fb\x2d49cd\x2d883d\x2d44d4070cba00.device has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit dev-disk-by\x2duuid-eea269a4\x2d81fb\x2d49cd\x2d883d\x2d44d4070cba00.device has finished with a failure.
░░
░░ The job identifier is 865 and the job result is timeout.

I see earlier in the log that the stage-1 bootloader exits without decrypting the other data volumes.

May 02 21:47:51 nexus stage-1-init: [Tue May  3 01:47:47 UTC 2022] [fsck.ext4 (1) -- /mnt-root/] fsck.ext4 -a /dev/disk/by-uuid/91a72ae1-ea3f-4a0d-a642-043a355f2e85
May 02 21:47:51 nexus stage-1-init: [Tue May  3 01:47:47 UTC 2022] nixos: recovering journal
May 02 21:47:51 nexus stage-1-init: [Tue May  3 01:47:47 UTC 2022] nixos: clean, 167304/7700480 files, 1683123/30772224 blocks
May 02 21:47:51 nexus stage-1-init: [Tue May  3 01:47:47 UTC 2022] mounting /dev/disk/by-uuid/91a72ae1-ea3f-4a0d-a642-043a355f2e85 on /...
May 02 21:47:51 nexus kernel: EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
May 02 21:47:51 nexus kernel: ixgbe 0000:03:00.0: removed PHC on eno3
May 02 21:47:51 nexus kernel: ixgbe 0000:03:00.1: removed PHC on eno4
May 02 21:47:51 nexus kernel: EXT4-fs (dm-3): re-mounted. Opts: (null)
May 02 21:47:51 nexus unknown: booting system configuration /nix/store/jr883pjv78cbqrjmh13ywq1ilpg7qrm9-nixos-system-nexus-21.11beta333507.8bcc413
May 02 21:47:51 nexus stage-2-init: running activation script...
May 02 21:47:51 nexus stage-2-init: setting up /etc...

I’m looking at luksRoot.nix and based on what I see there it should be generating crypttab entries for each device.

Any idea how I can view the generated crypttab?

When you run nixos-rebuild build it produces a symlink named result. Similarly, nixos-rebuild [switch|boot] produces symlinks in /nix/var/nix/profiles.

You can follow either symlink to the file initrd, which should contain the crypttab file. initrd is likely compressed with zstd, otherwise with gzip.
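A quick way to check how an initrd is packed is to look at its first few bytes. The sketch below is only an illustration: magic_kind is a hypothetical helper name, and it assumes head -c and od are available. It recognises the zstd frame magic (28 b5 2f fd), the gzip magic (1f 8b), and the ASCII header of a cpio "newc" archive (070701).

```shell
#!/bin/sh
# Classify a file by its leading magic bytes.
# magic_kind is a hypothetical helper name, not part of any NixOS tooling.
magic_kind() {
    # Read the first 6 bytes as lowercase hex with no separators.
    hex=$(head -c 6 -- "$1" | od -An -v -tx1 | tr -d ' \n')
    case "$hex" in
        28b52ffd*)     echo zstd ;;        # zstd frame magic
        1f8b*)         echo gzip ;;        # gzip magic
        303730373031*) echo cpio-newc ;;   # ASCII "070701"
        *)             echo unknown ;;
    esac
}
```

For example, magic_kind result/initrd after nixos-rebuild build. It only inspects the start of the file, so if an uncompressed cpio archive (such as CPU microcode) is prepended to the compressed one, it reports cpio-newc rather than the compression of what follows.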

Well, I should have been more careful. The link I provided earlier was to the master branch.
The corresponding luksroot.nix shows that in 21.11 there is no crypttab.

I had previously tried to look at the initrd. It wasn’t compressed by either zstd or gzip, unless cpio automatically decompresses archives.

I was able to decompress the archive using cpio -idmv < initrd

I was able to look at the script that seems to run in stage 2 (at least I think it’s stage 2) via nixos-option. I had originally tried to use nix repl <nixos/nixpkgs>, but the lack of a method to unescape the script value led me to nixos-option.

> # Empty lines in the output below have been trimmed
> nixos-option 'boot.initrd.preLVMCommands' | less
  # <snip> Many lines above
  # LUKS
  open_normally() {
      if wait_target "key file" /mnt-root/root/keyfile; then
      cryptsetup luksOpen /dev/disk/by-uuid/ebaf3ced-e0ec-4978-bf7c-839c30ba0051 data_disk2_crypt --key-file=/mnt-root/root/keyfile \
         \

  else
      die "/mnt-root/root/keyfile is unavailable"
      echo " - failing back to interactive password prompt"
      do_open_passphrase
  fi

  }
  # commands to run right before we mount our device
  open_normally
  # <snip> Many lines below

Looking over my journalctl logs, I can’t find any instances of “failing back to interactive password prompt”, which makes me think I didn’t deploy the correct value.

It’s also interesting that the failures in my previous log output are all from systemd, which I’ve now learnt doesn’t run in stage 1.

I’m not quite sure what to make of this just yet. I’ll try a few more things tonight and write on what I learn.

I’m also excited about systemd in initrd, which seems to have been merged in #168554. I guess in the future that should make what I’m trying to do much simpler, while also unlocking some more parallelism since the script unlocks drives sequentially.

For drives that are not needed to reach stage 2, I’m pretty sure you can still just use /etc/crypttab. The file systems that are needed to reach stage 2 by default are these ones. But /etc/crypttab can be configured for stage 2 with the NixOS environment.etc option. The systemd-cryptsetup-generator should be run automatically (despite not being included in /etc/systemd/system-generators).
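For reference, each crypttab line has four fields: mapping name, source device, key file, and options. As a rough sketch, the entries for several drives that share one key file could be generated like this. crypttab_line and crypttab_from_pairs are hypothetical helper names, and the key file path is whatever you keep on the root filesystem.

```shell
#!/bin/sh
# Print one crypttab entry: <name> UUID=<uuid> <keyfile> luks
# (hypothetical helper, for illustration only)
crypttab_line() {
    printf '%s UUID=%s %s luks\n' "$1" "$2" "$3"
}

# Read "name uuid" pairs on stdin and emit entries that share one key file.
crypttab_from_pairs() {
    keyfile=$1
    while read -r name uuid; do
        # Skip blank lines and comments in the input.
        case "$name" in ''|'#'*) continue ;; esac
        crypttab_line "$name" "$uuid" "$keyfile"
    done
}
```

Feeding it the pair data_disk1_crypt 2b7c47cb-a425-4493-8d7d-4227537a40d5 on stdin with /root/keyfile as the argument prints data_disk1_crypt UUID=2b7c47cb-a425-4493-8d7d-4227537a40d5 /root/keyfile luks, which can then be pasted into the text of an environment.etc entry.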

Stage 1 is another matter. I don’t think there’s a way to manage the dependency ordering of LUKS drives in stage 1 unless you use the experimental systemd-stage-1 work. With that, we use crypttab in stage 1, so we should get the auto dependency ordering, but you can also add manual orderings with the systemd options if necessary. If you’re feeling adventurous, more testers are very much appreciated on that front :)

For drives that are not needed to reach stage 2, I’m pretty sure you can still just use /etc/crypttab. The file systems that are needed to reach stage 2 by default are these ones. But /etc/crypttab can be configured for stage 2 with the NixOS environment.etc option. The systemd-cryptsetup-generator should be run automatically (despite not being included in /etc/systemd/system-generators).

Luckily I don’t need these drives in stage 1 (or at least I don’t think I do, I’m not sure I fully understand how stage 1 and stage 2 are semantically different).

I’ll give the environment.etc approach a go.

Stage 1 is another matter. I don’t think there’s a way to manage the dependency ordering of LUKS drives in stage 1 unless you use the experimental systemd-stage-1 work. With that, we use crypttab in stage 1, so we should get the auto dependency ordering, but you can also add manual orderings with the systemd options if necessary. If you’re feeling adventurous, more testers are very much appreciated on that front :)

I’m happy to do any testing, but I’ll need guidance on how to go about it. Do I need to change my channel to nixos-unstable for this, or can I opt into just the systemd-stage-1 changes somehow? I’m definitely still very new to nix/nixos and nix the language.

I looked over the script that was generated for boot.initrd.preLVMCommands, and I think I see why the disks aren’t loading.

[root@nexus:~]# nixos-option 'boot.initrd.preLVMCommands' | grep -n luksOpen
221:          echo -n "$passphrase" | cryptsetup luksOpen /dev/disk/by-uuid/2b7c47cb-a425-4493-8d7d-4227537a40d5 data_disk1_crypt --key-file=-
238:      cryptsetup luksOpen /dev/disk/by-uuid/2b7c47cb-a425-4493-8d7d-4227537a40d5 data_disk1_crypt --key-file=/mnt-root/root/keyfile \
296:          echo -n "$passphrase" | cryptsetup luksOpen /dev/disk/by-uuid/ebaf3ced-e0ec-4978-bf7c-839c30ba0051 data_disk2_crypt --key-file=-
313:      cryptsetup luksOpen /dev/disk/by-uuid/ebaf3ced-e0ec-4978-bf7c-839c30ba0051 data_disk2_crypt --key-file=/mnt-root/root/keyfile \
371:          echo -n "$passphrase" | cryptsetup luksOpen /dev/disk/by-uuid/353ce3c1-7c53-448a-909b-d239d210c99b data_disk3_crypt --key-file=-
388:      cryptsetup luksOpen /dev/disk/by-uuid/353ce3c1-7c53-448a-909b-d239d210c99b data_disk3_crypt --key-file=/mnt-root/root/keyfile \
446:          echo -n "$passphrase" | cryptsetup luksOpen /dev/disk/by-uuid/a7c2e0a5-c5ec-4dc9-94fd-4118890e6486 parity_disk1_crypt --key-file=-
463:      cryptsetup luksOpen /dev/disk/by-uuid/a7c2e0a5-c5ec-4dc9-94fd-4118890e6486 parity_disk1_crypt --key-file=/mnt-root/root/keyfile \
521:          echo -n "$passphrase" | cryptsetup luksOpen /dev/disk/by-uuid/2644f599-e320-4c60-bc1a-bc0d4cba7d46 root --allow-discards --key-file=-

This definitely shows that the root filesystem (which has the keyfile) is being opened last.
I’m still unsure why I’m not seeing the failures of opening the other disks in the journalctl logs.

Just to give an update, creating the crypttab file via environment.etc did end up working!

For now I went with the simplest possible approach, which I’ve copied below:

  environment.etc.crypttab = {
    enable = true;
    text = ''
      # sda3_crypt UUID=2644f599-e320-4c60-bc1a-bc0d4cba7d46 none luks
      data_disk1_crypt UUID=2b7c47cb-a425-4493-8d7d-4227537a40d5 /root/keyfile luks
      data_disk2_crypt UUID=ebaf3ced-e0ec-4978-bf7c-839c30ba0051 /root/keyfile luks
      data_disk3_crypt UUID=353ce3c1-7c53-448a-909b-d239d210c99b /root/keyfile luks
      parity_disk1_crypt UUID=a7c2e0a5-c5ec-4dc9-94fd-4118890e6486 /root/keyfile luks
    '';
  };

The contents of the file are almost identical to the file I had on the Debian installation I’m moving away from. The key difference is that the entry for the disk containing the root filesystem (sda3_crypt) had to be commented out.

I’m guessing that’s unlocked by the Stage1 Bootloader via the preLVMCommands and that seems to interfere in minor ways with having it in the crypttab.

I use ssh in my initrd via the boot.initrd.network.ssh option. Having that first line in my crypttab causes the boot sequence to ask for the password a second time, even though I’ve already unlocked the device over ssh via cryptsetup-askpass.

I’ll investigate whether or not that double prompting behaviour is exclusive to ssh unlocking and report back tonight.

Thank you for the help @ElvishJerricco and @emmanuelrosa

I’ll investigate whether or not that double prompting behaviour is exclusive to ssh unlocking and report back tonight.

I haven’t gotten around to testing this just yet, other things have come up, so I might not be able to report back until later this week.

I did want to jump in to point out that my earlier attempt at extracting the initrd was incorrect.

I was able to decompress the archive using cpio -idmv < initrd

This doesn’t work. It appears that the initrd is actually a concatenated series of files.

> ls -lha /boot/kernels/c0r232n0lsakf0zz2199n3z20prlzhgm-initrd-linux-5.10.113-initrd
-rw-r--r-- 1 root root 17M May  6 04:40 /boot/kernels/c0r232n0lsakf0zz2199n3z20prlzhgm-initrd-linux-5.10.113-initr

> cat /boot/kernels/c0r232n0lsakf0zz2199n3z20prlzhgm-initrd-linux-5.10.113-initrd | cpio -idmv
kernel/x86/microcode/GenuineIntel.bin
9121 blocks

> du -hs .
4.5M    .

That doesn’t make sense unless there is more data that cpio isn’t extracting. Did some research and it seems like it’s possible to concatenate multiple cpio archives, but I’m unsure how to extract them.

I attempted to use

(while cpio -id ; do :; done) < /boot/kernels/c0r232n0lsakf0zz2199n3z20prlzhgm-initrd-linux-5.10.113-initrd

But that failed in some ways. I fear that some of the archives are gzipped, while others are not. If anyone knows how to unpack this initrd I’d appreciate a pointer, otherwise I’ll keep investigating.

Alright, I was able to extract this using an iterative approach with dd.
The key idea is described in this unix.stackexchange.com post.

> dd if=/nix/store/c0r232n0lsakf0zz2199n3z20prlzhgm-initrd-linux-5.10.113/initrd skip=0 | cpio -it
kernel/x86/microcode/GenuineIntel.bin
9121 blocks

# I attempted to simply pass skip=9121 to dd at this point, but it failed.
# The next cpio archive appears to be compressed.
# I attempted to insert gunzip and lz4cat into the pipeline, to no effect,
# leading me to extract this into its own file so I could inspect it.

# Extract the next archive to a separate file
> dd if=/nix/store/c0r232n0lsakf0zz2199n3z20prlzhgm-initrd-linux-5.10.113/initrd skip=9121 of=init2
24375+1 records in
24375+1 records out
12480267 bytes (12 MB, 12 MiB) copied, 0.0658362 s, 190 MB/s

> file init2
init2: Zstandard compressed data (v0.8+), Dictionary ID: None

# Now extract using zstdcat. Could have extracted from the file, but chose to use dd again
> dd if=/nix/store/c0r232n0lsakf0zz2199n3z20prlzhgm-initrd-linux-5.10.113/initrd skip=9121 | zstdcat | cpio -idmv
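
The dd steps above can be wrapped into two small helpers. This is a sketch, not a general initrd unpacker: parse_blocks and skip_blocks are hypothetical names, and it assumes, as in the transcript above, that the first cpio archive ends exactly on a 512-byte block boundary and that the second archive is zstd-compressed.

```shell
#!/bin/sh
# Extract the block count from cpio's summary line, e.g. "9121 blocks".
# (hypothetical helper; reads cpio's stderr text on stdin)
parse_blocks() {
    awk '($2 == "blocks" || $2 == "block") && $1 ~ /^[0-9]+$/ { n = $1 }
         END { if (n != "") print n }'
}

# Emit everything in FILE ($1) after the first N ($2) 512-byte blocks.
skip_blocks() {
    dd if="$1" bs=512 skip="$2" 2>/dev/null
}

# Usage sketch (needs cpio and zstd; "initrd" is a placeholder path):
#   n=$(cpio -it < initrd 2>&1 >/dev/null | parse_blocks)
#   skip_blocks initrd "$n" | zstdcat | cpio -idmv
```

The redirection order in the usage sketch matters: 2>&1 >/dev/null sends cpio's "N blocks" summary (written to stderr) into the pipe while discarding the file listing on stdout.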