Migrating to boot.initrd.systemd, and debugging stage-1 systemd services

I just recently migrated to the systemd-based initrd and wrote up some notes on it. This isn’t really a complete migration guide, since people have different pre*Commands and post*Commands in their boot processes, but it gives some tips for writing oneshot services to replace them and for inspecting the stage-1 systemd services to decide which dependencies to include.

Hopefully it helps somebody other than me.


Hey nice article! As the primary developer of systemd stage 1, I do have some notes :slight_smile:


About debugging, it’s worth noting the kernel params rd.systemd.unit= and rd.systemd.debug_shell. These behave like their stage 2 counterparts (without the rd. prefix), and in fact most of the systemd kernel params work either directly or with the rd. prefix to indicate they should only apply in stage 1. You can use rd.systemd.unit=rescue.target to enter a rescue mode in stage 1 instead of booting immediately, though you’ll need to set boot.initrd.systemd.emergencyAccess to be able to get to a shell with this. Or, rd.systemd.debug_shell will start a shell on tty9 that you can switch to while stage 1 proceeds normally.
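If you’d rather bake those into a test configuration than type them at the bootloader prompt, a minimal sketch (remember to drop it again once you’re done debugging):

  # start a debug shell on tty9 during stage 1
  boot.kernelParams = [ "rd.systemd.debug_shell" ];

  # required to actually get a shell from rescue/emergency mode in stage 1
  boot.initrd.systemd.emergencyAccess = true;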

Network config changes slightly with a systemd based initrd. Network interfaces don’t automatically get set up based on your networking.interfaces.* config

Actually they should, if you use boot.initrd.network.enable (which implicitly enables boot.initrd.systemd.network.enable). It’s kinda like the difference between networking.useNetworkd and systemd.network.enable: the systemd option just turns on networkd, while the higher level option implements the networking.* options implicitly using networkd. So far the focus has been on matching the scripted stage 1’s networking implementation rather than stage 2’s, so IIRC it merely configures the networking.interfaces.* interfaces, but I think it also implements a few of the virtual interfaces like bridges.
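For example, a minimal sketch (the interface name and address are placeholders), with no boot.initrd.systemd.network.networks.* config at all:

  boot.initrd.network.enable = true;  # implicitly enables boot.initrd.systemd.network.enable

  networking.interfaces.eth0.ipv4.addresses = [
    { address = "192.168.1.2"; prefixLength = 24; }
  ];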

but if there is some binary you require in your initrd environment, boot.initrd.systemd.storePaths

Try boot.initrd.systemd.initrdBin or boot.initrd.systemd.extraBin to get binaries on PATH in the shell.
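For example (a minimal sketch; the package choices are arbitrary):

  # everything in this package's bin/ ends up on PATH in the stage-1 shell
  boot.initrd.systemd.initrdBin = [ pkgs.zfs ];

  # or expose a single binary under a chosen name
  boot.initrd.systemd.extraBin.lsblk = "${pkgs.util-linux}/bin/lsblk";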

and I used boot.initrd.network.postCommands to populate my initrd root user’s .profile with commands to decrypt my ZFS encryption dataset, and kill the local decryption command (zfs load-key rpool/crypt; killall zfs).

Rather than manually loading ZFS keys and killing ZFS processes, try systemd-tty-ask-password-agent. The ZFS service that unlocks datasets uses systemd-ask-password (and indeed so do other things like LUKS or bcachefs password prompts), so any ask-password agent is able to reply to this prompt. You can just run the tty agent in your shell to answer it. Or what I do actually is use systemctl default, which just spawns the tty ask-password agent in the shell and waits for initrd.target to be reached or failed.

The main downside of this is that ZFS native encryption doesn’t support key rotation without copying all of your data

zfs change-key?

I want to try to use LUKS on a ZVol to store file encryption keys for my encrypted datasets instead

I actually do exactly this, but for a different reason. For one, I use it to have multiple key slots to unlock an encrypted ZFS dataset, which ZFS doesn’t have an equivalent of. But mainly, I use it for all of the extremely nice LUKS features that systemd has, like binding to the TPM2, or a FIDO2 key. FIDO2 is especially nice as a dramatically less complicated way to use yubikeys than what you have to do with scripted stage 1.
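For the curious, the LUKS side of that could look roughly like this (a sketch; the device name and zvol path are made up, and the TPM2/FIDO2 credentials are assumed to already be enrolled with systemd-cryptenroll):

  boot.initrd.luks.devices.keystore = {
    device = "/dev/zvol/rpool/keystore";  # hypothetical LUKS-formatted zvol
    # systemd-cryptsetup crypttab options for TPM2 and FIDO2 unlocking
    crypttabExtraOpts = [ "tpm2-device=auto" "fido2-device=auto" ];
  };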


Thanks for writing this! Documentation about this migration is something I need to include in the NixOS manual sometime soon, and this provided me with some great notes on what to include.


Hey nice article! As the primary developer of systemd stage 1, I do have some notes :slight_smile:

Oh hey, thanks for your work on the stage 1 systemd! It’s been great to work with.

I really appreciate the feedback as well. I’ll definitely take another pass over the article with all of this feedback, and credit you, once I get a chance to sit down and play with all of it.

zfs change-key?

You know, I don’t know how I missed that in the ZFS docs.

I actually do exactly this, but for a different reason. For one, I use it to have multiple key slots to unlock an encrypted ZFS dataset, which ZFS doesn’t have an equivalent of. But mainly, I use it for all of the extremely nice LUKS features that systemd has, like binding to the TPM2, or a FIDO2 key. FIDO2 is especially nice as a dramatically less complicated way to use yubikeys than what you have to do with scripted stage 1.

Yeah, that is the other part of the motivation for me, too. I want to be able to easily enable and disable unattended decryption.


About debugging, it’s worth noting the kernel params rd.systemd.unit= and rd.systemd.debug_shell. These behave like their stage 2 counterparts (without the rd. prefix), and in fact most of the systemd kernel params work either directly or with the rd. prefix to indicate they should only apply in stage 1. You can use rd.systemd.unit=rescue.target to enter a rescue mode in stage 1 instead of booting immediately, though you’ll need to set boot.initrd.systemd.emergencyAccess to be able to get to a shell with this. Or, rd.systemd.debug_shell will start a shell on tty9 that you can switch to while stage 1 proceeds normally.

This is great. Definitely will add this in.

Actually they should, if you use boot.initrd.network.enable (which implicitly enables boot.initrd.systemd.network.enable). It’s kinda like the difference between networking.useNetworkd and systemd.network.enable: the systemd option just turns on networkd, while the higher level option implements the networking.* options implicitly using networkd. So far the focus has been on matching the scripted stage 1’s networking implementation rather than stage 2’s, so IIRC it merely configures the networking.interfaces.* interfaces, but I think it also implements a few of the virtual interfaces like bridges.

Huh, that is definitely not working for me. Dropping the boot.initrd.systemd.network.networks.* config leaves the machine inaccessible for me at the moment.

Try boot.initrd.systemd.initrdBin or boot.initrd.systemd.extraBin to get binaries on PATH in the shell.

Is there a reason to use those over storePaths? Just curious.

Rather than manually loading ZFS keys and killing ZFS processes, try systemd-tty-ask-password-agent. The ZFS service that unlocks datasets uses systemd-ask-password (and indeed so do other things like LUKS or bcachefs password prompts), so any ask-password agent is able to reply to this prompt. You can just run the tty agent in your shell to answer it. Or what I do actually is use systemctl default, which just spawns the tty ask-password agent in the shell and waits for initrd.target to be reached or failed.

Oh I like that quite a bit. Definitely switching to that in my configs.

You have to enable boot.initrd.network.enable (note the difference from boot.initrd.systemd.network.enable) to get the automatic configuration of interfaces from networking.interfaces.*.

storePaths just puts the file into the initrd. It doesn’t put it on PATH. If you only need the binary for a systemd service or something that’s going to reference it by absolute path anyway, then storePaths is good. But if you want to invoke the binary from a shell then initrdBin or extraBin will put it on PATH.

systemd-ask-password is one of my favorite things about systemd initrd :slight_smile:

You have to enable boot.initrd.network.enable (note the difference from boot.initrd.systemd.network.enable) to get the automatic configuration of interfaces from networking.interfaces.*.

I did read and configure that correctly, adjacent to the ssh config rather than the networks config. I’ll give it a shot next time I update my flake inputs, but no dice for me at the moment.

storePaths just puts the file into the initrd. It doesn’t put it on PATH. If you only need the binary for a systemd service or something that’s going to reference it by absolute path anyway, then storePaths is good. But if you want to invoke the binary from a shell then initrdBin or extraBin will put it on PATH.

Okay, that makes sense. I usually reference full paths in systemd services out of habit, so I didn’t really notice a difference.

Odd. That sounds like a bug.

Yeah, that’s the right thing to do, because PATH usually won’t be set right in a systemd service anyway. So it’s mainly useful for invoking things manually from the shell.

Odd. That sounds like a bug.

Oh, probably don’t worry about this. I didn’t really think about it until randomly today, but I use a bridge network on the host I was using for testing all of this so I can assign multiple IP addresses to the same host for virtualization. I’m probably a weird edge case.

FYI there is one fairly substantial mistake in the config in network.ssh.hostKeys: it should not be using a path there, but instead a string. Using a path results in the private key being copied to the world-readable Nix store.

You may run into a similar bug to mine if you use a string in practice, though: if you are using agenix and put an agenix path in there, it will break agenix on the rest of the system during boot (!). When you give it a string value for a path (which is the correct thing to do, since we don’t intend to copy anything into the store), initrdKeyPath is the identity function, which means initrdSecrets will put the secret that appears at that path on the running system at the same path in the initrd. This in turn breaks agenix, since it creates the agenix directory as a non-symlink. Whoops!

The solution seems to be to inject a HostKey line via extraConfig, set ignoreEmptyHostKeys, and set boot.initrd.secrets yourself to put the secret somewhere other than where it is stored on the running system.
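Concretely, that works out to something like this (a sketch; both paths are placeholders for wherever the key actually lives on your system):

  boot.initrd.secrets = {
    # copy the key into the initrd at a path that does not collide with the secret manager's directory
    "/etc/initrd-ssh/ssh_host_ed25519_key" = "/persist/initrd-ssh/ssh_host_ed25519_key";
  };

  boot.initrd.network.ssh = {
    ignoreEmptyHostKeys = true;
    extraConfig = "HostKey /etc/initrd-ssh/ssh_host_ed25519_key";
  };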


That’s a good catch! I’ll get the post updated and make a note of it. Thanks for reading it and for the feedback.

I really like your agenix solution. I use sops-nix and am definitely tempted to do something similar with these boot keys.

Thanks for the write-up! One simplification you can make is to drop the zfs-remote-unlock service and instead use the command= option in authorizedKeys.
i.e.,


      ssh = {
        enable = true;
        port = 22;
        authorizedKeys = [
          "command=\"systemctl default\" ssh-ed25519 AAAAC3NXXX decryption prompt"
        ];
        hostKeys = [ "/root/.initrd_ssh_host_rsa_key" ];
      };

systemd-tty-ask-password-agent is a real motivator to switch to a systemd initrd. We should probably update the zfs docs to recommend this approach.


Thanks for mentioning this!
I initially stumbled upon this error with sops-nix and thought it might be a bad interaction between sops and impermanence, so I’m glad to finally know what the error was.
I can finally unlock my system over SSH and have working secrets in the actual system.

Thank you for the awesome writeup! And thanks to @jade for mentioning the pitfall with ssh hostkeys.
It took some time for me to get the network working. The blocker for me was using boot.initrd.availableKernelModules instead of boot.initrd.kernelModules for my network adapter’s module. After switching to the latter, everything worked fine.

Here are all the settings I made to get it working, in case someone wants a similar setup:

# initrd related stuff
  sops.secrets.initrd_host_key = {
    mode = "0600";
  };

  boot.initrd = {
    secrets = { "/secrets/boot/ssh/ssh_host_ed25519_key" = "/run/secrets/initrd_host_key"; };
    kernelModules = [ "r8169" ];
    network = {
      enable = true;
      ssh = {
        enable = true;
        port = 2222;
        ignoreEmptyHostKeys = true;
        extraConfig = "HostKey /secrets/boot/ssh/ssh_host_ed25519_key";
        # hostKeys = ["/secrets/boot/ssh/ssh_host_ed25519_key"];
        authorizedKeyFiles = [ ssh-keys.outPath ];
      };
    };
    systemd = {
      enable = true;
      network = {
        enable = true;
        networks."10-lan" = {
          enable = true;
          matchConfig.Name = "enp3s0";
          address = [ "192.168.2.44/24" ];
          gateway = [ "192.168.2.1" ];
          linkConfig.RequiredForOnline = "routable";
        };
      };
    };
  };

Maybe it is worth mentioning that I don’t use networking.interfaces.* but systemd.network.* for the rest of my system.

That shouldn’t have made a difference… It should have loaded automatically even when it’s just in availableKernelModules… Weird.

I thought so as well, but when reading the documentation for these config options I noticed the “only loaded when needed” part for availableKernelModules.
It seems my config didn’t suggest that the network adapter was needed, but forcing it with kernelModules worked.

I tried it a few times back and forth and only putting it in kernelModules works for me.

This basically means “whenever the OS encounters hardware that uses this module”, so it usually does load automatically. I’d be curious to know why yours didn’t. Not sure how to find out, and it’s probably not worth the effort anyway :stuck_out_tongue:

My thoughts as well. It works the way it does at the moment; maybe future me will find some time and interest to look into it.