Boot Time Integrity Checks for the Nix Store

ElvishJerricco · December 13, 2023, 10:19am

This is a writeup about the research done for the supply chain security project funded by the STF.

(Note this is not an RFC, though I did use the NixOS/rfcs template).

Integrity checks for the Nix store

Summary

I researched the available options and implemented NixOS options that use Nix’s own signature verification system to verify the system closure before transitioning to stage 2.

Motivation

While there are ways to use Secure Boot with NixOS, they currently fall short when transitioning to stage 2. They need to include a mechanism for verifying the OS stored in the Nix store. Currently, it is common to simply rely on disk encryption to keep the Nix store safe from tampering, but this is not always desirable, e.g. for devices that should boot unattended.

A variety of technologies built into the kernel were considered for this. However, they all come with some glaring drawbacks, which are discussed in the Alternatives section. The design described here uses Nix’s own signature verification system to verify the system closure before transitioning to stage 2. This was chosen for its relative simplicity, without compromising security. It’s also extremely easy to build and deploy, given that it’s already a feature of Nix itself.

Finally, this approach allows continued use of an ordinary Nix store file system, meaning no new disk images need to be constructed, and the system can be used like an ordinary NixOS system. New generations can be added without large storage requirements for every single one, because it’s just an ordinary Nix store.

Detailed design

Implementation

When the system boots, its initrd (a.k.a. stage 1) will mount the OS’s file systems, and use the Nix CLI to automatically verify that every path in the system closure has a valid Nix signature. This establishes trust between stage 1 and the stage 2 OS it’s about to switch to, before ever allowing any of stage 2’s code to run.
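The check described above can be approximated by hand with the Nix CLI. The following is a minimal sketch, not the module's actual stage 1 code: the function name `verify_closure` and the `NIX` override variable are hypothetical, and it assumes the `nix store verify` subcommand with its `--recursive` and `--sigs-needed` flags, plus the `trusted-public-keys` setting passed on the command line. A real initrd would additionally have to point Nix at the mounted root file system, which is omitted here.

```shell
#!/bin/sh
# Sketch only: approximates the stage 1 check with the Nix CLI.
# NIX is a hypothetical override so the function can be exercised
# without a real Nix installation.
NIX="${NIX:-nix}"

verify_closure() {
  # $1: top-level system path (the closure root)
  # $2: number of distinct trusted signatures required per path
  # $3: space-separated list of trusted public keys
  "$NIX" --extra-experimental-features nix-command \
    store verify --recursive \
    --sigs-needed "$2" \
    --trusted-public-keys "$3" \
    "$1"
}
```

On a running system, something like `verify_closure /run/current-system 1 "$(cat public-key-file)"` should exit non-zero if any path in the closure lacks enough trusted signatures or fails its content hash.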

There are some new NixOS options.

These options can be used to configure which public keys stage 1 should trust, as well as how many of those keys need to have signed every individual path for that path to be trusted:

boot.initrd.verify.enable
boot.initrd.verify.sigsNeeded
boot.initrd.verify.trustedPublicKeys

These options control whether the system should sign its own closures when installing the boot loader. This is not done by default, with the assumption that the system closure is built and signed by a trusted builder:

boot.initrd.verify.signing.enable
boot.initrd.verify.signing.keyFile

If you enable this, ensure the key file is only accessible when absolutely necessary.

Examples and Interactions

First, generate a signing key.

nix-store --generate-binary-cache-key key-name secret-key-file public-key-file

In a NixOS module, configure stage 2 verification.

{
  boot.initrd.verify = {
    enable = true;
    sigsNeeded = 1;
    trustedPublicKeys = [(builtins.readFile ./public-key-file)];
  };
}

If this machine should self-deploy, then store the secret-key-file somewhere safe and encrypted, and configure the signing.enable and signing.keyFile options. Preferably, this file should only be accessible when it’s time to deploy an update.

Otherwise, store this key on the system that will build the system closure, and configure its /etc/nix/nix.conf settings with secret-key-files pointing to the secret. When deploying, make sure to copy the closure’s signatures as well.
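For the remote-builder case, the signing and deployment steps might look roughly like the sketch below. This is an assumption-laden illustration rather than a prescribed workflow: `sign_and_copy` and the `NIX` override are hypothetical names, explicit `nix store sign` is used here as a way to sign a whole closure after the fact (including paths that were not built locally), and the sketch assumes that `nix copy` transfers signatures together with the paths. Paths that already exist on the target may need their signatures copied separately.

```shell
#!/bin/sh
# Sketch only: sign a system closure on the builder, then copy it
# to the target machine. NIX is a hypothetical override for testing.
NIX="${NIX:-nix}"

sign_and_copy() {
  # $1: secret key file, $2: system closure path, $3: target store URL
  "$NIX" --extra-experimental-features nix-command \
    store sign --recursive --key-file "$1" "$2" &&
  "$NIX" --extra-experimental-features nix-command \
    copy --to "$3" "$2"
}
```

A call such as `sign_and_copy secret-key-file ./result ssh://root@target` would sign every path in the closure and then copy it; the host name here is a placeholder.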

Drawbacks

This implementation is very simple. While it does verify stage 2, it does so by delaying boot to read and verify the entire closure. On a fast computer, this takes 5-10 seconds. On a slower system (e.g. a Raspberry Pi using an SD card for the OS), this can take a few minutes.

Alternatives

The kernel has a few features that I spent time researching, but they have problems.

dm-verity is a block device layer that uses a Merkle tree to verify that every block matches an expected hash as it is read. With this, you only need to sign the root of the Merkle tree, and everything on the block device becomes implicitly verified on demand.

However, these block devices are necessarily read-only, because there is no way to atomically update the merkle tree and the block device together. This makes it difficult to use the system as one is used to from NixOS. Every NixOS generation would be an entire disk image, consuming large amounts of space, and being very slow to build and update.

fs-verity is similar, except it operates at the file level, requiring support from the file system driver. In theory this solves the problem, but it has a few problems of its own. Critically, fs-verity does nothing to verify the locations of files within the file system. That is, it does not protect from moving files around (e.g. swapping the systemd and bash binaries to gain a shell as PID 1).

In theory initrd could check a manifest of expected fs-verity root hashes on the whole closure, but this isn’t much better than just doing a Nix-style signature verification. And since fs-verity hashes are file system metadata that isn’t included in NAR serializations, it’s much harder to deploy.

IMA (Integrity Measurement Architecture) and EVM (Extended Verification Module) have essentially the same benefits and drawbacks as fs-verity does, for these purposes. They work at the file level, and they don’t verify the locations of files. Plus, IMA requires using policies that are not very flexible, which makes it much more useful as an auditing and measurement tool than a boot verification tool.
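To make the dm-verity idea above more concrete, here is a toy sketch of a Merkle root computed over fixed-size blocks of a file. It is an illustration only: the function names `h` and `merkle_root` are hypothetical, the tree is a plain binary tree over SHA-256 digests, and it does not reproduce dm-verity's actual on-disk hash format. It assumes `sha256sum` and `dd` are available.

```shell
#!/bin/sh
# Toy Merkle root over fixed-size blocks; NOT dm-verity's real format.

# Hash stdin and print only the hex digest.
h() { sha256sum | cut -d' ' -f1; }

merkle_root() {
  file=$1
  bs=${2:-4096}
  size=$(wc -c < "$file")
  n=$(( (size + bs - 1) / bs ))

  # Leaf level: one digest per block.
  level=""
  i=0
  while [ "$i" -lt "$n" ]; do
    level="$level $(dd if="$file" bs="$bs" skip="$i" count=1 2>/dev/null | h)"
    i=$((i + 1))
  done

  # Combine digests pairwise until a single root remains.
  set -- $level
  while [ "$#" -gt 1 ]; do
    next=""
    while [ "$#" -gt 0 ]; do
      if [ "$#" -ge 2 ]; then
        next="$next $(printf '%s%s' "$1" "$2" | h)"
        shift 2
      else
        # Odd digest out is carried up unchanged.
        next="$next $1"
        shift
      fi
    done
    set -- $next
  done
  printf '%s\n' "$1"
}
```

Changing any byte in any block changes the root, which is why signing the root alone is enough to cover the whole device. It also hints at why in-place writes are awkward: every write would need the hashes on the path up to the root to be updated together with the data.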

Prior art

Some other Linux distribution projects are moving toward image-based OSes, similar to what’s described in Fitting Everything Together by Lennart Poettering (and in other articles on that blog). I think some components of these ideas could be used; but the big problem is that the “unit” of software in NixOS is the store path, and a given system closure has hundreds or even thousands of paths directly under /nix/store. Reconciling this difference in philosophy is nontrivial.

Apple uses what they call the “Signed System Volume” (SSV). It’s similar to fs-verity, but it covers the entire metadata tree of the file system, and is reproducible between devices. It’s a good solution to the problems described with fs-verity.

Future work

While dm-verity certainly isn’t ideal, I think there are use cases where it would be desirable nonetheless. We should make it possible to generate dm-verity-based NixOS images.

It would be extremely useful to attempt implementing the Apple-style “SSV” concept with something like bcachefs. Although it’s technically similar to fs-verity, it would share dm-verity’s problem of making the file system read-only. However, since bcachefs is a Copy-on-Write file system, it should be possible to load a signing key into the kernel and allow using the system in read-write mode when desired. In a given CoW transaction, the root metadata hash can be signed by the kernel and written atomically.

mightyiam · December 13, 2023, 4:17pm

Thank you for your work.

Any benefits for a personal/professional desktop/laptop system?

ttamttam1 · December 13, 2023, 5:04pm

Oh exciting! I think this is good for the nix store but might need to layer with IMA and homed for /home /var /run and similar locations that store state.

I think a large drawback is that you can’t do unattended local deployments anymore. This is mitigated on the desktop because you could use something like systemd homed to encrypt your home directory and put your key there, essentially tying it to user login. After that you could run auto updates when the user logs in. I wonder if TPM could somehow be leveraged for this.

Is there any mechanism to mitigate tampering with the symlinks in profiles and generations? You could cause some issues swapping one version of a dependency for another. This is essentially the same issue as in fs verity and IMA/EVM.

RaitoBezarius · December 13, 2023, 5:52pm

Hm, why? You mean under IMA you cannot rebuild switch in-place, right?

Bootables contain links to generations; as long as they cannot be tampered with, this is covered by Secure Boot or Measured Boot.

At runtime, Nix generation / profiles symlinks are in the Nix store AFAIK, so if you protect the whole filesystem, you are also set.

I don’t know if signatures protect profiles / generations, I don’t believe so, that’s a good remark. Downgrade attacks should probably be defended against by active removal of leftovers, I’d say? Or you can use TPM2 counters to have a range of versions and prevent any downgrade < version K.

uep · December 13, 2023, 10:42pm

Some of this comes down to additional usage guidance. The activation script is in the verified store path, even if symlinks and some other state in /etc is not. So, with this ability in place and being relied on, one way to minimise this issue without additional solutions is to not have state in /etc… no mutableUsers or other stuff that can be tampered with outside the store.

ttamttam1 · December 14, 2023, 5:20pm

I meant under nix store integrity checks. If you need a key to sign your store paths on rebuild, that key needs to be protected. At least if I understand the following correctly:

This is fine for self deploy but on a server you may want to do an unattended self deploy, such as with autoUpdate. If there’s no user to decrypt the signing key, then you can’t sign your derivations. That’s why I asked if TPM could be leveraged to decrypt the signing key only as long as all the previous stages in the chain were fine.

I think that would help, but couldn’t you still swap the links for systemd and bash as mentioned for fs-verity in the alternatives section?

RaitoBezarius · December 14, 2023, 5:21pm

Right, though I would probably not let a server perform unattended redeploy of itself, but yeah, you could probably plug your favorite way to perform signatures.