NixOS Reproducible Builds: minimal installation ISO successfully independently rebuilt

We have successfully created an independent, bit-by-bit-identical rebuild of the nixos-minimal ISO published by Hydra 🎉

Why is this useful?

While there are a number of ‘side-benefits’, the main point of Reproducible Builds is that it gives us a reliable way to verify the binaries we ship are faithful to their sources, and have not been tampered with anywhere in the build pipeline (e.g. on Hydra).

For general information on Reproducible Builds see:

What exactly was reproduced?

This means we now have successfully reproduced:

  • All packages that make it into the ISO
  • The building of the ISO itself

The rebuild also built the packages that were needed to build the ISO (but aren’t included in it), rather than relying on cached binaries.

How did we reproduce?

We reproduced this Hydra build by starting a fresh VirtualBox appliance with NixOS 20.03 (adding plenty of CPU and memory, and resizing the disk to about 65G), and then:

nix-shell -p git
git clone https://github.com/nixos/nixpkgs
cd nixpkgs
git checkout 63678e9f3d3a
# because of https://github.com/NixOS/nix/issues/9251
sudo touch /dev/kvm
sudo chmod a+rwx /dev/kvm
# because of https://github.com/NixOS/nixpkgs/issues/263730
nix-shell -p nix -I nixpkgs=/home/demo/nixpkgs --option substitute false
# let's go!
nix-build nixos/release-combined.nix -A nixos.iso_minimal.x86_64-linux --option substitute false --max-jobs 6 --arg nixpkgs "{ revCount = 541036; shortRev = \"63678e9f3d3a\"; rev = \"63678e9f3d3afecfeafa0acead6239cdb447574c\"; }"
# all below because of https://github.com/NixOS/nixpkgs/issues/263898:
sudo mount -o remount,rw /nix/store
sudo touch -d "1980-01-01 00:00:00.000000000 +0000" /nix/store/bn8y1ibzcvbqbl7d43zszl180ghy4rsn-lingering-users
sudo chmod 644 /nix/store/bn8y1ibzcvbqbl7d43zszl180ghy4rsn-lingering-users
sudo touch -d "1980-01-01 00:00:00.000000000 +0000" /nix/store/31cmbwil5awd20rvcbb13nps9mrw6gmj-etc-netgroup
sudo chmod 644 /nix/store/31cmbwil5awd20rvcbb13nps9mrw6gmj-etc-netgroup
sudo touch -d "1980-01-01 00:00:00.000000000 +0000" /nix/store/6cn8wcl7c850rfs52sw2598c2qhh1njc-mdadm.conf
sudo chmod 644 /nix/store/6cn8wcl7c850rfs52sw2598c2qhh1njc-mdadm.conf
sudo touch -d "1980-01-01 00:00:00.000000000 +0000" /nix/store/*-etc-mdadm.conf
sudo chmod 644 /nix/store/*-etc-mdadm.conf
rm result
nix-store --delete /nix/store/*-squashfs.img
nix-store --delete /nix/store/*-nixos-minimal-*.iso
# final build
nix-build nixos/release-combined.nix -A nixos.iso_minimal.x86_64-linux --option substitute false --max-jobs 6 --arg nixpkgs "{ revCount = 541036; shortRev = \"63678e9f3d3a\"; rev = \"63678e9f3d3afecfeafa0acead6239cdb447574c\"; }"

The --option substitute false flag makes sure ‘everything’ is built on the machine itself, instead of being fetched from the binary cache.
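Once the final build finishes, the last step is to check that the local ISO is bit-for-bit identical to the published one. A minimal sketch of that comparison, assuming GNU coreutils; the helper names sha256_of and same_digest are illustrative, and the ISO path and PUBLISHED_SHA256 variable in the usage comment are placeholders, not values from this post:

```shell
# Print only the hex SHA-256 digest of a file.
sha256_of() {
  sha256sum "$1" | cut -d' ' -f1
}

# Succeed only if the file's digest equals the expected digest.
# $1: path to the locally built file
# $2: digest obtained from a trusted source (e.g. the one published with the ISO)
same_digest() {
  [ "$(sha256_of "$1")" = "$2" ]
}

# Hypothetical usage (placeholder path and variable):
#   same_digest result/iso/nixos-minimal.iso "$PUBLISHED_SHA256" && echo "reproduced"
```

Comparing a single digest is enough here because the claim is bit-by-bit identity of the whole image; any difference anywhere in the ISO changes the hash.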

Aren’t there bootstrap problems with the above approach?

Well, yes: if the 2020 OVA or the downloaded git repository contained elaborate backdoors, those might still be attack vectors. It would have been better to do the rebuild on a fully-bootstrapped system, but we’re not quite there yet - see this thread for exciting progress in this area. Still, this test gives high confidence in the reproducibility of the ISO.

Weren’t we here before?

You may remember the 2021 announcement that the minimal ISO was 100% reproducible. While back then we successfully tested that all packages that were needed to build the ISO were individually reproducible, actually rebuilding the ISO still introduced differences. This was due to some remaining problems in the Hydra cache and the way the ISO was created. By the time we fixed those, regressions had popped up (notably an upstream problem in Python 3.10), and it wasn’t until this week that we were back to having everything reproducible and able to validate the complete chain.
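When a rebuilt image differs, it helps to find out which files are responsible and whether the difference is in content or only in metadata (the timestamp and permission hacks above are of the second kind). The following is a rough sketch, not the tooling we used: it assumes GNU find, stat and coreutils, file names without newlines, and two already-unpacked trees; the function names manifest and compare_trees are made up for illustration.

```shell
# Print "path mode mtime sha256" for every regular file under a directory,
# sorted by path so that two manifests can be compared line by line.
manifest() {
  ( cd "$1" && find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      printf '%s %s %s\n' "$f" \
        "$(stat -c '%a %Y' "$f")" \
        "$(sha256sum "$f" | cut -d' ' -f1)"
    done )
}

# Print the manifest lines that differ between two trees.
# Returns 0 if the trees match in content, mode and mtime, non-zero otherwise.
compare_trees() {
  _a=$(mktemp)
  _b=$(mktemp)
  manifest "$1" > "$_a"
  manifest "$2" > "$_b"
  _status=0
  diff "$_a" "$_b" || _status=$?
  rm -f "$_a" "$_b"
  return "$_status"
}
```

Including mode and mtime in the manifest matters because filesystem images such as squashfs record that metadata, so two trees with identical file contents can still produce different images.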

Where to from now?

Successfully rebuilding the minimal ISO once is an extremely satisfying milestone - but a somewhat arbitrary one. There is a lot more to do to reap the benefits of reproducibility, such as:

  • Removing the hacks above (nixpkgs#263730, nix#9251, nixpkgs#263898)
  • Making more packages reproducible. Reproducing the other installation media, such as the GNOME ISO, seems like a nice next milestone
  • Setting up infrastructure so that we can regularly independently rebuild these artifacts
  • Creating tools to share and consume build attestations, such as trustix
  • … what would you like to see? Tell us in this thread!
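To make the attestation idea a bit more concrete: if several independent builders each publish the digest they obtained for the same build, a consumer can check whether they all agree before trusting a binary. This is a toy sketch only; the "builder digest" line format and the function names are invented for illustration and are not how trustix or any existing tool represents attestations.

```shell
# Read "builder digest" pairs, one per line, from stdin and print the
# number of distinct digests that were reported.
distinct_digests() {
  awk '{ print $2 }' | sort -u | wc -l | tr -d ' '
}

# Succeed only if at least one builder reported and all reported digests agree.
all_agree() {
  [ "$(distinct_digests)" -eq 1 ]
}

# Hypothetical usage:
#   printf 'hydra abc123\nrebuilder abc123\n' | all_agree && echo "builders agree"
```

A real system would also need to authenticate who made each claim; this sketch only shows the comparison step that reproducibility makes meaningful.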

Want to help?

Amazing achievement! Congratulations!!

That is so cool! Such tedious work, thank you for doing it! With the funding for the full-source bootstrap chain, I’m looking forward to seeing a full-source reproduced build of NixOS in the future. That would be an amazing achievement.

One question: did you measure how long this full rebuild took? If not exactly, what time-frame are we talking about here? Hours? Days? Weeks even?

I already asked on Matrix: around 4 hours.

Well done on achieving this. It’s clear a lot of effort went into it. It’s hard to describe, but Nix is one of the few technologies that still feels very rewarding whenever you learn something new or there’s news like this.

It would have been better to do the rebuild on a fully-bootstrapped system

I’ve got a small reproducible distro with Nix bootstrapped from TinyCC. It would be interesting to reproduce your achievement from there, but it currently cannot use disks and it sounds like I’d need more than 64 GB of RAM for that.

If Nix runs on Guix, maybe you can try building from Guix?

Epic work!

Do you see this kind of work on binary reproducibility enabling better behaviour (i.e. fewer cache misses and spurious rebuilds) for Nix’s upcoming content-addressed derivations feature?

I’m particularly excited to see whether this kind of work will make trusted distributed caches easier to set up, as the current UX around custom caches is a bit of a PITA. Cool to see you mention trustix on this note!

I have not worked with content-addressed Nix in depth, but from what I’ve read (notably this section of the RFC and the part of the thesis linked there) it definitely seems so.
