Security advisory: OpenSSH CVE-2024-6387 “regreSSHion” – update your servers ASAP

emily · July 2, 2024, 7:39pm

And we’re done. Phew! The fixed packages are now in all of the unstable, 24.05, and 23.11 channels. If you haven’t already updated, you should do so now (in fact, you might as well do it again anyway, just in case you’re not on the channel or revision you thought you were). You can and should remove any workarounds and mitigations when you update. If you’re on 23.11, please take the opportunity to upgrade to 24.05 as soon as possible; 23.11 was already out of official support by the time all this happened, and the next security vulnerability probably won’t get a fix backported.

Thank you for all the kind words, but I didn’t make this happen alone – shout‐outs are also due to the others who helped get this done, including ari (no Discourse account I think), @qyliss, @vcunat, @hexa, @R-VdP, @ElvishJerricco, @leona, @tgerbet, @j-k, and of course the OpenSSH developers. (I’m probably forgetting people, too.)

Here’s a timeline for the curious:

T+0: OpenSSH 9.8p1 is announced with the first public disclosure of this bug.
T+25m: Upstream posts minimal patches for previous versions. The 9.8p1 release notes are posted in the security triage channel.
T+35m: #323753 is opened to bump unstable to 9.8p1.
T+50m: #323753 is merged.
T+56m: #323761 is opened to apply the minimal patches to 24.05.
T+1h2m: #323761 is merged.
T+1h10m: #323765 is opened to apply the minimal patches to 23.11.
T+1h19m: #323768 is opened to patch openssh_{hpn,gssapi} on unstable.
T+1h29m: #323765 is merged.
T+2h56m: #323768 is merged.
T+3h18m: The openssh bump reaches nixos-unstable-small and the first users start getting a fix through updates.
T+4h: The openssh patch reaches nixos-24.05-small. (It reaches nixos-23.11-small around the same time, but I don’t have the exact time.)
T+4h2m: This advisory is posted.
T+4h30m: Backports of #323768 to 24.05 and 23.11 are merged.
T+4h34m: #323796 is opened to fix SSH in initrd on unstable, which would have blocked the large channel bumps and potentially broken people’s setups. Stable versions are not affected due to the use of minimal patches rather than a full OpenSSH upgrade.
T+5h29m: #323796 is merged.
T+8h9m: The openssh bump reaches nixpkgs-unstable.
T+16h43m: The openssh and openssh_{hpn,gssapi} patches reach nixos-24.05. The first users on the large channels start to get fixes.
T+1d2h56m: The openssh bump, openssh_{hpn,gssapi} patches, and initrd SSH fix reach nixos-unstable.
T+1d6h23m: The openssh_{hpn,gssapi} patches reach nixos-23.11-small.
T+1d8h4m: The openssh and openssh_{hpn,gssapi} patches reach nixos-23.11.
T+1d11h25m: The openssh_{hpn,gssapi} patches and the initrd SSH fix reach nixpkgs-unstable. (Hopefully you’re not using nixpkgs-unstable (as opposed to nixos-unstable) on a NixOS machine anyway, though.)

(I’m not sure when the openssh_{hpn,gssapi} fixes reached nixos-{unstable,24.05}-small, or when the initrd SSH fix reached nixos-unstable-small, but this is probably already more exhaustive than anyone cares about.)

This is my first time doing this kind of thing; lessons learned:

Our ability to get security updates that don’t cause mass rebuilds into the small channels is pretty good! A four‐hour timespan from the issue being publicly announced to shipping binaries to users isn’t the fastest release a distro managed, but I believe we beat some distros that were in on the embargo. If you’re running a server and are worried about getting security fixes quickly, then using the small channel (and potentially making your system builds include the NixOS tests that aren’t part of the small channel but seem relevant to your server) is a good idea. Unfortunately this doesn’t do as much for fixes that have to go via staging.
Large channels are slow to update and it’s hard to fix this. It’s probably good to include information about how users on large channels can pull in security updates from the small ones. On the other hand, the large channel tests are useful; they uncovered 9.8p1 breaking SSH in initrd on unstable. Someone who was relying on that to be able to boot their remote server might have had a nasty surprise if they tried to get ahead of the channel bump.
It’s important for people to have an easy way to check whether their system is vulnerable. I don’t know whether this just means having appropriate Nix‐fu commands ready or whether there’s more that should be done as part of the fix PRs themselves to surface it better.
Monitoring the oss-security list and posting important stuff to the triage channel is probably a reasonably high‐impact practice. (Though it’s possible or even likely that people are already doing this; 25 minutes is a very quick turnaround from a mailing list message to it being raised on Matrix anyway.)
Hydra requires more active management than you’d think and the people who do it are thankless heroes.
Unix signal delivery semantics are a mistake that we continue to pay the price for decades later.
It’s good to eat breakfast when you can.
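
On the point above about having an easy way to check whether a system is vulnerable, here is a rough sketch of a first-pass check in Python. It is an illustration, not an official tool: the function names and result labels are made up for this example, and the version ranges are the upstream ones from the public disclosure (affected from 8.5p1 up to but not including 9.8p1, and before 4.4p1 unless patched for the older bugs). Because the stable channels got minimal patches rather than a version bump, a version string in the affected range can only tell you to go and check your channel revision, not that you are actually vulnerable.

```python
import re
from typing import Optional, Tuple


def parse_openssh_version(banner: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from a banner such as 'OpenSSH_9.7p1, OpenSSL 3.0.13'."""
    match = re.search(r"OpenSSH_(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def classify(banner: str) -> str:
    """Classify a banner against the upstream affected ranges.

    The labels are hypothetical, chosen for this sketch:
      'unknown'               - no OpenSSH version found in the banner
      'check-patches'         - upstream version is in an affected range;
                                a distro may still have backported the fix
      'not-affected-upstream' - upstream version is outside the affected ranges
    """
    version = parse_openssh_version(banner)
    if version is None:
        return "unknown"
    # Regression range: 8.5p1 <= version < 9.8p1.
    if (8, 5) <= version < (9, 8):
        return "check-patches"
    # Very old releases (before 4.4p1) are affected unless separately patched.
    if version < (4, 4):
        return "check-patches"
    return "not-affected-upstream"
```

To try it against a local install, pass it the output of `ssh -V`, which is printed on stderr rather than stdout. That reports the client version, which normally comes from the same package as sshd, but treat that as an assumption about your setup.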