I am really surprised by the timeline in the Lix blog post, which suggests that a patch existed three months prior to disclosure, and patches for additional mitigations two months prior.
The Nix commit log shows these fixes being backported from Lix two days ago, after the pre‐disclosure announcement.
I am concerned that this means that users were unknowingly exposed to a known root escalation vulnerability for three months, despite a fix being written within days, because Nix only backported the fixes months later at the very end of the embargo period. If I’m misinterpreting the situation, it would be good to get clarification.
Fair enough; added “known” to clarify. I think time of report to time of disclosure/patches is still a significant measure (hence the existence of disclosure deadlines to begin with) and from an external perspective it seems like a large portion of that here was spent waiting.
I saw @jade post here: FOD sandbox bypass - HackMD
I think it did a good job of explaining it, but I was wondering if someone had an even more ELI5 version lol.
The CVE makes me think it’s even more trivial – just passing the FD to $out and writing to it – but the example uses inotify and two FODs, etc…
(or is it on account of the namespacing that it has to be between two FODs to leverage the bind-mount?)
Question: if my understanding is correct, this attack is NOT related to any kind of kernel bug. But if my memory is correct, the Linux kernel has had bugs allowing escapes from sandboxes like Docker’s. Are these bugs still around, and if so, does that mean it is unsafe to build untrusted derivations anyway?
Remark about a similarly dangerous unfixed attack: more than one year ago, after emailing Nix’s security team, I submitted this CVE and discussed it in re-fetch source when url changes · Issue #969 · NixOS/nix · GitHub and Provide a binary cache for builds · Issue #68 · NixOS/ofborg · GitHub. It describes an attack that is almost impossible to detect (for a PR reviewer) and that allows a malicious PR to inject arbitrary malicious code into basically any software by simply changing the hash. The attack is very easy to run and can have devastating effects (backdoors, viruses…). Yet, as far as I know, one year later it is still effective, even though I proposed some potential fixes (the best ones are not completely trivial, as they require Nix and the caches to also maintain a list of verified hash/URL pairs, but as far as I can see this is the only really robust solution to this issue). Is there any update on this?
This means that you can switch to the -small channel and obtain the bugfix right now from the cache, or you can wait for full integration testing to pass and obtain the upgrade.
(again, if you are that much in a hurry, you probably should eat the rebuild.)
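For flake users, a minimal sketch of what “switch to -small” could look like (assuming you’re on 25.05; the branch name is an assumption you should adjust to your release, and you can switch back once the full channel catches up):

```nix
{
  # Track the -small branch, which advances after a reduced test set
  # instead of the full channel-blocking set, so it picks up the
  # patched Nix sooner.
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-25.05-small";
}
```

After that, something like `nix flake update nixpkgs` followed by `nixos-rebuild switch` should pull the fix from the binary cache.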
Hello,
As a newbie NixOS user, I’m not sure what I am supposed to do.
I am using flakes with `nixpkgs.url = "github:nixos/nixpkgs/nixos-25.05";`
`nix --version` returns `nix (Nix) 2.28.3`, which is affected from what I understand.
Running `nix flake update` followed by a `nixos-rebuild switch` did not change that.
It seems to be related to the fact that the build(?) is not done yet, as shown by Ten in this message.
If yes, how long should I expect it to take?
And what should I do then?
Overall, I think it would have been nice to have very straightforward steps to follow explained in the original post.
(And it’s not too late to add them! Especially once there is no more waiting needed.)
Yeah, the nixpkgs builds aren’t finished yet. Honestly, the impression I get is that unless you’re running a public Nix CI system, there’s no great panic. As a “normal” NixOS user, anything you build is presumably ending up in your system configuration anyway, potentially getting executed as root, or at the very least as your normal user. If any of that’s coming from untrusted sources, you presumably have bigger problems than a sandbox escape. [Edit: looks like I’ve missed something or misunderstood – see emily’s analysis downthread.]
I’ve swapped my VMs/servers to Lix out of an abundance of caution, everything else I will just wait for the nixpkgs hydra builds to complete.
nixos-25.05 updated a couple hours ago (7284e2decc98)
nixos-unstable updated right now (30a61f056ac4)
nixos-24.11 in about half a day, I assume (f25c1bd2a6b3 most likely)
Unfortunately, the thousands of NixOS tests are pretty expensive for Hydra, and changing Nix triggers rebuilds of a very large fraction of them (not surprising, as every NixOS system surely has Nix or Lix), and this got repeated on the three supported branches and mixed with some other smaller rebuilds on the same branches.
Moving fast for security issues was the main motivation for creating the -small channels/branches, I believe.
I checked with @tomberek yesterday, and the Nix team has implemented mitigations for all known vectors. That’s based on the original report. Unfortunately I’m not clear exactly what this number refers to, because it hasn’t been published yet. (CVE - CVE-2025-46416, as of writing)
The commits are recent due to rebasing. Only the check that `build-dir` isn’t world-writable was added in the past weeks. Most of the work was done early on.
The vast majority of tests do not need a Nix implementation in them; just a store directory and perhaps the profile symlinks, if specialisations are used.
If, as a test author, you need anything beyond the trivial `nix-env -p`, you would know, because building anything inside a test is currently a pain (finicky, and slower to start up).
“Thousands” suggests that Nix could still be disabled in most of them. Only tests for `nixos-install`, `nixos-rebuild` and such need Nix to be available.
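To make that concrete, here’s a rough, untested sketch of a test whose guest carries no Nix at all (`pkgs.testers.runNixOSTest` and the `nix.enable` option are what I’d reach for, assuming they fit the test in question):

```nix
{ pkgs, ... }:

# Hypothetical example: the VM has no Nix daemon or CLI, yet the test
# driver still provides the store contents and profile symlinks it needs.
pkgs.testers.runNixOSTest {
  name = "service-without-nix";
  nodes.machine = { ... }: {
    nix.enable = false;            # drop the Nix implementation from the guest
    services.nginx.enable = true;  # whatever the test actually exercises
  };
  testScript = ''
    machine.wait_for_unit("nginx.service")
  '';
}
```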
FWIW, this was what I was saying in the original thread but I somewhat regret the take:
I didn’t have a root escalation on my bingo card; I expected a mundane escape to a sandbox build user, which is not as exciting.
I forgot that we don’t set `allowed-users` by default, so it’s not just derivations you cause a build of yourself that are at risk; it’s anything that can access the Nix daemon, potentially including random service accounts if they don’t have particularly hardened systemd service setups.
I stand by “it’s pointless to try and avoid building your configuration”, but I do actually think this is a really serious vulnerability, one of the worst it’s possible for Nix to have; you should prioritize getting a fix deployed on your systems. (I guess even worse would be arbitrary code execution as root in response to a binary cache response prior to TLS certificate validation…)
We should probably harden the default `allowed-users` in NixOS, but it’s hard to do backwards‐compatibly – it might break automated build setups.
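As a sketch of what opting into that hardening could look like on an individual system today (the group list here is just an illustration, and it will lock out any automated build users you haven’t listed):

```nix
{
  # Only these users/groups may connect to the Nix daemon at all;
  # the current default is effectively "*", i.e. anyone on the machine.
  nix.settings.allowed-users = [ "root" "@wheel" ];

  # Separate, stricter setting: who may pass unsigned paths, extra
  # substituters, etc. This already defaults to just root.
  nix.settings.trusted-users = [ "root" ];
}
```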
Thanks, that makes sense. But I’m still confused about the timeline here for a vulnerability of this severity; what was happening during the months in between? The `build-dir` check wouldn’t have been necessary to mitigate this on common systems using the default configuration, right? If the Lix patches the Nix fix was based on were written 2–3 months ago and most of the Nix‐side porting work was already done around then, wouldn’t it have been possible to coordinate to get an advisory and fixes out sooner, rather than running right up to the 90-day deadline?
I realize that the Nix team have other commitments and that it’s not always simple to rush out a security patch immediately. But any user being able to escalate to root on a stock NixOS system is a big deal, and if there were adequate patches to fix that within days to weeks after the report, I’m struggling to understand what went wrong here for the process to take 90 days.
This isn’t the first time recently that handling of a Nix vulnerability has reportedly stalled out for months, and the NixOS security team reported in Change security policy to report directly to the Nix team · Issue #11468 · NixOS/nix · GitHub that they have spent months asking for triage updates on Nix security issues. I know there were changes made after that, so I’m hoping we can get a better understanding of what happened this time.
I’m worried that there are process failures here that are systematically undermining the security of NixOS and users of Nix on other Linux distributions and macOS. The Nix daemon is a pretty substantial attack surface that is present on almost all NixOS systems. I know that nobody on the Nix team is doing it as their full‐time job and that it’s hard to say whether things could have gone differently in the absence of a counterfactual. But in this case we appear to have one: it seems like the Lix team was ready to deploy final fixes over a month and a half ago and had a basic fix within days. I’m hoping we can get more light shed on this by the Nix and NixOS security teams so that we’re better prepared for the next time there’s a serious vulnerability in Nix.
If everyone was ready earlier, there would be no reason to wait the full 90 days. So to be very direct, if the Lix timeline is true, who were the ones that needed the full 90 days? CppNix, Guix, or some other third party?
Right. Coordinated disclosure just means that every party has to agree to disclosure before the hard deadline. If every party is ready earlier then there’s certainly no need to wait until then, and reporters of vulnerabilities would generally prefer earlier disclosure under the constraints of ensuring a fix can be rolled out as quickly as possible – the deadline is just to ensure that users get notified eventually even if a patch doesn’t materialize before then. I still feel that I do not fully understand what happened here.
We got a very good write-up in the original report. (Thanks Rory!) Within a week the Nix team had gone through the report, analyzed it, and assessed the suggested fixes. By this time Ryan Lahfa had patched most of these in Lix. By the following week we had ported those patches, attempting to preserve history in spite of some codebase divergence. The major remaining problem was to address the long-standing issue regarding abstract Unix domain sockets. The various implementations explored and experimented with a few approaches. This took the most time. No clear or obvious best solution emerged, with each implementation taking a different approach. Then the patches were backported to earlier versions, with some rebasing, minor tweaks, and final review in preparation for release according to the timeline.
The biggest slowdown was due to dealing with the abstract AF_UNIX class of issues. From what I can tell of the various teams, exploring and testing a solution to this is what took the most time for everyone. It is interesting to see the various approaches (pasta, slirp4netns, LSM, etc.), and I am hopeful that a good long-term default solution will become clearer with more experience with the approaches.
Personally, I started working on porting the patches, but eventually reached my limits and needed to hand off polishing and finishing the effort to others, moving into more of a supporting role. I’m thankful that that was possible.
Thanks for the detail; it is great to hear that the Nix team have been working on mitigating the risks of abstract domain sockets in general!
However, as I understand it, these additional mitigations are not required for the immediate vulnerability of any user being able to escalate to root, and the Nix team have not yet shipped any such mitigation, right? So, given the statement that Nix had a fix for the immediate critical vulnerabilities within two weeks, I’m confused by this as an explanation of what caused the multiple-month gap in the timeline here.
If all parties to an embargo for a vulnerability this severe have fixes for the immediate issue that they feel confident in, it would not be normal to delay disclosure for months to explore further mitigations for things that have no known remaining exploits. Is the Nix team’s position that they did not fully fix the issue by the deadline due to the lack of mitigations for abstract domain sockets similar to what Lix deployed, and that Nix is currently still vulnerable? If so, that seems concerning by itself. Or did the Nix team not consider mitigating the abstract domain sockets issue to be necessary for disclosure and getting fixes out to users, but there was a request from another party to the embargo to delay fixes and disclosure to explore such additional hardening?
Again, I’m glad that the Nix team are working on additional mitigations! But I’m very confused by the communication around the gap in the timeline here, and I continue to feel worried that something went really wrong with the process – whether around understanding of the coordinated disclosure process, communication about readiness and what was considered blocking, or something else entirely.
I hope that is not the case! But if fixes were ready within weeks and Nix considered the additional mitigation around `build-dir` that @roberth mentioned necessary before disclosure, then I don’t understand what was happening between those two points, considering the simplicity of the commit. If the answer is that the Nix team considered the abstract domain socket mitigations necessary before disclosure, then should Nix still be considered vulnerable?
And if the explanation for the timeline running right up to the maximum deadline is neither of those things, then I don’t understand it yet, and I hope we can get elaboration, since delays in addressing serious security vulnerabilities like this can directly impact users, and I think it would be good for everyone to understand whether there are any lessons to be learned or remaining procedural problems that need additional mitigations of their own.