Where are the cryptographic signatures? (SHA256SUMS, Release.gpg)

Atemu · December 25, 2024, 9:18pm

Well, if you could determine this hash without using Nix, Nix would be quite useless. A lot of work goes into determining this hash, both computational and human.

Nix evaluates Nix expressions in order to build a DAG of nodes called “derivations”. Each derivation’s hash is the hash over all nodes of its incoming edges; the inputs to the derivation.
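To make the "hash over all inputs" idea concrete, here is a minimal sketch in Python. It is a toy model, not Nix's actual derivation hashing: the graph layout, the `builder` strings, and the function name `derivation_hash` are all assumptions made for illustration. The only point it shows is that changing any input changes the hash of everything downstream.

```python
import hashlib

def derivation_hash(name, graph, cache=None):
    """Hash a node together with the hashes of all of its inputs (toy model)."""
    if cache is None:
        cache = {}
    if name in cache:
        return cache[name]
    node = graph[name]
    h = hashlib.sha256()
    h.update(name.encode())
    h.update(b"\0")
    h.update(node["builder"].encode())
    # Sort inputs so the result does not depend on declaration order.
    for dep in sorted(node["inputs"]):
        h.update(b"\0")
        h.update(derivation_hash(dep, graph, cache).encode())
    cache[name] = h.hexdigest()
    return cache[name]

# Hypothetical three-node graph; names and build steps are placeholders.
graph = {
    "libc": {"builder": "build libc v1", "inputs": []},
    "ncurses": {"builder": "build ncurses v1", "inputs": ["libc"]},
    "vim": {"builder": "build vim v1", "inputs": ["libc", "ncurses"]},
}

before = derivation_hash("vim", graph)
graph["libc"]["builder"] = "build libc v2"  # change a transitive input
after = derivation_hash("vim", graph)
print(before != after)
```

Because the hash of `vim` depends on the fully evaluated graph, there is no way to predict it from the package name or version alone.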

The result of this evaluation is entirely determined by the data which the Nix expressions evaluate to and it cannot be determined without evaluating Nix expressions.

What you’re clicking through in Hydra is what Hydra has once evaluated from a set of Nix expressions.

Please look up a basic guide on how to search for Nix packages. It’s a bit of a mess right now.

It’s not really critical to what you’re attempting to achieve though because the vim derivation is just the vim attribute in the Nixpkgs package set.

Atemu · December 25, 2024, 9:23pm

Each and every thing that is fetched from the internet by Nix must have a hash provided ahead of time.

You cannot download arbitrary data at any point other than these fixed-output-derivations; Nix forbids that.

If you look at practically any package definition Nix expression, you’ll see this hash recorded for that package’s particular version.

maltfield · December 25, 2024, 9:24pm

Each and every thing that is fetched from the internet by Nix must have a hash provided ahead of time.

Hashes provide no assurance of authenticity in this type of attack, unless they’re signed.

NobbZ · December 25, 2024, 9:24pm

Again, we rely on a content hash of the sources.

A simple sha256. Not of the tarball, but its contents. Nix uses its own reproducible way to determine a hash of a filetree. This is outlined in the thesis IIRC.

Whether we receive this via HTTP, HTTPS, FTP, or otherwise is not relevant. And we are sure that no MITM meddled with the contents of the file tree, as then the hash wouldn’t be correct anymore.
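As a rough illustration of hashing a file tree rather than an archive, the sketch below hashes an in-memory tree deterministically. This is not Nix's real serialisation format; the nested-dict layout, the field encoding, and the name `tree_hash` are assumptions for the example. It only demonstrates the property being relied on: the same contents give the same hash regardless of ordering or transport, and any changed byte gives a different one.

```python
import hashlib

def tree_hash(tree):
    """Deterministically hash a nested dict of {name: bytes | dict}."""
    h = hashlib.sha256()
    for name in sorted(tree):  # fixed ordering, independent of insertion order
        entry = tree[name]
        if isinstance(entry, dict):
            kind, payload = b"dir", bytes.fromhex(tree_hash(entry))
        else:
            kind, payload = b"file", entry
        for field in (kind, name.encode(), payload):
            # Length-prefix every field so boundaries are unambiguous.
            h.update(len(field).to_bytes(8, "little"))
            h.update(field)
    return h.hexdigest()

source = {"README": b"hello\n", "src": {"main.c": b"int main(void) { return 0; }\n"}}
same_content = {"src": {"main.c": b"int main(void) { return 0; }\n"}, "README": b"hello\n"}
tampered = {"README": b"hello\n", "src": {"main.c": b"int main(void) { return 1; }\n"}}

print(tree_hash(source) == tree_hash(same_content))
print(tree_hash(source) == tree_hash(tampered))
```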

NobbZ · December 25, 2024, 9:24pm

Irrelevant.

Any signature that is not by the original creator is as good as no signature.

But feel free to go through your attack step by step. If the content is not the one expected, the hash will not be verifiable and therefore the build fails.

Also: if your attack would work against unsigned hashes, it will very likely also work for signed ones.

Atemu · December 25, 2024, 9:36pm

To expand on this further: The Hydra build farm must also be able to independently fetch what is dictated by the Nix expressions. None of us maintainers get to manually decide the exact content that hydra uses as a source, only its metadata (how to fetch it and the expected hash).

Even if anyone were to MITM the build farm in order to modify the downloaded content, the hashes would be wrong and the FOD builds would simply fail; causing all dependent builds to fail.

In order to maliciously modify a source of a package such that code derived from it lands on the system of actual users, an adversary must MITM:

The author of a PR changing a package’s source who sets the FOD hash based on the source content they download on their machine (or in some cases based on other records such as lock files)
The CI checking whether the PR’s changes can be built
Any potential reviewer attempting to build the changed package for themselves in order to test its functionality
Hydra building the package independently of all of the above

It’s not impossible but exceedingly unlikely. There are far more effective attack vectors to worry about IMHO.

TLATER · December 26, 2024, 7:48am

They could also introduce a malicious branch upstream. Then they would only have to impersonate the nixpkgs committer, and get a reviewer who doesn’t look too closely.

Though to be fair, by the time you’re performing supply chain attacks like that I don’t think source signatures like debian provides would help much anyway, just makes impersonating nixpkgs contributors subtly harder.

I’m curious now though, as you did not cover this afaict @maltfield - what exactly do source signatures protect against in the debian world? Maybe it’s easier to understand if we figure out the differences in the build/packaging process from that perspective.

maltfield · December 26, 2024, 3:50pm

This is trivial for an APT, unless the data is signed or you’re using a pinned certificate when communicating through the public Internet

I’m curious now though, as you did not cover this afaict @maltfield - what exactly do source signatures protect against in the debian world? Maybe it’s easier to understand if we figure out the differences in the build/packaging process from that perspective.

It prevents an attacker (doing a MITM) from delivering a maliciously-altered package into my system when I install it via apt. X.509 is horribly broken, and does not provide an assurance of authenticity. The signature (whose private key is not stored on the publishing infrastructure) would be invalid if the package were altered, and apt would refuse to install it.

waffle8946 · December 26, 2024, 4:07pm

But what is the mechanism of attack? Who cares if there’s MITM if the code delivered is identical?

Atemu · December 26, 2024, 4:07pm

I’m not sure what APT has to do with MITM’ing multiple independent parties at the same time.

As discussed before, this is mitigated/prevented using store path signatures.

The only possible methods to add a store path to your system as an unprivileged user are:

Result of a local execution that occurred on your (trusted) hardware
Fixed-Output-Derivation where the hash is fixed ahead of time by a trusted party and verified locally
Substitution by a trusted substituter (e.g. https://cache.nixos.org), verified using its public key

Unless an attacker has access to the private key of a trusted substituter or privileged access to your Nix store, there is no possible way for them to cause the creation of a valid store path with arbitrary content by way of network MITM in this model.

Additionally, there is also still TLS in the way of MITM’ing any network connection made by Nix.

7c6f434c · December 26, 2024, 4:37pm

Note that Nixpkgs, unlike Debian, does not try to review the upstream source code, although Debian has a mixed track record in terms of outcomes there. Also, we often default to pretty large dependency sets, with little support for trimming them down. Geodistributed MITM is on average harder than slipping a backdoor into a bugfix proposed to the upstream, so if you have a threat model including an APT, you probably need something more minimalistic and with more source review for core packages.

SpiderUnderUrBed · December 27, 2024, 12:05am

Would there be any benefit to introducing some form of OPTIONAL package signing for certain applications? During this discussion we did go over several unlikely but possible attack vectors. They are unlikely, but I think the lack of signing might discourage more people than just this package maintainer (I am referring to maltfield, who opened this thread to decide whether they want to package buskill for nix). @Atemu covered a few different ways a malicious actor could MITM, and so did @TLATER. I do want another opinion on whether X.509 is bad, and on whether it would be worth introducing other ways of signing, or whether there is any value in introducing signing as apt does it. The past RFCs on this got denied and the maintainers asked for smaller RFCs that cover this:

Incentivise signed commits
Requiring signed commits
Verifiable channel update integrity

I think someone should maybe write an RFC that goes over some of the things talked about in this discussion and implements one of the items listed above.

pyrox · December 27, 2024, 3:04am

My issue here is that there’s no clear definition of what the threat model is here. So to determine why additional security measures should be implemented, and what measures those are, it’s important to determine what we’re trying to secure against. Are we just trying to protect against malicious contributors? Nation state actors? This is a question that I think can provide some critical information on what we can look to in order to effectively increase the security provided by Nix, while also balancing any usability concerns that are then introduced by those mitigations.

The above suggestions from @SpiderUnderUrBed provide assurance only that no MiTM attacks were performed, which I would argue is not needed as long as our threat model does not need to concern itself with a compromised CA or a nation-state somehow MiTMing a secure channel (most source code in nixpkgs is fetched over https or otherwise encrypted connections, which reduces the risk of it being tampered with in transit outside of more specific circumstances such as compromised CAs or DNS poisoning, but even then, to serve bad code, the hash would have to match exactly. Considering that Nix uses SHA-256 hashes for source checksums, the chance of this is basically nonexistent, since the probability of hash collisions, even with billions of messages, is, for all practical purposes, nonexistent^[1]).
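For a sense of scale, the birthday bound on accidental collisions can be worked out directly. The sketch below is plain arithmetic under the assumption that SHA-256 behaves like an ideal 256-bit hash; the function name and the choice of one trillion messages are illustrative only. Note that an attacker who must match one specific pinned hash needs a second preimage, which is harder still than finding an arbitrary colliding pair.

```python
from fractions import Fraction

def collision_probability_bound(num_messages, hash_bits=256):
    """Upper bound (number of pairs / 2^bits) on any accidental collision."""
    pairs = num_messages * (num_messages - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)

# Assumed workload: one trillion distinct messages, far more than "billions".
bound = collision_probability_bound(10 ** 12)
print(float(bound))  # on the order of 1e-54
```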

I still fail to understand the exact security benefits that are trying to be specified by the posts in this thread. Binary package signatures(for things downloaded from caches like https://cache.nixos.org) already exist, and sources fetched externally(from github, sourceforge, gitlab, whatever) are almost always fetched over https or a similarly secure mechanism. Therefore, additional security measures seem redundant unless we are specifically trying to prevent compromised CAs(the last such attack happened in 2018^[2], and the root cert was removed from the Mozilla root CA store within ~7 months) or MiTM attacks performed by Nation-state attackers(which I’m not sure how we could ever prevent practically.)

My background is in practical cybersecurity and user experience, so I am trying to understand the additional needs that are lacking here that aren’t addressed by the current security measures. Further, any additional security measures need to be weighed against the needs of the larger Nix community, the willingness of someone to push these changes through the RFC system, and the usability impact for end users(which in an ideal world, there would be none of! That almost never happens, sadly).

[1] hash - Is it safe to ignore the possibility of SHA collisions in practice? - Stack Overflow
[2] Timeline of Certificate Authority Failures - SSLMate, could be inaccurate, I don’t have the energy to go find even more sources.

maltfield · December 27, 2024, 3:26am

As I’ve said a few times now, we’re trying to protect against someone who has control of a CA (or a subordinate CA) in the common root stores. Historically, this has happened several times. It includes APT, but it’s not limited to nation-state actors.

The fact that many in this community think that compromised CAs are unlikely is highly concerning. If you’re basing your authenticity on X.509 (or any key that resides live on your publishing infrastructure), then you’re going to cause lots of people harm.

waffle8946 · December 27, 2024, 3:52am

Okay, let’s assume CAs are compromised. Then, what are they going to do? You never addressed that, i.e. this is what we meant earlier by pointing out that you haven’t actually explained your threat model. Repeatedly handwaving that away with FUD isn’t much of an explanation.

Fixed-output derivations (FODs), generally used to fetch sources, in nixpkgs, are defined with a SHA-256 or SHA-512 hash. If nothing matching that hash + name is already present in the nix store, then nix will try to fetch the code, from the URL specified in the FOD. If someone MITMs your connection, at the end of the day, they have to somehow deliver malicious code that matches that SHA-256 or SHA-512 hash, i.e. collision attack.
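The fetch-then-verify flow described above can be sketched as follows. This is a simplified model and not Nix's implementation: the dictionary standing in for the store, the `fetch` callback, and the names `fetch_fixed_output` and `HashMismatch` are all hypothetical. It shows that the URL and transport never enter the trust decision; only the pinned hash does.

```python
import hashlib

class HashMismatch(Exception):
    """Raised when fetched content does not match the pinned hash."""

def fetch_fixed_output(name, expected_sha256, fetch, store):
    key = (name, expected_sha256)
    if key in store:
        return store[key]  # already present: no network access at all
    data = fetch()  # may be served by anyone, including a MITM
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise HashMismatch(f"{name}: expected {expected_sha256}, got {actual}")
    store[key] = data
    return data

genuine = b"contents of the upstream release"
pinned = hashlib.sha256(genuine).hexdigest()  # recorded in the package definition

store = {}
fetched = fetch_fixed_output("example-src", pinned, lambda: genuine, store)

try:
    # Same pinned hash, but the network hands back altered bytes.
    fetch_fixed_output("example-src", pinned, lambda: b"tampered by a MITM", {})
    tamper_detected = False
except HashMismatch:
    tamper_detected = True

print(tamper_detected)
```

In this model a MITM can make the fetch fail, but cannot get altered content accepted without producing bytes that hash to the pinned value.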

For now, although there is a theoretical attack on reduced-round SHA-256 (using 31 instead of 64 steps, as of this year), we do not consider collision attacks on SHA-2 to be a reasonable possibility in the next few years. Naturally, we will probably want to open up the possibility to switch to a more secure hashing algorithm, and in any case, if SHA-2 is cracked, a lot more than just nixpkgs would fall.

If malicious code is inserted upstream, and we bump to that version, sure, we wouldn’t catch that, since we don’t look at that code in general. But I fail to see code signing preventing that scenario. And of course, if we never bump to that version, users will never get the malicious code until we do.

pyrox · December 27, 2024, 3:55am

What evidence is there to suggest otherwise? The last documented breach that resulted in a CA being compromised was, as I mentioned, in 2018, more than 6 years ago at the time of writing. That is more than long enough to suggest that this is not exactly an everyday occurrence.

I don’t see where X.509 fits into this. Nix packages are signed by the binary cache, with an ed25519 key. How else can we prove package authenticity than with a key we control that resides on infrastructure we control? In fact, Debian does the same thing^[1], using packages signed by the server, which is the exact same mechanism that Nix uses. I fail to see how Debian prevents this problem in a way that Nix does not.

[1] I would even argue that Debian’s package signing is worse, relying on OpenPGP/GPG for its package signing, which is a bad idea in 2024 and was a bad idea in 2019.

TLATER · December 27, 2024, 6:35am

I mean, what @waffle8946 explains is way more important here; nix has a different mitigation for CAs falling, or MITM attacks, or any number of exploits that would modify build sources, that at face value seems just as valid as source signatures. Nix doesn’t rely on the CA at all - in fact, it did not verify https certs until people realized that this potentially leaks auth credentials for basic auth-protected sources.

This is why I’m curious about the need for this in the debian world; @maltfield claimed earlier in the thread that an unsigned hash is not sufficient, neglecting to explain why (though I think we all appreciate a hash is needed, I just don’t understand why it has to be signed) - knowing the debian community, that’s probably been established fact for a decade or two, after a room of greybeards mulled over it for two days at a conference.

I’d really like to understand why this is considered insufficient in the context of the debian threat model, because I think that would allow us to put nix’ threat model into debian’s context, which might help bridge the gap in understanding from @maltfield’s perspective.

The odds that there’s a gaping hole here somehow not considered by thousands of build reproducibility nerds are slim, of course, but it’s at least interesting to find out if this is just a difference in process that gives rise to different needs or if the difference lies in a different underlying threat model. My bet is on the former, but it’s way harder to browse debian mailing lists than nix source code. And hey, we’ve clearly overlooked gaping security holes around this before.

TLATER · December 27, 2024, 7:27am

Hmm, after thinking some more, I think there is a missing link where we rely only on CAs - at least with nix flakes, you have to trust that github.com is actually serving the real nixpkgs repository whenever you nix flake update.

If that is MITM’d, nix will blindly trust any hashes served up by the attacker in subsequent builds, and users would be unlikely to spot the changes (unless they think it’s weird that a package that usually doesn’t build is building, but that also just happens from time to time if hydra fails). Package signatures are irrelevant in this case, because nix will never download anything from the cache here, it just runs a build.

Debian doesn’t rely on this trust, since an impersonator could not provide correct signatures for any substituted packages, unless the initial installation was compromised and included false certificates, and it won’t just build stuff it cannot substitute, but that seems much more prone to being discovered.

I have to admit that I never looked closely enough at how nix channels are hosted to know if there are other signatures there, but I don’t believe there are.

This isn’t a weakness in the same vein as what we’d usually consider in the context of build security and reproducibility, so I can see how this would be overlooked, or hand-waved as “we can trust CAs”. Commit signatures could indeed help mitigate such an attack. Maybe this is what @SpiderUnderUrBed means?

I’ve never been involved in whatever groups consider the threat model for NixOS, so I might be missing details, but I do know that there’s a general assumption that GitHub is not malicious.

RaitoBezarius · December 27, 2024, 7:40am

Note that regarding this, some folks explored the possibility to equip fetchers with commit signature verification.

Guix also uses commit signature chaining of their expressions code to close that gap.

TLATER · December 27, 2024, 7:55am

Thanks for that pointer, looks like features around this are actually well underway: Commit Signature Verification by flandweber · Pull Request #8848 · NixOS/nix · GitHub

Seems the discussion around this goes way back, too:

And indeed, the earlier responses were “but we can trust CAs!”.

Likely going to be a while before this could be part of the NixOS deployment story, though.