What guarantees do signatures by binary caches give?

I have two questions about what it means to obtain a store path with a valid signature from a binary cache.

  1. Does the signature relate the store path (input address) and the contents by signing both of them together, or is it just a signature of the content?

  2. Is it a design goal of the signature to identify who originally built the package (so that if I set up my own binary cache which obtained some store paths from https://cache.nixos.org it will only contain that original signature and I can attribute the build process to https://cache.nixos.org in that way)?

  1. Both. At least I think so; I’m too lazy to look for evidence.
  2. I’m not sure whether I misunderstand you. When you verify a signature, you know which key you used, and surely you know where that key came from. For example, from a .narinfo:
    Sig: cache.nixos.org-1:5p4rJZA7peW79kmsna4JN8TQ986qexRRubaYiI8y+tkuX3Qn2me7kgDWcSQEOz0gfS46+tTILllz9cZZueCwCA==
    
  1. Both, plus its references; for evidence, see path-info.cc.
  2. Nope, just that the owner of the key trusts the path by some means (either they built the path themselves, or they felt lucky and signed it anyway; see nix store sign).
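
To make point 1 concrete, here is a minimal sketch of the string that gets signed, as far as I understand path-info.cc: a version tag, the store path, the NAR hash, the NAR size and the sorted references, joined with semicolons. The `PathInfo` type and the function names are made up for illustration, and the store paths and hash in the usage example are placeholders, not real cache entries. Actually checking the Ed25519 signature needs a crypto library and is left out.

```python
import base64
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PathInfo:
    """Subset of .narinfo fields covered by the signature (illustrative type)."""
    store_path: str   # e.g. /nix/store/<hash>-<name>
    nar_hash: str     # "sha256:<base32 digest>"
    nar_size: int     # size of the NAR serialisation in bytes
    references: frozenset = field(default_factory=frozenset)  # full store paths


def fingerprint(info: PathInfo) -> str:
    """Build the message that is signed: path, content hash, size and references."""
    refs = ",".join(sorted(info.references))
    return f"1;{info.store_path};{info.nar_hash};{info.nar_size};{refs}"


def split_sig(sig: str) -> tuple[str, bytes]:
    """Split a narinfo 'Sig' value into the key name and the raw signature bytes."""
    key_name, _, encoded = sig.partition(":")
    return key_name, base64.b64decode(encoded)


if __name__ == "__main__":
    # Placeholder values, only to show the shape of the fingerprint.
    info = PathInfo(
        store_path="/nix/store/aaaa-hello",
        nar_hash="sha256:bbbb",
        nar_size=1234,
        references=frozenset({"/nix/store/cccc-glibc", "/nix/store/aaaa-hello"}),
    )
    print(fingerprint(info))
    key_name, raw = split_sig(
        "cache.nixos.org-1:5p4rJZA7peW79kmsna4JN8TQ986qexRRubaYiI8y+tkuX3Qn2me7kgDWcSQEOz0gfS46+tTILllz9cZZueCwCA=="
    )
    print(key_name, len(raw))
```

Note that nothing about the .drv appears in this string, so the signature ties the store path to its contents and references, not to a particular derivation file.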

Thanks for helping me with this.

I created a feature request for some additions that address the second point: trusting substituters, but not *their* subsituters (non-transitively) · Issue #9644 · NixOS/nix · GitHub

IMO it provides very few interesting guarantees at the moment.

If the signature covered both the derivation content and all of its output contents, you would at least have some kind of “provenance” showing that a certain store path came from a certain derivation. We currently lack that entirely. And in the presence of fixed-output derivations, two different derivations can produce the same output path, which makes things kind of wacky. (I’ve observed many times on cache.nixos.org that the Deriver field in the narinfo doesn’t match the derivation on my local system for the same output path. Even scarier, I’ve found cases where the Deriver field points to a derivation that isn’t in the cache at all!) We’re currently missing the thing that cryptographically links a derivation to its outputs.

The only guarantee it currently gives is “it was signed”, which is kind of tautological. Honestly, I don’t see the point in the presence of TLS.


I’m struggling to interpret your message. Are you referring to the issue that the signature only accounts for the .drvs, but not for the content hashes (such as NAR hashes) of the actual built (“sampled”) dependencies that the builders spit out?


For reference, here are some more details on signing keys, and how they are used in nixbuild.net to “virtualise” the Nix store for users.

@arianvp what you are saying about there being no kind of “provenance” is not correct.

Cloud build systems like Nix always associate some hash of the build inputs with some hash of the build outputs.
Nix does this in a trustworthy way by signing the address of the derivation together with the NAR hash.

@rickynils There is one thing in those linked docs, which I find a bit confusing.

The fact that store paths are based solely on build inputs (input addressed) also has a disadvantage. There is no way to verify that the store path contents you download from a substituter actually was produced by the same Nix expressions you used when calculating the store path hash. You simply must trust that the substituter ran the build in an acceptable way, and not just stuffed the store path full with malware. This trust is what the Nix signing keys formalises.

To me, that makes it sound like a content addressed store would not suffer from the same problem.

Maybe I’m reading too much into that passage, but I wanted to debunk that implication:
Even with a content addressed store, you would need a trustworthy mapping from the input address to the content hash of the output if you want to consume any cached outputs in a secure manner.
Trusting such a signed mapping always means having to trust the builder with the build process itself.
The difference between input addressed and content addressed derivations in Nix is how the hash of the inputs is constructed.
Due to this difference, in input addressed Nix you also have to trust the builder with dependency resolution, because the hash of the inputs contains unresolved dependencies (input addresses) and not resolved ones (content addresses).

I wrote about this stuff in more detail in a paper that I posted here: Extending cloud build systems to eliminate transitive trust

So far, I don’t know the security implications of what you are describing, @arianvp. I think I have to learn more about the deriver field to get it.


I think question #1 is asking whether the signature verifies the .drv too, but the link you provided does not include that in the fingerprint. Which makes sense, because you can verify the signature of a path that you’ve copied from another store without ever instantiating its .drv.

In the original question I did not aim to ask about the .drv at all, but I’m interested in learning more about this issue.

Maybe what’s unclear here is this: even if signatures prove a relationship between the input address, the NAR hash and the derivation that defined those other two (assuming I don’t misunderstand what the deriver is), how can those relationships be verified?

I think if you start from the derivation, you can compute the input address locally, and then use that to find the right signature linking those two, and that tells you that the three of them belong together. Meaning all is well and good.

But if you want to go the other way, from the metadata in the .narinfo, which is associated with the input address, to the derivation, that does not work reliably? Just guessing, but maybe because the same store path can actually be created from different derivations, so that the link in that direction is not unique?
Can this happen for specific kinds of derivations only, or for all of them?
Because I can totally see that being the case at least for fixed output derivations, not sure about the others.
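
To illustrate why I can see it for fixed output derivations, here is a toy model. It is deliberately not Nix’s real store path computation (which uses its own string layout, hash truncation and base32 alphabet); all function names are made up. The only property it is meant to show is that an input addressed path depends on the whole build recipe, while a fixed output path depends only on the declared output hash and the name.

```python
import hashlib

STORE_DIR = "/nix/store"


def _toy_path(seed: str, name: str) -> str:
    """Toy stand-in for store path construction: hash a seed string, keep 32 hex chars."""
    digest = hashlib.sha256(seed.encode()).hexdigest()[:32]
    return f"{STORE_DIR}/{digest}-{name}"


def toy_input_addressed_path(name: str, builder: str, args: list[str]) -> str:
    """Input addressed: the whole recipe (builder, arguments) feeds into the path."""
    seed = "input:" + "\0".join([name, builder, *args])
    return _toy_path(seed, name)


def toy_fixed_output_path(name: str, builder: str, args: list[str], declared_hash: str) -> str:
    """Fixed output: only the declared output hash and the name feed into the path.

    The builder and args are accepted but intentionally ignored.
    """
    seed = f"fixed:{declared_hash}:{name}"
    return _toy_path(seed, name)


if __name__ == "__main__":
    declared = "sha256:placeholder-hash"  # hypothetical declared output hash
    via_curl = toy_fixed_output_path("src.tar.gz", "curl", ["https://example.org/a"], declared)
    via_wget = toy_fixed_output_path("src.tar.gz", "wget", ["https://example.net/b"], declared)
    print(via_curl == via_wget)  # same path, different derivations

    a = toy_input_addressed_path("hello", "bash", ["build-a.sh"])
    b = toy_input_addressed_path("hello", "bash", ["build-b.sh"])
    print(a == b)  # different recipes, different paths
```

In this model, any number of different fetchers map to the same fixed output path, so a Deriver recorded next to that path can only ever name one of them.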

If that problem only occurs with fixed output derivations, in the sense that it’s not clear what produced their output, then I think with how Nix works right now that’s not a signature problem, but a ‘we trust the hash inside the FOD’ problem.
It would actually be great if a trusted signature were additionally required in order to trust the FOD. I think that’s not the case right now, because otherwise updates to curl would cause mass redownloads. I wish the semantics were better, and I have some ideas.