What guarantees do signatures by binary caches give?

mschwaig · October 30, 2023, 3:16pm

I have two questions about what it means to obtain a store path with a valid signature from a binary cache.

1. Does the signature relate the store path (input address) and the contents by signing both of them together, or is it just a signature of the content?
2. Is it a design goal of the signature to identify who originally built the package (so that if I set up my own binary cache which obtained some store paths from https://cache.nixos.org it will only contain that original signature and I can attribute the build process to https://cache.nixos.org in that way)?

vcunat · October 31, 2023, 8:24pm

1. Both, at least I think so. I’m too lazy to look for evidence.
2. I’m not sure whether I understand you correctly. When you verify a signature, you know which key you are using, and surely you know where that key came from. For example, from a .narinfo:
```
Sig: cache.nixos.org-1:5p4rJZA7peW79kmsna4JN8TQ986qexRRubaYiI8y+tkuX3Qn2me7kgDWcSQEOz0gfS46+tTILllz9cZZueCwCA==
```
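
As a small illustration of the line above: the Sig value looks like a key name and a base64-encoded signature separated by a colon. A minimal Python sketch that splits the two, assuming that layout (the helper name `parse_sig` is made up for this example and is not part of Nix):

```python
import base64


def parse_sig(value: str) -> tuple[str, bytes]:
    """Split '<key-name>:<base64-signature>' into its two parts."""
    # Assumption: the key name ends at the first colon.
    key_name, sep, sig_b64 = value.partition(":")
    if not sep:
        raise ValueError("expected '<key-name>:<base64-signature>'")
    return key_name, base64.b64decode(sig_b64)


sig_value = (
    "cache.nixos.org-1:"
    "5p4rJZA7peW79kmsna4JN8TQ986qexRRubaYiI8y+tkuX3Qn2me7kgDWcSQEOz0gfS46"
    "+tTILllz9cZZueCwCA=="
)
key_name, signature = parse_sig(sig_value)
print(key_name, len(signature))
```

The key name tells you which public key to check the signature against. It says nothing by itself about who ran the build.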

NickCao · November 1, 2023, 1:03am

1. Both, plus its references. For evidence, see path-info.cc.
2. Nope, only that the owner of the key trusts the path by some means: either they built the path themselves, or they felt lucky and signed it anyway (see nix store sign).
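
To make the first answer more concrete, here is a rough Python sketch of the string that gets signed (the "fingerprint"), as I understand it from path-info.cc. The exact layout (a `1;` version prefix, then store path, NAR hash, NAR size and comma-separated references) is an assumption that should be checked against the Nix source, and the store paths and hash below are hypothetical placeholders:

```python
def fingerprint(store_path: str, nar_hash: str, nar_size: int,
                references: list[str]) -> str:
    """Build the string covered by the signature (assumed layout, see above)."""
    if nar_size <= 0:
        raise ValueError("NAR size must be known to compute a fingerprint")
    # References are full store paths, emitted in sorted order.
    refs = ",".join(sorted(references))
    return ";".join(["1", store_path, nar_hash, str(nar_size), refs])


# Hypothetical placeholder values, not real store paths or hashes.
fp = fingerprint(
    "/nix/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-example-1.0",
    "sha256:" + "0" * 52,
    1234,
    [
        "/nix/store/cccccccccccccccccccccccccccccccc-libfoo-2.0",
        "/nix/store/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb-glibc-2.38",
    ],
)
print(fp)
```

Note what is absent from the inputs: neither the .drv file nor the identity of the machine that ran the build appears anywhere in the signed string.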

mschwaig · December 19, 2023, 1:07pm

Thanks for helping me with this.

I created a feature request for some additions that address the second point: “trusting substituters, but not *their* substituters (non-transitively)”, Issue #9644 in NixOS/nix on GitHub.

arianvp · August 14, 2024, 2:55pm

IMO it provides very few interesting guarantees at the moment.

If the signature covered both the derivation contents and all of its output contents, you would at least have some kind of “provenance” showing that a certain store path came from a certain derivation. We currently lack that entirely. And in the presence of fixed-output derivations, two different derivations can produce the same output path, which makes things kind of wacky. (I’ve observed many times on cache.nixos.org that the Deriver field in the narinfo doesn’t match the derivation on my local system for the same output path. Even scarier, I’ve found cases where the Deriver field points to a derivation that isn’t in the cache at all!) We’re currently missing the thing that cryptographically links a derivation to its outputs.

The only guarantee it currently gives is “it was signed”, which is kind of tautological. Honestly, I don’t see the point in the presence of TLS.

SergeK · August 14, 2024, 5:47pm

I’m struggling to interpret your message. Are you referring to the issue that the signature only accounts for the .drvs but not for the CA hashes of the actual built (“sampled”) dependencies that the builders spit out, like NAR hashes?

rickynils · August 28, 2024, 12:48pm

For reference, here are some more details on signing keys, and how they are used in nixbuild.net to “virtualise” the Nix store for users.

mschwaig · September 3, 2024, 10:40pm

@arianvp what you are saying about there being no kind of “provenance” is not correct.

Cloud build systems like Nix always associate some hash of the build inputs with some hash of the build outputs.
Nix does this in a trustworthy way by signing the address of the derivation together with the NAR hash.

@rickynils There is one thing in those linked docs, which I find a bit confusing.

> The fact that store paths are based solely on build inputs (input addressed) also has a disadvantage. There is no way to verify that the store path contents you download from a substituter actually was produced by the same Nix expressions you used when calculating the store path hash. You simply must trust that the substituter ran the build in an acceptable way, and not just stuffed the store path full with malware. This trust is what the Nix signing keys formalises.

To me, that makes it sound like a content addressed store would not suffer from the same problem.

Maybe I’m reading too much into that passage, but I wanted to debunk that implication:
Even with a content addressed store, you would need a trustworthy mapping from the input address to the content hash of the output if you want to consume any cached outputs in a secure manner.
Trusting such a signed mapping always means having to trust the builder with the build process itself.
The difference between input addressed and content addressed derivations in Nix is how the hash of the inputs is constructed.
Due to this difference, in input-addressed Nix you also have to trust the builder with dependency resolution, because the hash of the inputs contains unresolved dependencies (input addresses) and not resolved ones (content addresses).
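
A toy Python model of that argument, as a sketch only. HMAC with a shared key stands in for a real signature scheme here (Nix uses Ed25519 key pairs, which the Python standard library does not provide), and every name and value below is hypothetical:

```python
import hashlib
import hmac

# HMAC is only a stand-in for a real signature; the key is made up.
BUILDER_KEY = b"hypothetical-builder-key"


def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def sign_mapping(key: bytes, input_address: str, output_hash: str) -> str:
    """Sign the claim 'this input address maps to this output content hash'."""
    message = f"{input_address};{output_hash}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def accept(key: bytes, input_address: str, output_hash: str,
           signature: str, data: bytes) -> bool:
    """Accept a cached output only if the mapping is signed and the content matches it."""
    expected = sign_mapping(key, input_address, output_hash)
    mapping_ok = hmac.compare_digest(expected, signature)
    content_ok = content_hash(data) == output_hash
    return mapping_ok and content_ok


input_address = "hypothetical-input-hash"
good = b"honest build output"
evil = b"malicious build output"

good_sig = sign_mapping(BUILDER_KEY, input_address, content_hash(good))

# The honest output with its signed mapping is accepted.
honest_accepted = accept(BUILDER_KEY, input_address, content_hash(good), good_sig, good)

# Swapped content is self-consistent (its hash matches), but the mapping is unsigned.
swapped_accepted = accept(BUILDER_KEY, input_address, content_hash(evil), good_sig, evil)

# A builder holding the key can sign any mapping, so it must be trusted with the build.
evil_sig = sign_mapping(BUILDER_KEY, input_address, content_hash(evil))
dishonest_builder_accepted = accept(BUILDER_KEY, input_address, content_hash(evil), evil_sig, evil)

print(honest_accepted, swapped_accepted, dishonest_builder_accepted)
```

The content hash only protects the integrity of the bytes. Whether those bytes are the right output for a given input is still a claim made by whoever signs the mapping.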

I wrote about this stuff in more detail in a paper that I posted here: Extending cloud build systems to eliminate transitive trust

So far, I don’t know the security implications of what you are describing, @arianvp. I think I have to learn more about the deriver field to get it.

ElvishJerricco · September 3, 2024, 10:48pm

I think question #1 is questioning if the signature verifies the .drv too, but the link you provided does not include that in the fingerprint. Which makes sense, because you can verify the signature of a path that you’ve copied from another store without ever instantiating its .drv.

mschwaig · September 3, 2024, 11:26pm

In the original question I did not aim to ask about .drv at all, but I’m interested in learning more about this issue.

Maybe what’s unclear here is this: even if signatures prove a relationship between the input address, the NAR hash and the derivation which defined those other two (assuming I don’t misunderstand what the deriver is), how can those relationships be verified?

I think if you start from the derivation, you can compute the input address locally, and then use that to find the right signature linking those two, and that tells you that the three of them belong together. Meaning all is well and good.

But if you want to go the other way, from the metadata in the .narinfo, which is associated with the input address, to the derivation, that does not work reliably? Just guessing, but maybe because the same store path can actually be created from different derivations, so that the link in that direction is not unique?
Can this happen for specific kinds of derivations only, or for all of them?
Because I can totally see that being the case at least for fixed output derivations, not sure about the others.

mschwaig · September 3, 2024, 11:43pm

If that problem only occurs with fixed output derivations, in the sense that it’s not clear what produced their output, I think with how Nix works right now that’s not a signature problem, but a ‘we trust the hash inside the FOD’ problem.
It would actually be great if a trusted signature was required additionally in order to trust the FOD. I think that’s not the case right now, because otherwise updates to curl would cause mass redownloads. I wish the semantics were better, and I have some ideas.

arianvp · February 19, 2025, 4:48pm

It can also happen for any derivations whose only difference is which FODs they were given as inputs.

For example, in the example below, neither bar nor bar_ is a fixed-output derivation.

```nix
rec {

  foo = derivation {
    name = "foo";
    builder = "foo";
    system = "x86_64-linux";
    outputHash = "sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=";
  };

  foo_ = derivation {
    name = "foo";
    builder = "malicious script with a sandbox escape that installs a rootkis";
    system = "x86_64-linux";
    outputHash = "sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=";
  };

  bar = derivation {
    foo = foo;
    name = "bar";
    builder = "bar";
    system = "x86_64-linux";
  };

  bar_ = derivation {
    foo = foo_;
    name = "bar";
    builder = "bar";
    system = "x86_64-linux";
  };
}
```

This produces

```
$ nix-instantiate
/nix/store/3ng854555c9d50kpyggnlhj0zlh28q1j-bar.drv
/nix/store/jklxys67c6vrm5dsdyfqcj42bscaf3m0-bar.drv
/nix/store/sfjbr1ji7b9inis88hgi7mrbrnn5ywr1-foo.drv
/nix/store/z47qlj0s288q04l3z42d8fardqgmhlc1-foo.drv
```

but

```
$ nix-instantiate | xargs nix-store -qu
/nix/store/pxjjdmrlh36nivgc923jz03b1bz2wiwr-bar
/nix/store/pxjjdmrlh36nivgc923jz03b1bz2wiwr-bar
/nix/store/254159y72qk9fyp9xm5vjs7s1dj735l9-foo
/nix/store/254159y72qk9fyp9xm5vjs7s1dj735l9-foo
```

My point is: because our signatures do not sign over the Deriver: field in the .narinfo, we do not know whether the builder built the compromised foo_ or the okay foo. Thus, when we later build bar, we do not know whether the builder is in a pristine, non-backdoored state, and so we cannot trust the output of bar.

This is what I mean when I say that our signatures lack provenance tracking.
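
A rough Python sketch of why the Deriver field is invisible to the signature. It assumes the signed string consists only of the store path, NAR hash, NAR size and references (with a `1;` prefix), which should be checked against path-info.cc. The two Deriver values are the two bar .drv names from the listing above; the NarHash, NarSize and References values are hypothetical placeholders:

```python
STORE_DIR = "/nix/store"

# NarHash, NarSize and References are hypothetical placeholders.
NARINFO_TEMPLATE = """\
StorePath: /nix/store/pxjjdmrlh36nivgc923jz03b1bz2wiwr-bar
NarHash: sha256:{nar_hash}
NarSize: 112
References: 254159y72qk9fyp9xm5vjs7s1dj735l9-foo
Deriver: {deriver}
"""


def parse_narinfo(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines into a dict (later duplicates overwrite earlier ones)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def fingerprint_from_narinfo(fields: dict[str, str]) -> str:
    """Build the signed string from narinfo fields (assumed layout, see above)."""
    # narinfo lists references as base names; the fingerprint uses full paths.
    refs = sorted(f"{STORE_DIR}/{name}" for name in fields["References"].split())
    return ";".join(
        ["1", fields["StorePath"], fields["NarHash"], fields["NarSize"], ",".join(refs)]
    )


placeholder_hash = "0" * 52
a = parse_narinfo(NARINFO_TEMPLATE.format(
    nar_hash=placeholder_hash,
    deriver="3ng854555c9d50kpyggnlhj0zlh28q1j-bar.drv"))
b = parse_narinfo(NARINFO_TEMPLATE.format(
    nar_hash=placeholder_hash,
    deriver="jklxys67c6vrm5dsdyfqcj42bscaf3m0-bar.drv"))

same_fingerprint = fingerprint_from_narinfo(a) == fingerprint_from_narinfo(b)
print(same_fingerprint)
```

Under that assumption, one signature validates both narinfo files, so a verifier cannot tell from the signature which of the two derivations was actually built.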

arianvp · February 19, 2025, 8:58pm

The reason I am worried about this exact scenario is that I have had multiple occasions where the Deriver field did not match what I had locally while evaluating a package. That is, Nix was substituting a path for my derivation, but cache.nixos.org had a different derivation in the Deriver field.

We could debug what is happening if we uploaded derivations to the cache and signed over the Deriver field as well.

But we do neither.

The fact that I am detecting these issues actively and have no way to figure out what cache.nixos.org actually built is frustrating and kind of worrying too.