A common public nix cache?

The uploader doesn’t trust all downloaders.

For example, you use nix to provision your servers. If the nix store and/or cache were public and someone learned the nix path (e.g. from an error message), they could download the code and look for secrets or vulnerabilities.

The downloader doesn’t trust all uploaders.

By configuring a nix cache as a substituter on my machine, I am essentially fully trusting those who can upload to that cache. If they were malicious, they could upload a malicious store path and maybe for my next system upgrade, I’ll fetch their bad package.

Just wanted to say that it hasn’t been true for a while that you can access other people’s builds in the garnix cache if you know the hash. We use tokens, and base access on whether you have access on GitHub to any of the repos that built it. Even somewhat safer than separate caches, because there aren’t two sources of truth about who has access and access is the union of both sets.
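To make the access rule concrete, here is a minimal sketch of "you can fetch a build if you have GitHub access to any repo that built it". It is only an illustration, not garnix's actual implementation; `can_download` and the repo names are hypothetical.

```python
def can_download(user_repos: set[str], built_by_repos: set[str]) -> bool:
    """Return True if the user can access at least one repo that built the path.

    user_repos: repos the token's owner can access on GitHub (assumed input).
    built_by_repos: repos whose CI runs produced this store path (assumed input).
    """
    # Access is granted when the two sets share at least one repo.
    return not user_repos.isdisjoint(built_by_repos)


# Hypothetical example data.
print(can_download({"acme/infra", "acme/site"}, {"acme/infra"}))
print(can_download({"other/project"}, {"acme/infra"}))
```

Because the check is derived from GitHub access alone, there is no second access list to keep in sync.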

We also never allowed people to upload directly - we only upload what we build. So I think 2 is actually much better in the single-cache approach.


I think there is a different, more interesting point in the design space than what you’re suggesting, which is what I am building in my ‘in active development’ project laut, based on my paper Extending Cloud Build Systems to Eliminate Transitive Trust.

The problem with consensus protocols

If the consensus protocol is just an implementation detail of the ledger, meaning the ledger does not care what you put on it, that’s fine and does not cause any issues. Trying to find consensus over mappings from build inputs to build outputs directly on the ledger, however, means that all consumers of the cache are subject to a common notion of whom they trust, and all of the builds need to be reproducible (so there is something to have consensus about). This also generally makes ecosystem evolution and migration harder. If the set of builders in the consensus mechanism is open, that is also what introduces the need for anti-collusion features.

The alternative

The alternative is that consumers aggregate mappings from build inputs to build outputs from a set of builders they trust, combined with & and | operations, using a policy engine, totally independently of each other.
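As a rough illustration of such a consumer-side policy, here is a minimal Python sketch. It is not the Trustix or laut API; the names `Builder`, `Both`, `Either`, and `resolve`, and the shape of the `claims` dictionary, are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

# Hypothetical data model: builder name -> {build input hash -> output hash}.
Claims = Dict[str, Dict[str, str]]


class Policy:
    """A trust policy; evaluate() returns the set of outputs it accepts."""

    def evaluate(self, claims: Claims, build_input: str) -> Set[str]:
        raise NotImplementedError

    def __and__(self, other: "Policy") -> "Policy":
        return Both(self, other)

    def __or__(self, other: "Policy") -> "Policy":
        return Either(self, other)


@dataclass
class Builder(Policy):
    name: str

    def evaluate(self, claims: Claims, build_input: str) -> Set[str]:
        # Accept whatever this single builder claims, if it claims anything.
        output = claims.get(self.name, {}).get(build_input)
        return {output} if output is not None else set()


@dataclass
class Both(Policy):
    left: Policy
    right: Policy

    def evaluate(self, claims: Claims, build_input: str) -> Set[str]:
        # Only outputs accepted by both sub-policies survive.
        return self.left.evaluate(claims, build_input) & self.right.evaluate(claims, build_input)


@dataclass
class Either(Policy):
    left: Policy
    right: Policy

    def evaluate(self, claims: Claims, build_input: str) -> Set[str]:
        # Outputs accepted by at least one sub-policy survive.
        return self.left.evaluate(claims, build_input) | self.right.evaluate(claims, build_input)


def resolve(policy: Policy, claims: Claims, build_input: str) -> Optional[str]:
    """Return the accepted output, or None if there is none or a conflict."""
    outputs = policy.evaluate(claims, build_input)
    return next(iter(outputs)) if len(outputs) == 1 else None


# Hypothetical claims: alice and bob agree, carol reports a different output.
claims = {
    "alice": {"in1": "outA"},
    "bob": {"in1": "outA"},
    "carol": {"in1": "outB"},
}
strict = Builder("alice") & (Builder("bob") | Builder("carol"))
loose = (Builder("alice") & Builder("bob")) | Builder("carol")
print(resolve(strict, claims, "in1"))
print(resolve(loose, claims, "in1"))
```

In this sketch each consumer evaluates its own policy locally, so two consumers can read the same claims and reach different conclusions. Treating conflicting outputs as "no answer" in `resolve` is one possible design choice, not the only one.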

Trustix, as designed and implemented, goes the consumer-side policy engine route, and my project does the same.

What I’m doing differently is dropping the ledger part from the initial requirements and implementing the other parts as a signature format only. I think it will be better to design and implement putting this on a ledger later instead. Besides Trustix, other people are also working on suitable ledgers.

I’m also explicitly decoupling dependency resolution from building, which is what enables various consumers of caches to trust different sets of builders without having to worry about which caches those builders in turn trust. Effectively, you don’t need to trust the builder with dependency resolution anymore, because the consumer also does it independently. This is what you get from content-addressed derivations, if you are pedantic enough about how dependency resolution has to work. There is also ongoing work to make Nix pedantic like that. Trustix does not do this; it could, but it would depend on CA derivations and related things in the same way my project does.

There are other aspects to what I’m doing. There are technically ways to get around the CA derivations requirement, because the signed representation does not have to exactly match the one in the store, but I think this post is already long enough.