Peer-to-peer binary cache RFC/working group/poll

I would like to add (unless mistaken) that the nature of the binary cache is that derivation outputs are meant to be reproducible. So anyone in the swarm can reintroduce the data on demand. We don’t need fallback storage for that.

To be doubly clear, we are talking about build output artifacts which can always be recomputed, not source inputs which are considered valuable in cases where the origin URL is no longer reachable.

Was the issue that we can’t easily distinguish valuable inputs that came from the Internet and are still in cache.nixos.org?

2 Likes

I would like to add (unless mistaken) that the nature of the binary cache is that derivation outputs are meant to be reproducible. So anyone in the swarm can reintroduce the data on demand. We don’t need fallback storage for that.

Ideally yes, but in practice a lot of derivations are not bit-for-bit reproducible. So while you may be able to reproduce a functionally identical build artifact, it would likely be a different file.

To be doubly clear, we are talking about build output artifacts which can always be recomputed, not source inputs which are considered valuable in cases where the origin URL is no longer reachable.

We are talking about both: the cache also keeps the source archives. As it frequently happens that URLs break and sources become unavailable, this is actually the invaluable part of the cache.
For example, I just adopted an unmaintained piece of software that was recovered from the binary cache: the author’s website and repository were lost.

3 Likes

Right. On that note, I wonder if it isn’t too unreasonable to also take this valuable data and offer it to https://archive.org/ and/or https://www.softwareheritage.org/, on top of distributing it with the swarm, if possible.

Doesn’t Nix support attempting to fetch from known mirrors like the Internet Archive? I think I remember packages having a mirror:// scheme.

Actually, I think I remember a Nix blog post about a collaboration with Software Heritage, from which I first found out about them. It was about reaching 100% reproducibility.

EDIT: Long-term reproducibility with Nix and Software Heritage & Expanding coverage of the archive: welcome Nixpkgs! – Software Heritage

1 Like

Right. On that note, I wonder if it isn’t too unreasonable to also take this valuable data and offer it to https://archive.org/ and/or https://www.softwareheritage.org/, on top of distributing it with the swarm, if possible.

They would probably be interested, yes, but the data needs to be presented in an appropriate way.
I think it would be quite a lot of work to select what’s worth preserving and attach useful metadata to the raw cached paths.

That’s just a shortcut for using multiple URLs or domains, which are stored in pkgs/build-support/fetchurl/mirrors.nix with per-project definitions. I think using a global mirror is possible, but it would require mapping between different naming conventions (i.e. something like domain/project-name/package-version.tar.xz).
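For context, the scheme looks like this inside a package definition; fetchurl then expands mirror://gnu/… against the “gnu” mirror list in mirrors.nix (the hash below is a placeholder):

```nix
# Minimal sketch of the existing mirror:// mechanism in nixpkgs.
{ lib, fetchurl }:

fetchurl {
  # Expanded against the "gnu" entry in pkgs/build-support/fetchurl/mirrors.nix.
  url = "mirror://gnu/hello/hello-2.12.1.tar.gz";
  # Placeholder; a real package pins the exact source hash here.
  hash = lib.fakeHash;
}
```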

This mapping could be done gradually by package maintainers wishing to help clean up the cache, if it’s a pressing matter.

Perhaps documenting the process could be great for preservation efforts.

Wasn’t someone working on an analyser to list dead sources in the cache?

1 Like

If this storage network loses the last copy of an artifact, it has failed. Rebuilding derivations that are bit-for-bit reproducible so that the hydra.nixos.org signature still validates is a neat trick, and might be used in a sophisticated way to set the redundancy target of different artifacts (either as fun tuning and tinkering after everything else is working, or as a response to volunteer capacity being too low to support the full archive at full redundancy), but I think we should first attempt to find/assemble a mechanism that Just Works as a simple storage/serving service.

I was kind of hoping that we could make a thing out of:

  1. A p2p daemon that volunteer cache providers run that just coordinates IPFS pins.
    • It would have some signal of trustworthiness of cache providers (maybe just participation longevity) that it uses for shard balancing, to avoid the scenario where N brand new transient nodes come online, are given responsibility for a new artifact, and all N of them go away, losing that artifact.
  2. Cache providers then just run normal IPFS that does all the content acquisition, storage, and serving, with its pins managed by #1.
  3. Nix is extended to fetch from multiple sources in parallel (see the configuration sketch after this list).
    • I.e., attempt to fetch a thing, and if it hasn’t made reasonable progress in 2 seconds (configurable), without giving up on the first source, also begin fetching the same artifact from a second source. And after 5 seconds, a third source, etc. As soon as the fetch from any source finishes, the other concurrent requests are cancelled.
    • Bonus if this can be done at a fixed-size-block level rather than with entire NARs.
  4. (And it sounds like maybe an IPFS lookup caching service if normal IPFS DHT lookups are too slow?)
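For what it’s worth, point 3 already has a partial foundation: Nix can be pointed at several substituters today, it just consults them in priority order instead of racing them. A client opting into a swarm gateway might look roughly like this on NixOS (the p2p gateway URL is hypothetical; the cache.nixos.org key is the well-known default):

```nix
{
  nix.settings = {
    # Currently consulted in order; the proposal above would race them instead.
    substituters = [
      "https://cache.nixos.org"
      "http://p2p-gateway.local:8080"  # hypothetical local IPFS/p2p gateway
    ];
    # NARs are only accepted when signed by a trusted key,
    # regardless of which source actually served the bytes.
    trusted-public-keys = [
      "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
    ];
  };
}
```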

I was also hoping the IPFS community would just have a component that does #1, either ready to go or in development, but I don’t see one. I bet the IPFS community would love to have such a component. Maybe we could build it together.

Edit: There is an IPFS component for #1: ipfs-cluster! Specifically, the collaborative clusters feature.

So maybe the only new functionality we need to implement is IPFS fetching and parallel fetching in nix? See also Obsidian Systems is excited to bring IPFS support to Nix

3 Likes

Could Tahoe-LAFS be suitable? It is designed to handle sharding, distribution, redundancy and more, although it probably does not scale to a world-wide network (managing redundant shard distribution for resiliency and performance). That problem might be solvable by using region-based mirrors of Tahoe filesystems, which would provide even more resilient redundancy.

1 Like

Tahoe-LAFS is designed with a relatively stable set of nodes in mind. We have another ongoing effort to self-host the cache on bare metal and we might use it there.

That’s a bit of the problem regarding availability on a full P2P network; if you can’t reason about the probability of the nodes going down, the only way to compensate is to increase the number of duplicates, which then increases the total cost of storage proportionally.

Where P2P could truly shine is by becoming part of the distribution network. Even if the discovery only happens on the local network, it could help organizations and clusters to both get the NAR files faster and also save up on Internet bandwidth. This is something Microsoft is also using to distribute Windows updates.

The scheme doesn’t need to be too fancy either; have hosts discover each other through rendez-vous on the local network, and then query all of the hosts with a set of requested hashes. It’s boring, and probably quite effective.

8 Likes

I only just noticed this thread. Here are my two cents, in reply to another post.

1 Like

TLDR: I fully believe in a hybrid validator/peer-to-peer solution.
cache.nixos.org is the validator that keeps copies of the official build hashes.
The peer-to-peer system dynamically supplies the storage and bandwidth.

In the end we need a real-world experiment to start doing this and collect data on how well the system runs.

Seeders easily opt into the swarm thanks to a new services.nix-serve-p2p.enable: bool = false Nix option enabling a service to join the swarm. This could be added as a comment in the generated config to raise awareness and help it more easily gain traction.

I think first we should start with an experimental feature: services.experimental.nix-serve-p2p.enable: bool = false
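For concreteness, here is a sketch of what opting in as a seeder could look like, assuming such a module existed; none of these options exist today, and the names and defaults are purely illustrative:

```nix
{
  # Hypothetical module proposed above; nothing here exists in nixpkgs today.
  services.nix-serve-p2p = {
    enable = true;
    maxStorage = "50G";  # illustrative: cap on local disk donated to the swarm
    onlySigned = true;   # illustrative: only re-seed paths signed by cache.nixos.org
  };
}
```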

Real-world testing can help us focus our efforts in the correct direction.

In the short term, cache.nixos.org will have to keep holding on to all its data until we can see how the peer-to-peer system really works in the real world. A slow transition is most likely the only way forward. But it is nice to see we have 12 months to breathe easy:
NixOS S3 Short Term Resolution!

I believe there is enough community support to start the experiment.
We can easily talk about it, but a lot of issues raised might not even be much of a problem when the rubber meets the road…

6 Likes

I just had a thought regarding the storage guarantee.
Perhaps cache.nixos.org should monitor the health of a given package in the peer-to-peer system. A dynamic garbage collection model.

If there are fewer than X nodes holding a given package, then cache.nixos.org must retain a full copy.
But if the node count for a given package is really high, then cache.nixos.org can garbage-collect that package from its own storage, as it is confident that it is fully available in the wild.
Then, if the node count starts to drop, it can re-acquire that package for archiving purposes.

Yes, there is a small risk of losing a package for good, but that requires losing everything: the cache, the source URL going down, the original source code being lost, and no one having an old copy to reupload to a new URL. Such a failure cannot really be our fault. This is just life. Sometimes things get lost with no way of recovering them. In this modern age where everything “needs to be preserved for the historical record”, we can at times end up sacrificing the future for the past. Not a good way to live. Not a good way to run a cache (a temporary file store).
We are not the archive of the world.

3 Likes

Regarding IPFS, I’d like to link some info about the last experiment I’m aware of:
https://github.com/NixIPFS/nixipfs-scripts/issues/11#issuecomment-1520711673

6 Likes

TLDR: I fully believe in a hybrid validator/peer-to-peer solution.
cache.nixos.org is the validator that keeps copies of the official build hashes.
The peer-to-peer system dynamically supplies the storage and bandwidth.

I like this model very much. It makes sense to not try to solve the problems around distributing storage and trust at the same time. If we keep the “centralized trust” model of cache.nixos.org, much of the infrastructure (i.e. Hydra) can remain basically the same. Once the p2p distribution of NAR files (based on their hashes) is working, nothing is stopping us from also working on distributing the trust in some way (like CA-derivations, trusted builders, Trustix, etc).

9 Likes

I’d like to offer up the idea of a Trust DB here. Right now the mapping from input hash to NAR file is implicit in the NixOS cache.

However, by simply publishing a mapping of input hash → content hash, a user can choose to trust a certain publisher, and then it doesn’t matter what cache system you use; the hash can be verified locally.

This is orthogonal to any other solutions. Having the outputs be CA is nice but not required. You don’t need to know what the input hash means, what attribute makes it, etc.

3 Likes

However, by simply publishing a mapping of input hash → content hash, a user can choose to trust a certain publisher, and then it doesn’t matter what cache system you use; the hash can be verified locally.

Is this not just the narinfo files? When substituting, Nix already works in two phases. First it fetches the narinfo file for a store path from a substituter (cache.nixos.org). The narinfo contains the content hash of the NAR and the URL where the NAR can be found. In the second phase, Nix fetches the NAR file itself. The URL is always assumed to be a sub-path of the substituter’s domain, but you could imagine just querying some p2p network for the content hash in that second phase instead.
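As a concrete illustration, a narinfo served by cache.nixos.org looks roughly like this (all hashes, sizes and the signature below are shortened, illustrative placeholders):

```
StorePath: /nix/store/hhhh...hhhh-hello-2.12.1
URL: nar/ffff...ffff.nar.xz
Compression: xz
FileHash: sha256:ffff...ffff
FileSize: 50264
NarHash: sha256:nnnn...nnnn
NarSize: 226560
References: hhhh...hhhh-hello-2.12.1 gggg...gggg-glibc-2.37-8
Sig: cache.nixos.org-1:base64signature...
```

Only the URL is interpreted relative to the substituter; everything needed for verification (the NarHash and the Sig) travels with the metadata, which is what makes plugging in other NAR sources conceptually straightforward.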

3 Likes

Nix currently has two ways to address a single “thing”.

  1. Input addressing. This is the default. Here Nix will calculate the hash of a derivation and, prior to realizing it through a build, it will ask substituters whether they know the product of the drv with the calculated hash. The product again has a hash based on its inputs; the content hash of the NAR is not just a bare content hash but a signed hash, to verify that no one tampered with the NAR during transfer.
  2. Content addressing. Here Nix will also first calculate the input-based hash for a derivation. Then it will ask substituters what the content address was. Then Nix will remap that IA drv to a CA drv, ask substituters again if they know about that CA-hashed drv, and then substitute that. Here the content hash is not signed, but reflects the bare content. Content-addressed drvs do not need to be signed, but the mapping from IA to CA needs to be trusted (a minimal example of a CA derivation follows this list).
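To make the second mode concrete, here is a minimal sketch of a floating content-addressed derivation. It requires the experimental ca-derivations feature, and the package name and build script are only illustrative:

```nix
{ pkgs ? import <nixpkgs> { } }:

# Sketch of a floating CA derivation; needs
# experimental-features = ca-derivations in nix.conf.
pkgs.runCommand "hello-ca" {
  __contentAddressed = true;     # opt this derivation into content addressing
  outputHashMode = "recursive";  # hash the whole output tree (NAR-style)
  outputHashAlgo = "sha256";
} ''
  mkdir -p $out
  echo "hello from a CA derivation" > $out/greeting
''
```

The output path is then derived from the build result itself, which is why such paths do not need a per-substituter signature, only a trusted IA->CA mapping.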

In general, from what I understood, Trustix wanted to provide the software and infrastructure for a decentralised “web of trust” for IA->CA mappings, but when I asked on Matrix how to set things up, it was pointed out to me that the project is mostly dead.

2 Likes

Well, the narinfo provides the content hash (called the “nar hash”). This is just the sha256sum of the NAR file. The narinfo can also contain a signature, which is based on the store path, the nar hash, the referenced store paths and the private key of the substituter. So I’d argue that cache.nixos.org is already a trust database. Just fetch the narinfo for the store path you are interested in, verify its signature, and then you are free to fetch the NAR file corresponding to the nar hash from whatever storage provider you want. Of course, you then also have to validate that the nar hash matches what you receive.

3 Likes

Yes, it definitely is a trust DB, regardless of considering CA substitution or not. Though a definite problem in this scenario is that this trust is coupled to substitution, and there is currently no way to ask one authority for a mapping, and another for the content.

The substitution process doesn’t allow separating these concerns, from my understanding.

In general the substitution process is rather limited from what I can tell, and really only considers substituters. It does not ask configured remote builders if they have a “product” available.

Instead, if everything goes south, remote A will happily build from scratch what builder B already has, just because of a race in the queue…

Not quite related to the topic of the thread though… Or perhaps it is, and the substitution and remote and local build should be unified into a single abstraction rather than 5?

1 Like

It doesn’t explicitly allow separating these concerns today, but it is already making two separate requests, one for the narinfo and one for the NAR. So adding support for additional “nar providers” wouldn’t be a big thing, conceptually.

In general the substitution process is rather limited from what I can tell, and really only considers substituters. It does not ask configured remote builders if they have a “product” available.

Yeah, this is not optimal, and somewhat confusing. However, the trust model Nix uses for remote builders is different from the one for substituters. When using a remote builder, it is generally the coordinating machine that handles signing of the resulting store paths. So while builds might be available on remote builders, they could be missing the required signatures. I would love to see some unification in that area…

Not quite related to the topic of the thread though… Or perhaps it is, and the substitution and remote and local build should be unified into a single abstraction rather than 5?

Yes! Internally in Nix, these things are actually already somewhat unified, with the Store abstraction. However, I want to see that abstraction also on the outside, so you could mix and match different “stores” used for building, fetching, storing. See this Nix issue: Separate stores for evaluation, building and storing the result · Issue #5025 · NixOS/nix · GitHub

9 Likes

I must say I didn’t consider narinfo files; they indeed contain all the information for a trust DB.

Now I wonder what their overhead is, currently. Suppose we had an efficient way to distribute trust DB deltas, you could join multiple providers, and each provider signed its deltas: how much local storage would you need to store these mappings?
And supposing we keep the trust DB mostly as an online thing, it looks like each narinfo is a separate HTTP request; would it make sense to allow bulk requests?