Peer-to-peer binary cache RFC/working group/poll

I would like to add (unless mistaken) that the nature of the binary cache is that derivation outputs are meant to be reproducible. So anyone in the swarm can reintroduce the data on demand. We don’t need fallback storage for that.

To be doubly clear, we are talking about build output artifacts which can always be recomputed, not source inputs which are considered valuable in cases where the origin URL is no longer reachable.

Was the issue that we can’t easily distinguish valuable inputs that came from the Internet and are still in cache.nixos.org?

2 Likes

I would like to add (unless mistaken) that the nature of the binary cache is that derivation outputs are meant to be reproducible. So anyone in the swarm can reintroduce the data on demand. We don’t need fallback storage for that.

Ideally yes, but in practice a lot of derivations are not bit-for-bit reproducible. So while you may be able to reproduce a functionally identical build artifact, it would likely be a different file.

To be doubly clear, we are talking about build output artifacts which can always be recomputed, not source inputs which are considered valuable in cases where the origin URL is no longer reachable.

We are talking about both: the cache also keeps the source archives. As it frequently happens that URLs break and sources become unavailable, this is actually the invaluable part of the cache.
For example, I just adopted an unmaintained piece of software that was recovered from the binary cache: the author’s website and repository were lost.

3 Likes

Right. On that note, I wonder if it isn’t too unreasonable to also take this valuable data and offer it to https://archive.org/ and/or https://www.softwareheritage.org/, on top of distributing it with the swarm, if possible.

Doesn’t Nix support attempting to fetch from known mirrors like the Internet Archive? I think I remember packages having a mirror:// scheme.

Actually, I think I remember a Nix blog post about a collaboration with Software Heritage, from which I first found out about them. It was about reaching 100% reproducibility.

EDIT: Long-term reproducibility with Nix and Software Heritage & Expanding coverage of the archive: welcome Nixpkgs! – Software Heritage

1 Like

Right. On that note, I wonder if it isn’t too unreasonable to also take this valuable data and offer it to https://archive.org/ and/or https://www.softwareheritage.org/, on top of distributing it with the swarm, if possible.

They would probably be interested, yes, but the data needs to be presented in an appropriate way.
I think it would be quite a lot of work to select what’s worth preserving and attach useful metadata to the raw cached paths.

That’s just a shortcut for using multiple URLs or domains, which are stored in pkgs/build-support/fetchurl/mirrors.nix with per-project definitions. I think using a global mirror is possible, but it would require mapping between different naming conventions (i.e. something like domain/project-name/package-version.tar.xz).
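For context, the scheme looks like this inside a package definition; fetchurl then expands mirror://gnu/… against the “gnu” mirror list in mirrors.nix (the hash below is a placeholder):

```nix
# Minimal sketch of the existing mirror:// mechanism in nixpkgs.
{ lib, fetchurl }:

fetchurl {
  # Expanded against the "gnu" entry in pkgs/build-support/fetchurl/mirrors.nix.
  url = "mirror://gnu/hello/hello-2.12.1.tar.gz";
  # Placeholder; a real package pins the exact source hash here.
  hash = lib.fakeHash;
}
```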

This mapping could be done gradually by package maintainers wishing to help clean up the cache, if it’s a pressing matter.

Perhaps documenting the process could be great for preservation efforts.

Wasn’t someone working on an analyser to list dead sources in the cache?

1 Like

If this storage network loses the last copy of an artifact, it has failed. Rebuilding derivations that are bit-for-bit reproducible so that the hydra.nixos.org signature still validates is a neat trick, and might be used in a sophisticated way to set the redundancy target of different artifacts (either as fun tuning and tinkering after everything else is working, or as a response to volunteer capacity being too low to support the full archive at full redundancy), but I think we should first attempt to find/assemble a mechanism that Just Works as a simple storage/serving service.

I was kind of hoping that we could make a thing out of:

  1. A p2p daemon that volunteer cache providers run that just coordinates IPFS pins.
    • It would have some signal of trustworthiness of cache providers (maybe just participation longevity) that it uses for shard balancing, to avoid the scenario where N brand new transient nodes come online, are given responsibility for a new artifact, and all N of them go away, losing that artifact.
  2. Cache providers then just run normal IPFS that does all the content acquisition, storage, and serving, with its pins managed by #1.
  3. Nix is extended to fetch from multiple sources in parallel (see the configuration sketch after this list).
    • I.e., attempt to fetch a thing, and if it hasn’t made reasonable progress in 2 seconds (configurable), without giving up on the first source, also begin fetching the same artifact from a second source. And after 5 seconds, a third source, etc. As soon as the fetch from any source finishes, the other concurrent requests are cancelled.
    • Bonus if this can be done at a fixed-size-block level rather than with entire NARs.
  4. (And it sounds like maybe an IPFS lookup caching service if normal IPFS DHT lookups are too slow?)
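For what it’s worth, point 3 already has a partial foundation: Nix can be pointed at several substituters today, it just consults them in priority order instead of racing them. A client opting into a swarm gateway might look roughly like this on NixOS (the p2p gateway URL is hypothetical; the cache.nixos.org key is the well-known default):

```nix
{
  nix.settings = {
    # Currently consulted in order; the proposal above would race them instead.
    substituters = [
      "https://cache.nixos.org"
      "http://p2p-gateway.local:8080"  # hypothetical local IPFS/p2p gateway
    ];
    # NARs are only accepted when signed by a trusted key,
    # regardless of which source actually served the bytes.
    trusted-public-keys = [
      "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
    ];
  };
}
```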

I was also hoping the IPFS community would just have a component that does #1, either ready to go or in development, but I don’t see one. I bet the IPFS community would love to have such a component. Maybe we could build it together.

Edit: There is an IPFS component for #1: ipfs-cluster! Specifically, the collaborative clusters feature.

So maybe the only new functionality we need to implement is IPFS fetching and parallel fetching in nix? See also Obsidian Systems is excited to bring IPFS support to Nix

3 Likes

Could Tahoe-LAFS be suitable? It is designed to handle sharding, distribution, redundancy and more, although it probably does not scale to a world-wide network (managing redundant shard distribution for resiliency and performance). That problem might be solvable by using region-based mirrors of Tahoe filesystems, which would provide even more resilient redundancy.

1 Like

Tahoe-LAFS is designed with a relatively stable set of nodes in mind. We have another ongoing effort to self-host the cache on bare metal and we might use it there.

That’s a bit of the problem regarding availability on a full P2P network; if you can’t reason about the probability of the nodes going down, the only way to compensate is to increase the number of duplicates, which then increases the total cost of storage proportionally.

Where P2P could truly shine is by becoming part of the distribution network. Even if the discovery only happens on the local network, it could help organizations and clusters to both get the NAR files faster and also save up on Internet bandwidth. This is something Microsoft is also using to distribute Windows updates.

The scheme doesn’t need to be too fancy either; have hosts discover each other through rendez-vous on the local network, and then query all of the hosts with a set of requested hashes. It’s boring, and probably quite effective.

8 Likes

I only just noticed this thread. Here are my two cents, in reply to another post.

1 Like

TLDR: I fully believe in a hybrid validator/peer-to-peer solution.
cache.nixos.org is the validator that keeps copies of the official build hashes.
The peer-to-peer system dynamically supplies the storage and bandwidth.

In the end we need a real-world experiment to start doing this and collect data on how well the system runs.

Seeders easily opt into the swarm thanks to a new services.nix-serve-p2p.enable: bool = false Nix option enabling a service to join the swarm. This could be added as a comment in the generated config to raise awareness and help it more easily gain traction.

I think first we should start with an experimental feature: services.experimental.nix-serve-p2p.enable: bool = false
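For concreteness, here is a sketch of what opting in as a seeder could look like, assuming such a module existed; none of these options exist today, and the names and defaults are purely illustrative:

```nix
{
  # Hypothetical module proposed above; nothing here exists in nixpkgs today.
  services.nix-serve-p2p = {
    enable = true;
    maxStorage = "50G";  # illustrative: cap on local disk donated to the swarm
    onlySigned = true;   # illustrative: only re-seed paths signed by cache.nixos.org
  };
}
```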

Real-world testing can help us focus our efforts in the correct direction.

In the short term, cache.nixos.org will have to keep holding on to all its data until we can see how the peer-to-peer system really works in the real world. A slow transition is most likely the only way forward. But it is nice to see we have 12 months to breathe easy:
NixOS S3 Short Term Resolution!

I believe there is enough community support to start the experiment.
We can easily talk about it, but a lot of issues raised might not even be much of a problem when the rubber meets the road…

6 Likes

I just had a thought regarding the storage guarantee.
Perhaps cache.nixos.org should monitor the health of a given package in the peer-to-peer system. A dynamic garbage collection model.

If there are fewer than X nodes holding a given package, then cache.nixos.org must retain a full copy.
But if the node count for a given package is really high, then cache.nixos.org can garbage-collect that package from its own storage, as it is confident that it is fully available in the wild.
Then, if the node count starts to drop, it can re-acquire that package for archiving purposes.

Yes, there is a small risk of losing a package for good, but that requires losing everything: the cache, the source URL going down, the original source code being lost, and no one having an old copy to reupload to a new URL. Such a failure cannot really be our fault. This is just life. Sometimes things get lost with no way of recovering them. In this modern age where everything “needs to be preserved for the historical record”, we can at times end up sacrificing the future for the past. Not a good way to live. Not a good way to run a cache (a temporary file store).
We are not the archive of the world.

3 Likes

Regarding IPFS, I’d like to link some info about the last experiment I’m aware of:
https://github.com/NixIPFS/nixipfs-scripts/issues/11#issuecomment-1520711673

6 Likes

TLDR: I fully believe in a hybrid validator/peer-to-peer solution.
cache.nixos.org is the validator that keeps copies of the official build hashes.
The peer-to-peer system dynamically supplies the storage and bandwidth.

I like this model very much. It makes sense to not try to solve the problems around distributing storage and trust at the same time. If we keep the “centralized trust” model of cache.nixos.org, much of the infrastructure (i.e. Hydra) can remain basically the same. Once the p2p distribution of NAR files (based on their hashes) is working, nothing is stopping us from also working on distributing the trust in some way (like CA-derivations, trusted builders, Trustix, etc).

9 Likes

I’d like to offer up the idea of a Trust DB here. Right now the mapping from input hash to NAR file is implicit in the NixOS cache.

However, by simply publishing a mapping of input hash → content hash, a user can choose to trust a certain publisher, and then it doesn’t matter what cache system you use; the hash can be verified locally.

This is orthogonal to any other solutions. Having the outputs be CA is nice but not required. You don’t need to know what the input hash means, what attribute makes it, etc.

3 Likes

However, by simply publishing a mapping of input hash → content hash, a user can choose to trust a certain publisher, and then it doesn’t matter what cache system you use; the hash can be verified locally.

Is this not just the narinfo files? When substituting, Nix already works in two phases. First it fetches the narinfo file for a store path from a substituter (cache.nixos.org). The narinfo contains the content hash of the NAR and the URL where the NAR can be found. In the second phase, Nix fetches the NAR file itself. The URL is always assumed to be a sub-path of the substituter’s domain, but you could imagine just querying some p2p network for the content hash in that second phase instead.
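As a concrete illustration, a narinfo served by cache.nixos.org looks roughly like this (all hashes, sizes and the signature below are shortened, illustrative placeholders):

```
StorePath: /nix/store/hhhh...hhhh-hello-2.12.1
URL: nar/ffff...ffff.nar.xz
Compression: xz
FileHash: sha256:ffff...ffff
FileSize: 50264
NarHash: sha256:nnnn...nnnn
NarSize: 226560
References: hhhh...hhhh-hello-2.12.1 gggg...gggg-glibc-2.37-8
Sig: cache.nixos.org-1:base64signature...
```

Only the URL is interpreted relative to the substituter; everything needed for verification (the NarHash and the Sig) travels with the metadata, which is what makes plugging in other NAR sources conceptually straightforward.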

3 Likes

Nix currently has two ways to address a single “thing”.

  1. Input addressing. This is the default. Here Nix will calculate the hash of a derivation and, prior to realizing it through a build, it will ask substituters whether they know the product of the drv with the calculated hash. The product again has a hash based on its inputs; the content hash of the NAR is not just a bare content hash but a signed hash, to verify that no one tampered with the NAR during transfer.
  2. Content addressing. Here Nix will also first calculate the input-based hash for a derivation. Then it will ask substituters what the content address was. Then Nix will remap that IA drv to a CA drv, ask substituters again if they know about that CA-hashed drv, and then substitute that. Here the content hash is not signed, but reflects the bare content. Content-addressed drvs do not need to be signed, but the mapping from IA to CA needs to be trusted (a minimal example of a CA derivation follows this list).
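To make the second mode concrete, here is a minimal sketch of a floating content-addressed derivation. It requires the experimental ca-derivations feature, and the package name and build script are only illustrative:

```nix
{ pkgs ? import <nixpkgs> { } }:

# Sketch of a floating CA derivation; needs
# experimental-features = ca-derivations in nix.conf.
pkgs.runCommand "hello-ca" {
  __contentAddressed = true;     # opt this derivation into content addressing
  outputHashMode = "recursive";  # hash the whole output tree (NAR-style)
  outputHashAlgo = "sha256";
} ''
  mkdir -p $out
  echo "hello from a CA derivation" > $out/greeting
''
```

The output path is then derived from the build result itself, which is why such paths do not need a per-substituter signature, only a trusted IA->CA mapping.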

In general, from what I understood, Trustix wanted to provide the software and infrastructure for a decentralised “web of trust” for IA->CA mappings, but when I asked on Matrix how to set things up, it was pointed out to me that the project is mostly dead.

2 Likes

Well, the narinfo provides the content hash (called the “nar hash”). This is just the sha256sum of the NAR file. The narinfo can also contain a signature, which is based on the store path, the nar hash, the referenced store paths and the private key of the substituter. So I’d argue that cache.nixos.org is already a trust database. Just fetch the narinfo for the store path you are interested in, verify its signature, and then you are free to fetch the NAR file corresponding to the nar hash from whatever storage provider you want. Of course, you then also have to validate that the nar hash matches what you receive.

3 Likes

Yes, it definitely is a trust DB, regardless of considering CA substitution or not. Though a definite problem in this scenario is that this trust is coupled to substitution, and there is currently no way to ask one authority for a mapping, and another for the content.

The substitution process doesn’t allow separating these concerns, from my understanding.

In general the substitution process is rather limited from what I can tell, and really only considers substituters. It does not ask configured remote builders if they have a “product” available.

Instead, if everything goes south, remote A will happily build from scratch what builder B already has, just because of a race in the queue…

Not quite related to the topic of the thread though… Or perhaps it is, and the substitution and remote and local build should be unified into a single abstraction rather than 5?

1 Like

It doesn’t explicitly allow separating these concerns today, but it is already making two separate requests, one for the narinfo and one for the NAR. So adding support for additional “nar providers” wouldn’t be a big thing, conceptually.

In general the substitution process is rather limited from what I can tell, and really only considers substituters. It does not ask configured remote builders if they have a “product” available.

Yeah, this is not optimal, and somewhat confusing. However, the trust model Nix uses for remote builders is different from the one for substituters. When using a remote builder, it is generally the coordinating machine that handles signing of the resulting store paths. So while builds might be available on remote builders, they could be missing the required signatures. I would love to see some unification in that area…

Not quite related to the topic of the thread though… Or perhaps it is, and the substitution and remote and local build should be unified into a single abstraction rather than 5?

Yes! Internally in Nix, these things are actually already somewhat unified, with the Store abstraction. However, I want to see that abstraction also on the outside, so you could mix and match different “stores” used for building, fetching, storing. See this Nix issue: Separate stores for evaluation, building and storing the result · Issue #5025 · NixOS/nix · GitHub

9 Likes

I must say I didn’t consider narinfo files; they indeed contain all the information for a trust DB.

Now I wonder what their overhead is, currently. Suppose we had an efficient way to distribute trust DB deltas, you could join multiple providers, and each provider signed its deltas: how much local storage would you need to store these mappings?
And supposing we keep the trust DB mostly as an online thing, it looks like each narinfo is a separate HTTP request; would it make sense to allow bulk requests?