NixOS S3 Long Term Resolution - Phase 1

Yeah, I have a solution…

I made a prototype that distributes NARs over a P2P, BitTorrent-like protocol, but it needs more work to become production-ready. I believe changes to the Nix store layer may help.

The proposal was submitted to NGI; however, at the time it was not successful.

So here we are. I’m a firm believer that the cache can be securely distributed by its users, for its users who want to donate bandwidth to Nix/NixOS, reducing the reliance on centralised CDNs.

I’m a believer in IPFS for ‘certain’ types of data… but the size of the DHT will never scale for locating and publishing content at this volume, unless the project can somehow break the laws of physics and computer science in one fell swoop.

Unfortunately, I’ve heard NGI is no longer directly funding Nix projects, so here we are, a bit dead in the water with the idea.

There are some other internet-architecture changes needed for P2P to work; one of them is IPv6 adoption, so that computers can connect end-to-end without NAT.

By the looks of it, this will never happen… If the funding had come in all those years ago, we would be in a much stronger position to do it. That was 3 years ago.

There are some changes to the store layer that need to happen: the ability to keep the original NAR while ‘using’ it would reduce the storage requirements on peers by a large margin, since a peer could then seed the exact bytes it downloaded instead of keeping a duplicate copy alongside the unpacked store path…

Interesting stuff!

3 Likes

I’m a firm believer that the cache can be securely distributed by its users, for its users who want to donate bandwidth to Nix/NixOS, reducing the reliance on centralised CDNs.

If this were to work, it would surely be awesome!

In the meantime, however:

Seeing this thread and the underlying problem now for the first time, I might actually consider donating some money for this! I am not that wealthy, so it won’t be much. But still, I use this service regularly and it is very reliable. Sadly, I believe many users might not even be aware of this.

Edit: I also have to say that, given the sheer size of the opening post and my tendency to just glance at it, I was lucky to even notice that I could donate money for this. Maybe it would be a good idea to make people more aware of this?

5 Likes

I am unsure if it has been mentioned before (I could not find it with a skim of Discourse and Matrix), but wouldn’t Sippy (beta) · Cloudflare R2 docs solve the bandwidth cost of migrating off S3 to R2, if that is still a migration path you are considering?

Latency would be a bit higher for the first read of each object, but popular packages should quickly end up cached, since Sippy copies an object from S3 into R2 the first time it is requested and serves it from R2 thereafter.

2 Likes

This may just be robbing Peter to pay Paul. You’re still reliant on, and at the whim and control of, third-party CDNs.

The whole object of the exercise, as I see it, is to remove third-party distribution mechanisms, or keep them to an absolute minimum.

I’d rather the NixOS Foundation buy actual skin in the game with the donations they receive, not bandwidth or tin, rather than give the money to companies with very dubious ethics.

5 Likes

Have you considered enabling S3 Intelligent-Tiering? I’ve used it myself for our internal Nix cache to great success; it’s easy to set up and it lowers storage costs for objects that are accessed infrequently.

1 Like

I would appreciate knowing how the testing of this software went.
If you lacked the time to do a deep test, why not add a Nix config option like nix.experimental.p2p-caching.enable = true; and let the community help test an official P2P solution? If that software doesn’t work, you could swap the p2p-caching option’s backend for a different solution (e.g. p2p-caching.package = attic;).
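
As a purely hypothetical illustration (none of these options exist today; the names come straight from the suggestion above, and the backend attribute is a placeholder), a tester’s configuration could look like:

```nix
{ pkgs, ... }:
{
  # Hypothetical experimental module; nothing here exists in nixpkgs yet.
  nix.experimental.p2p-caching = {
    enable = true;
    # Swappable backend, so a failed candidate can be replaced in place.
    # "attic" is the placeholder named above; the real attribute may differ.
    package = pkgs.attic;
  };
}
```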

I believe that if you wish to explore a P2P option, then you should tell the community to focus on testing one particular solution. We just need the Foundation to tell us which solution to test first.

This way you can continue to focus on the other solutions and let the community work on the P2P one.
But we do need @ron or someone with authority to tell the community which P2P solution you would like us to focus on testing, and to tell us what and how we should report bugs and the like.

One of the main issues with the P2P effort right now is that we lack focus: there are many different ideas about how to do it. But for a P2P solution to have any chance of working, we need leadership to tell us what to focus on testing.

2 Likes

Hi there. Speaking for myself (i.e. not the Foundation), as a person who coordinated the efforts on the cache: I don’t think the Foundation can offer focus. This is really up to the P2P experts to build a plan, build a solution, and showcase it at interesting scales. That takes time, effort, energy, and sometimes even resources.

I think a lot of folks did say that P2P solutions are not really realistic for the time being at nixpkgs scale for storage. For distribution they could work, but as you can see, this is a storage problem, not a distribution one.

Otherwise, I would recommend starting with Tahoe-LAFS, building a 100 TB cache (or more) with it, and seeing how that goes w.r.t. all the classical properties a cache may require.

5 Likes

I understand your points and can see that you understand the problem very well.
However, my main point is that the only real way to test a P2P solution properly is to have an official test case.
If we have multiple P2P solutions all being tested in a fragmented way, then even when one shows real promise, people will still say they are not sure it can scale properly.

I propose that the Foundation sit down together for just an hour, pick the P2P option they believe has the best chance, and add a Nix service option like nix.experimental.p2p-cache.enable; then we can test that solution at scale right now. If it breaks, we will know exactly why, and we can either say once and for all that it just can’t scale, or find the main cause and solve it, either by switching to a different P2P solution or by patching the existing one.

Otherwise, I believe we will be forever stuck in small, isolated testing environments, forever waiting for the “will it scale?” question to be answered.

If P2P is going to be a win, the Foundation will have to pick a winner at some point anyway.

Based on the amount of community interest in the original The NixOS Foundation's Call to Action: S3 Costs Require Community Support - #171 by Federico post, I believe the community really wants a chance to help with more than just giving money or finding the cheapest hosting solution. Now, while interest in this issue is still high, is the best time to test a P2P solution; if we wait a few years, interest could drop, and with it the desire to work on a community-run solution.

Also, the Nix community is full of really smart people who can handle working through an experimental caching test.

Nix is mostly used by devs who are very invested in this ecosystem and are more than willing to work on making it last for 100 years.

1 Like

I propose that the Foundation sit down together for just an hour, pick the P2P option they believe has the best chance, and add a Nix service option like nix.experimental.p2p-cache.enable; then we can test that solution at scale right now.

I have to admit that does sound like a nice and pragmatic way forward. I have no idea about the underlying technology though. I’d love to try whatever possible solution might be found!

2 Likes

The Foundation does not make technical decisions, so you can sit down together for an hour, but I fear this may not lead to the outcome you are looking for.

All the data about the scale and whatnot is public, and someone in the P2P interest group has to drive an effort to build a proof of concept that can answer a bunch of questions.

Adding a nix.experimental.p2p-cache.enable option is something you can already do today, in nixpkgs or in anything out of tree. But someone has to build it.
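
For what it is worth, here is a minimal sketch of such an out-of-tree module, assuming a yet-to-be-written daemon; every name below is made up for illustration:

```nix
# p2p-cache.nix: hypothetical out-of-tree NixOS module skeleton.
{ config, lib, ... }:
let
  cfg = config.nix.experimental.p2p-cache;
in
{
  options.nix.experimental.p2p-cache = {
    enable = lib.mkEnableOption "an experimental P2P binary cache";
    package = lib.mkOption {
      type = lib.types.package;
      description = "The P2P cache implementation under test.";
    };
  };

  config = lib.mkIf cfg.enable {
    # The daemon does not exist yet; this only shows where it would plug in.
    systemd.services.p2p-cache = {
      wantedBy = [ "multi-user.target" ];
      serviceConfig.ExecStart = "${cfg.package}/bin/p2p-cached";
    };
  };
}
```

Anyone can pull this into their own machine with imports = [ ./p2p-cache.nix ]; no upstream approval is needed to experiment.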

For this, the P2P working group needs to come forward with a working example, given the public data, again.

Definitely, but for this the P2P working group needs to work on an implementation, and that has not been happening so far. No matter how much interest there is in X, if no group can build it, we will have trouble having X.

You do not need an official test case; what you seem to be missing is that no amount of officialness can build the test case in your stead. Build the technology and the code, send it to nixpkgs, convince a working group to join the experimentation, collect the data, and publish it: that is what I would do if I were interested in P2P (which I am not).

6 Likes

What’s your plan to handle data persistence? Someone has to keep all this expensive storage available.

In my opinion, P2P would only help for content distribution, but there is already a deal with a CDN provider and it costs almost nothing for the NixOS project :thinking:

3 Likes

It would be nice to have the ability to “mirror” the nixpkgs cache, similar to other Linux distributions. For example, a tool that copies all NARs for a given nixpkgs commit, or something like that.
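
A rough version of this is already possible with nix copy, pointing --from at https://cache.nixos.org and --to at a local binary cache; the copied NARs keep their original signatures. A hedged sketch of the consumer side, assuming the mirror lives at /srv/nix-mirror:

```nix
{
  # Consult the local mirror first, then fall back to the official cache.
  nix.settings.substituters = [
    "file:///srv/nix-mirror"
    "https://cache.nixos.org"
  ];
}
```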

1 Like

@Solene Firstly, I have discussed my views on the benefits of a global P2P cache at length in the previous posts.

But to summarize:

First, we need to separate caching from archiving, as the two have different optimization requirements. Archives require large storage but are not so concerned with bandwidth; caches care more about bandwidth than storage, as they are updated and pruned regularly. For example, a package used by fewer than 10 people should probably not be in the cache: those 10 people can build it on their own machines, or host a local cache if it is needed for a large set of machines. In such a case there should be a network admin running the show anyway…

It is my understanding that the only thing that needs data persistence is the tarball archive for software whose source code is not freely available to build from. All other data can be rebuilt if it is missing from the P2P network, if and when needed.

Yes, I know that the data persistence problem has been solved for now via the deal with a CDN.
However, bandwidth costs are also an ongoing concern if the Nix community continues to grow. My concern is that the current model merges the cache system with the archive system: a few people have been asking me how P2P solves data persistence, and I feel they are missing the point I am trying to make. P2P solves the cache (distribution) side of the problem, not the archive (persistence) side. But if Nix is reproducible (I know there are limits to this), then persistence is not the main problem in the first place; the cache (distribution) side is the main ongoing concern.

If P2P is successful, then Hydra can be the system that seeds the P2P network as the “trusted node”, and once a large number of nodes host a given piece of data, the CDN copy can be pruned without fear of losing cache performance (the P2P network supplies the bandwidth). This pruning can happen organically instead of out of necessity, rather than this crisis arising again in the future and requiring a serious one-off manual prune, which could cause real headaches for many people.
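
Worth noting: Nix’s binary-cache signatures already provide the trust model this needs, since a NAR fetched from an untrusted peer is only accepted if it carries a valid signature from a trusted key. A hedged client-side sketch, assuming a hypothetical local P2P daemon that speaks the standard binary-cache HTTP interface on port 8080:

```nix
{
  nix.settings = {
    # Ask the (hypothetical) P2P daemon first, cache.nixos.org second.
    substituters = [ "http://127.0.0.1:8080" "https://cache.nixos.org" ];
    # Trust is anchored in Hydra's signing key, not in the transport:
    # a NAR from any peer must verify against this key to be accepted.
    trusted-public-keys = [ "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY=" ];
  };
}
```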

I know there are people who are not keen on P2P for various reasons, but one of the main reasons I am so strongly for it is that I believe it will make the Nix ecosystem future-proof and build redundancy into the system as a whole. (I am thinking 100 years future-proof here.)

@RaitoBezarius I see that you said someone just needs to build it and add nix.experimental.p2p-cache.enable? But who approves that, if not the Foundation? I seem to be missing something here…

If that is the case, then from what I have seen there are already a couple of working packages in nixpkgs. If they can be put under a main Nix config option, we can all test together globally and work out the bugs as we find them. P2P requires everyone to be on the same network and protocol to concentrate the collective bandwidth; otherwise it won’t work at scale.

There is a chance that (10 years in the future) a successful P2P network could replace most of what is on the CDN, saving someone a serious amount of money, which could then be used to pay more Nix devs at the Foundation. Money spent on talent is better than money spent on infrastructure, IMHO.

Thanks for the comments I appreciate them very much.

2 Likes

Why do you need approval to add an experimental option to nixpkgs? Nixpkgs is open for anyone to contribute to, and there’s no need to get buy-in beyond the code owners/maintainers, etc.

The Foundation is clearly not one of them, FYI.

4 Likes

Ok, thanks for that. So we just need a PR then? That clears things up for me. Thank you very much.

Correct, but again, I strongly recommend starting in a namespace other than nix.experimental; I am not sure the Nix code owners will appreciate that usage. Once the solution is proven interesting and can be adopted, it can move to a more official namespace, of course. :slight_smile:

Again, YMMV because I am not a Nix developer.

3 Likes

Thanks for the tip. The namespace is obviously a point to discuss with the Nix devs. (I was just using it as a placeholder.) :slight_smile:

1 Like

Amazon just announced that, in preparation for the European Data Act, they will waive fees for moving data out of AWS. Does this impact our decisions and plans? It sounds like a huge game changer.

11 Likes

This may be shortsighted, but I would wager that most users aren’t benefiting from a good deal of the historic caching. Holding on to so much history is most likely to benefit businesses maintaining projects with pinned dependencies.

If things are not sustainable, I do not see an issue with culling some of the old cache. Nothing stops those users from maintaining their own cache if building from source is a problem for them.

I am personally fine with building from source more often. I have a number of options for caching my own projects’ dependencies if need be.

While I agree that P2P would be a good solution, attempts such as Trustix seem to be somewhat of a WIP, and it sounds like this needs to be solved with fairly immediate action.

Additionally, S3 seems massively overpriced for what is essentially a gated FTP-like API. It doesn’t seem to align with Nix’s values. Perhaps on-prem storage backed by a FOSS S3-compatible server like MinIO could be an alternative for a smaller cache?
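
For reference, nixpkgs already ships a MinIO module, so a single such node is a small amount of configuration. A hedged sketch (option names should be double-checked against the current services.minio module):

```nix
{
  services.minio = {
    enable = true;
    # Directory (or directories) backing the S3-compatible object store.
    dataDir = [ "/var/lib/minio/data" ];
    # File providing MINIO_ROOT_USER and MINIO_ROOT_PASSWORD.
    rootCredentialsFile = "/etc/minio/credentials";
  };
}
```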

The community can contribute to and experiment with P2P solutions for caching niche dependencies. Not building from source is a luxury, and the money could be spent incentivizing core contributors to improve the ecosystem further.

8 Likes

I also want to mention that Wasabi (Cloud Storage Pricing: Wasabi vs Azure, Google & AWS S3 Pricing) has much better pricing for S3-compatible storage.

1 Like