The NixOS Foundation's Call to Action: S3 Costs Require Community Support

Can you explain the reason why for those of us who weren’t present for this discussion?

2 Likes

No, I wasn’t present either. Found this message from Domen in the scrollback:

Based on my experience, I’d avoid Backblaze.
Cachix used to use it many years ago and it was too unstable.
Might have changed, but I wouldn’t risk it.

7 Likes

I host a website on IPFS (I also ran a few IPFS nodes manually); it’s been quite unstable overall and generally slow (which is the reason Dhall moved away from it as well, see Use GitHub as official source of Prelude? · Issue #162 · dhall-lang/dhall-lang · GitHub). However, this was 5 years ago; if IPFS actually works reasonably now, I’m all for this proposal. Different parts of the cache can be hosted on different systems, and I already sacrifice my SSD to the Nix gods, so I might as well contribute whatever cache I have to help with hosting costs. I wonder if anyone has tested it enough to see if it works well in practice.

5 Likes

Two more things came up w.r.t to the costs in case we do the migration:

  1. Since we use S3 Intelligent-Tiering, the cost of the migration will be significantly higher than anticipated, since there’s an extra cost for each retrieval.

  2. If we can use the Fastly endpoint for the migration, an unknown amount of cost will be saved, since we’ll get entries from the CDN cache.

10 Likes

There have been many suggestions here but it seems like feedback from those that make decisions is missing (or at least I can’t find it).

How/where can we follow the decision making process? Are there calls we can listen in to? Maybe even recordings thereof as the community is worldwide and thus not in the same timezone.
Maybe due to the spread-out nature of the community, decisions are made async? (That’d be awesome, BTW.) If so, where do we follow this?

Maybe some people here work in the industry or have some server space and bandwidth they could share, but aren’t jumping in because things seem quite opaque. It’s even possible that a decision has already been made, and somebody with a solution who is just catching up on the forum isn’t posting due to that assumption.

Where do we/you stand?

3 Likes

Hello, I have a few ideas about this situation. Check out Linode: it’s very cheap and can reduce data transfer costs.

I have a question. Do people’s /nix directories contain this information in a form that could be used to recreate parts of the cache? If so, perhaps that could be used in a scenario where there is a desire to recreate the cache with lower egress costs. There could be a tool that, given a list of wanted derivations, checks what is present locally and uploads it.
Hashes could be computed and compared to validate integrity, and eventually only what could not be retrieved from the community would need to be exported.

Cumbersome and hopefully unnecessary, but possibly an option?
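A minimal sketch of the matching step such a tool would need, assuming the wanted list and the local store listing (in practice obtainable via `nix path-info --all`) are plain lists of store paths — the paths below are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sketch: find store paths present locally that the cache wants.
# wanted.txt: paths the cache needs to recreate; local.txt: paths on this machine.
printf '%s\n' /nix/store/aaa-foo-1.0 /nix/store/bbb-bar-2.1 | sort > wanted.txt
printf '%s\n' /nix/store/bbb-bar-2.1 /nix/store/ccc-baz-0.5 | sort > local.txt

# Intersection of the two sorted lists: what this machine could contribute.
comm -12 wanted.txt local.txt > contribute.txt
cat contribute.txt   # → /nix/store/bbb-bar-2.1
```

Each resulting path could then be uploaded with something like `nix copy --to <cache-url>`, with its NAR hash checked against the signed narinfo before acceptance, as suggested above.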

2 Likes

Yes, they do, trivially. But I suspect the vast majority of paths won’t be on people’s disks (nor in Fastly’s cache), due to the heavy-tailed nature of access patterns.

9 Likes

I hope this doesn’t rub anyone up the wrong way, but a short-term solution to fundraising may be to add some perks to the donation subscription levels.

By this I’m really talking about “token” rewards like… an @nixos.dev / @nixos.org email address, a flashy “Contributor” title which appears on this very board. Essentially some kind of gamification may encourage more people to donate.

Obviously just raising more and more money isn’t viable long-term but other more promising solutions like IPFS can be the slightly longer-term solution. Right now it seems like the aim is just to keep things ticking over financially until a more permanent plan can be implemented.

6 Likes

Thanks for adding this in!
This would also be a relevant topic as we go into the topic of a general yearly Nix fundraiser and a more structured sponsorship model that can help create further incentive to a reasonable extent.

1 Like

Selling “Contributor” titles and email addresses sounds like a sure way to alienate the “code-contributors” and create a pay-to-play community.

21 Likes

What about a model where companies using the cache to obtain more than X gigabytes per month pay some amount per gigabyte beyond X, instead of us being purely reliant on donations? If that means they end up hosting the cache themselves, that increases the chance of having mirrors, which offsets the hosting costs to begin with.

4 Likes

We need an opt-in peer-to-peer cache-sharing system…
basically torrents with built-in hash checking.
Users of Nixpkgs and NixOS could opt in to share their private bandwidth and storage for distributing cached builds.
cache.nixos.org could then manage the torrent links, and garbage collection could be added to the cloud storage to clean up old data.
I am sure there are many Nix fans who will gladly share their raw bandwidth and data storage to help the NixOS foundation out.
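Part of this already exists today, albeit client-server rather than torrent-style: a machine can serve its own store over HTTP with `nix-serve`, and peers can opt in to pull from it. A sketch of the NixOS module config (the key path below is a placeholder; the signing key pair would be generated with `nix-store --generate-binary-cache-key`):

```nix
# On the sharing machine: serve the local /nix/store as a binary cache.
{
  services.nix-serve = {
    enable = true;
    port = 5000;
    # Placeholder path; signs served paths so peers can verify them.
    secretKeyFile = "/var/secrets/cache-priv-key.pem";
  };
}
```

Consuming machines would then add the peer to `nix.settings.substituters` and its public key to `nix.settings.trusted-public-keys`. What this doesn’t give you is the swarming and automatic replication of torrents, which is the part that would need new tooling.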

7 Likes

Does Trustix work? Is it moving forward? I tried asking a question in their Matrix channel a couple of weeks ago. After a couple of days, all I had was a “:man_shrugging:” reaction…

All I wanted to know was how I can use my machines as mutual substituters and/or contribute to a larger global network of trust.

6 Likes

Trustix and P2P don’t solve the “guaranteed storage” problem, i.e. the S3 replacement. We already have signatures from trusted Hydra infrastructure (and they’re small), and we already have a CDN distributing all the data worldwide and fast (for free).

9 Likes

Given the fact that one of the official solutions offered was to do garbage collection, I don’t think anyone should think cache.nixos.org’s S3 is a guaranteed storage solution.
It is for our convenience and ease of use.
If you want guaranteed storage, you should host your own copies of vital binaries.
If you are doing that, then please share with the community. (Unless they are just custom builds).
And it is my understanding (someone correct me if I’m wrong) that if push comes to shove, Nix can compile everything from source if it has to.
So as long as the source code/binaries are available on the internet somewhere, we will be fine in the long run. (I am talking about Linux kernel binaries and the like)
Finally, if a binary is in high demand and commonly used then generally speaking P2P solutions reflect that, so we avoid the Tragedy of the Commons.
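For anyone wanting to follow the “host your own copies of vital binaries” advice, a sketch of how that can be done with a local file-based binary cache (the destination path is a placeholder, and this only covers the runtime closure of the current system):

```shell
# Copy the full runtime closure of the running NixOS system into a
# local file-based binary cache at a placeholder location.
nix copy --to file:///mnt/backup-cache \
  $(nix path-info --recursive /run/current-system)
```

The resulting directory can later be used as a `file://` substituter, or served to others over HTTP.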

3 Likes

This is mostly true, but the most important and vital part of the Nixpkgs infrastructure, in my opinion, is tarballs.nixos.org, which hosts tarballs for source code that has gone missing. Before `nix build` even touches the real internet to get the source code for derivations in Nixpkgs, it talks to tarballs.nixos.org. Thankfully, as I found out via Marsnix, the sources (inputs) are only ~400 GB for a given revision of Nixpkgs, and there is tons of overlap between Nixpkgs versions. It should be easier to host this yourself, and I’d love to provide a mechanism. Imagine `services.nix-mirror = { enable = true; revisions = [ "nixos-23.05" inputs.nixpkgs.rev ]; tarballs = true; outputs = false; }`, but I have very little time to work on these things, since I’m getting paid to work on other problems instead of this one.

10 Likes

I think there might be some FODs that are in the cache but not in tarballs.nixos.org? Would be nice to check; the move-FODs-to-tarballs process was not always performed strictly regularly.

(Separately, right now Nix makes GC-pinning just the FODs your system needs impossible, and pinning all build-deps is about as impractical as it can get…)

3 Likes