The NixOS Foundation's Call to Action: S3 Costs Require Community Support

Wow that really seems like a great solution.

We will cover egress and migration fees for customers migrating over 10 TB of data from US, Canada, and Europe regions, and storing it with us for at least 12 months.

So that’s already 32k$ cheaper than most alternative egress strategies.

And by Backblaze’s own calculation, costs could be reduced from 9000$/month to 2000$/month.

Though if we can get a Cloudflare OSS sponsorship, the egress costs might amortize after 16 months, if “sponsorship” means storage is 100% free.
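To make the amortization reasoning above concrete, here is a rough break-even sketch. The monthly figures are the ones quoted above; the one-time migration cost is a made-up placeholder (the real egress bill is not stated in this post), and the function name is only for illustration.

```python
import math


def break_even_months(one_time_cost: float, old_monthly: float, new_monthly: float) -> int:
    """Whole months until a one-time migration cost is paid back by monthly savings."""
    saving = old_monthly - new_monthly
    if saving <= 0:
        raise ValueError("no monthly saving, so the migration never pays off")
    return math.ceil(one_time_cost / saving)


OLD_MONTHLY = 9000   # USD/month, current cost quoted above
B2_MONTHLY = 2000    # USD/month, Backblaze's own estimate quoted above
SPONSORED_MONTHLY = 0  # assumption: "sponsorship" means storage is 100% free

# Hypothetical placeholder, NOT a real quote for the egress bill.
HYPOTHETICAL_ONE_TIME_COST = 70_000

months_paid = break_even_months(HYPOTHETICAL_ONE_TIME_COST, OLD_MONTHLY, B2_MONTHLY)
months_free = break_even_months(HYPOTHETICAL_ONE_TIME_COST, OLD_MONTHLY, SPONSORED_MONTHLY)
print(months_paid, months_free)
```

Plugging in the actual egress estimate instead of the placeholder shows how quickly each option pays for itself.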

7 Likes

I’m with @vs49688: we probably want to hold all source archives and patches, but in general, we will have to be very careful when doing this. I’m sure there are other ways the cache has cached derivations from nixpkgs which cannot be rebuilt from scratch. (It took a long time before people noticed that fetchZip/postBuild changes broke many fonts.)

If you set up a VPC with an S3 Gateway Endpoint (free), you then get free transfers for S3<->VPC. So you could shove a bunch of machines in there (which you still pay for, of course) to do this without paying to egress the entire contents of the bucket.

Every time I’ve looked at S3 Intelligent Tiering in my own work, the $0.0025 per 1,000 objects automation fee makes me nervous. According to @edolstra, there are 667M objects in the cache.nixos.org bucket, so you’re paying $1667.50/month in automation fees, and 3/4 of the bucket is already in Infrequent-Access tier by some mechanism or other. So Intelligent Tiering needs to move a lot of stuff to smarter storage classes to come out ahead (or we only turn it on for large NARs, or something).
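The fee arithmetic above is easy to check with a back-of-the-envelope sketch. The function name is made up for illustration; the rate and object count are the ones quoted in this post, and real AWS pricing may differ by region or change over time.

```python
def monitoring_fee_per_month(object_count: int, fee_per_1000_objects: float) -> float:
    """Monthly Intelligent-Tiering automation fee in USD for a given object count."""
    return object_count / 1000 * fee_per_1000_objects


OBJECTS_IN_BUCKET = 667_000_000  # figure attributed to @edolstra above
FEE_PER_1000 = 0.0025            # USD per 1,000 objects, as quoted above

fee = monitoring_fee_per_month(OBJECTS_IN_BUCKET, FEE_PER_1000)
print(f"${fee:,.2f}/month")
```

So Intelligent Tiering would have to save more than that amount each month in storage charges just to break even.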

3 Likes

For this to stay unnoticed, you need fixed-output derivations anyway, though (and I don’t believe we have any better alternative for finding all-that-is-fetched than FOD).

2 Likes

I’m aware of some mirrors of Hydra, which is very helpful. It should be easier to set up a mirror, and it should actively be promoted. There are more concerns than just storage, such as DNS blocks, which affect reproducibility (another benefit of using something like BitTorrent/IPFS as a substitution mechanism).

https://github.com/NixOS/nixpkgs/issues/32659

https://nix-mirror.freetls.fastly.net/

2 Likes

This seems like the most reasonable option, and it has free egress bandwidth as well, a huge win.

2 Likes

I’m well versed with CDNs, but shouldn’t it be enough to use nginx proxy_pass https://cache.nixos.org with proxy_cache of some size?

1 Like

Pretty much, here’s an example - https://github.com/nh2/nix-binary-cache-proxy/blob/b144bad7e95fc78ab50b2230df4920938899dab0/nginx-binary-cache-proxy.nix#LL44C17-L44C17

2 Likes

@domenkozar mentioned on Matrix that he had to migrate Cachix away from B2, so I can understand some hesitancy to move cache.nixos.org onto it.

5 Likes

Can you explain the reason why for those of us who weren’t present for this discussion?

2 Likes

No, I wasn’t present either. Found this message from Domen in the scrollback:

Based on my experience, I’d avoid Backblaze.
Cachix used to use it many years ago and it was too unstable.
Might have changed, but I wouldn’t risk it.

7 Likes

I host a website on IPFS (I also ran a few IPFS nodes manually), and it’s been quite unstable overall and generally slow (which is the reason Dhall moved away from it as well; see “Use GitHub as official source of Prelude?”, Issue #162 in dhall-lang/dhall-lang on GitHub). However, this was 5 years ago; if IPFS actually works reasonably now, I’m all for this proposal. Different parts of the cache can be hosted on different systems, and I already sacrifice my SSD to the Nix gods, so I might as well contribute whatever cache I have to help with hosting costs. I wonder if anyone has tested it enough to see if it works well in practice.

5 Likes

Two more things came up w.r.t. the costs in case we do the migration:

  1. Since we use S3 Intelligent Tiering, the cost for migration will be significantly higher than anticipated, since there’s an extra cost for each retrieval.

  2. If we can use the Fastly endpoint for the migration, an unknown amount of cost is saved, since we’ll get some entries from the cache.

10 Likes

There have been many suggestions here but it seems like feedback from those that make decisions is missing (or at least I can’t find it).

How/where can we follow the decision making process? Are there calls we can listen in to? Maybe even recordings thereof as the community is worldwide and thus not in the same timezone.
Maybe, due to the spread-out nature of the community, decisions are made async? (That’d be awesome BTW.) If so, where do we follow this?

Maybe some people here work in the industry or have some server space and bandwidth they could share, but aren’t jumping in as things seem quite opaque. It’s even possible that a decision has already been made, so somebody with a solution and just getting back on the forum isn’t posting due to that assumption.

Where do we/you stand?

3 Likes
6 Likes

Hello, I have a few ideas about this situation. Check out Linode: it is very cheap and can reduce data transfer costs.

I have a question. Do people’s /nix directories also contain this information in a useful form to recreate parts of the cache? If so, perhaps that could be used in a scenario where there is a desire to recreate the cache with lower egress costs? There could be a tool that, given a list of wanted derivations, checks what is present and uploads that.
Hashes could be computed and compared to validate the integrity, and eventually only what could not be retrieved from the community would need to be exported.

Cumbersome and hopefully unnecessary, but possibly an option?
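A very rough sketch of the "check what is present" step might look like the following. Everything here is hypothetical: the function name, the idea that the wanted list is a list of store path names, and the fake store used for the demo. A real tool would also have to compare hashes against the cache's signed metadata before trusting anything, which is not shown.

```python
import os
import tempfile


def split_by_availability(wanted, store_dir="/nix/store"):
    """Partition wanted store path names into (present, missing) for a local store."""
    present, missing = [], []
    for name in wanted:
        if os.path.exists(os.path.join(store_dir, name)):
            present.append(name)
        else:
            missing.append(name)
    return present, missing


# Demo against a throwaway directory so it runs without Nix installed.
# The path names below are invented, not real store paths.
with tempfile.TemporaryDirectory() as fake_store:
    os.mkdir(os.path.join(fake_store, "aaaa-hello-2.12"))
    present, missing = split_by_availability(
        ["aaaa-hello-2.12", "bbbb-curl-8.0"], store_dir=fake_store
    )

print("could be contributed locally:", present)
print("would still need exporting:", missing)
```

Only the paths in the second list would then have to be pulled from the bucket.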

2 Likes

Yes, they do, trivially. But I suspect that the vast majority of the paths won’t be on people’s disks (nor in Fastly’s), due to the heavy-tailed nature of the distribution.

9 Likes

I hope this doesn’t rub anyone up the wrong way but a short term solution to fundraising may be to add some perks to the donation subscription levels.

By this I’m really talking about “token” rewards like… an @nixos.dev / @nixos.org email address, a flashy “Contributor” title which appears on this very board. Essentially some kind of gamification may encourage more people to donate.

Obviously just raising more and more money isn’t viable long-term but other more promising solutions like IPFS can be the slightly longer-term solution. Right now it seems like the aim is just to keep things ticking over financially until a more permanent plan can be implemented.

6 Likes

Thanks for adding this in!
This would also be relevant as we start discussing a general yearly Nix fundraiser and a more structured sponsorship model that can help create further incentive to a reasonable extent.

1 Like

Selling “Contributor” titles and email addresses sounds like a sure way to alienate the “code-contributors” and create a pay-to-play community.

21 Likes