The NixOS Foundation's Call to Action: S3 Costs Require Community Support

Just for curiosity's sake, how many paths are there?

I don’t know if it’s just me, but occasionally (almost often) when I am writing derivations, the build gets stuck querying for the narinfo of the derivation I just built. It only gets stuck when it checks for the narinfo of a path that is not in the cache.
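For context, this is roughly the lookup that happens per output path; a minimal sketch (the cache URL is the public one, the store hash is a placeholder) of timing how long a miss takes:

```python
# Minimal sketch of the narinfo lookup Nix performs against a binary cache,
# timing how long a miss (HTTP 404) takes. The hash below is a placeholder;
# real lookups use the 32-character hash prefix of /nix/store/<hash>-<name>.
import time
import urllib.error
import urllib.request

def time_narinfo_lookup(cache_url: str, store_hash: str) -> tuple[bool, float]:
    """Return (found, seconds) for a single narinfo HEAD request."""
    req = urllib.request.Request(f"{cache_url}/{store_hash}.narinfo", method="HEAD")
    start = time.monotonic()
    try:
        with urllib.request.urlopen(req, timeout=10):
            return True, time.monotonic() - start
    except urllib.error.HTTPError as e:
        if e.code == 404:  # the path is simply not in the cache
            return False, time.monotonic() - start
        raise

print(time_narinfo_lookup("https://cache.nixos.org", "0" * 32))
```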

Assuming it stays reasonably competitive, we should strongly consider whether we could pay our infrastructure team for this as well; being okay with (indirectly) paying AWS employees’ salaries while not being okay with paying our own community members for the same purpose wouldn’t make any sense.

32 Likes

From the mentioned post:

2 Likes

Obviously, what I’m asking for is to work out what we are willing to guarantee vs. what we can realistically afford.

If we have enough to pay for this level of durability, maybe via AWS sponsorship (https://cdn-aws.deb.debian.org/ !), then why not!

1 Like

I’ve frequently thought that staging cycles might be expensive in terms of storage (their build artifacts are mostly unused). Maybe there’s a benefit in splitting master/release branches and staging/glibc updates/etc. into different storage spaces, for easier garbage collection of the development build cache.
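As a rough sketch of what that split could look like on the upload side, assuming the uploader knows which jobset a build came from (the jobset and bucket names here are hypothetical):

```python
# Hypothetical routing of cache uploads by jobset, so that short-lived staging
# artifacts land in their own bucket/prefix and can be garbage-collected (or
# expired via a lifecycle rule) without touching release builds.
import boto3

LONG_LIVED_JOBSETS = {"trunk", "release-23.05"}  # hypothetical names

def target_location(jobset: str) -> tuple[str, str]:
    """Pick (bucket, key prefix) for an artifact based on its jobset."""
    if jobset in LONG_LIVED_JOBSETS:
        return "nix-cache", ""                    # kept until curated GC
    return "nix-cache-staging", f"{jobset}/"      # cheap to expire wholesale

def upload_nar(jobset: str, key: str, data: bytes) -> None:
    bucket, prefix = target_location(jobset)
    boto3.client("s3").put_object(Bucket=bucket, Key=prefix + key, Body=data)
```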

8 Likes

At dotsrc.org we would love to help but have only ~17 TB of free space since our mirrors mostly run on decommissioned hardware and a relatively low budget.
I do not know what kind of storage other mirrors have; I guess it varies a lot (see the TUNA Nix mirror). But I would be surprised if this route is feasible, especially on a one-month deadline.

3 Likes

Another thing to keep in mind is that the NAR format is pretty wasteful on its own, no matter if we compress the NAR files or not.

A lot of the files inside a store path don’t actually change, and “exploding” the NAR file and the structure into smaller parts gives a lot of possibility for deduplication.

As part of my work on tvix, I’ve been spending quite some time on tvix-store, which is using a model similar to git trees/blobs (but not quite) as an underlying storage.

It is still able to produce NAR files on demand, it just uses another internal, content-addressed storage model for the underlying data.
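To illustrate the rough idea (this is just a sketch of file-level content addressing, not tvix-store’s actual data model, hash function, or chunking):

```python
# Sketch: "explode" a store path into content-addressed blobs plus a small
# directory listing, so identical files across store paths are stored once.
# Naming, hashing and layout here are illustrative only.
import hashlib
import os

blobs: dict[str, bytes] = {}   # digest -> file contents (the deduplicated storage)

def ingest(store_path: str) -> list[tuple[str, str]]:
    """Return a directory listing of (relative path, blob digest)."""
    listing = []
    for root, _dirs, files in os.walk(store_path):
        for name in sorted(files):
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue  # a fuller model would record symlinks and exec bits too
            data = open(full, "rb").read()
            digest = hashlib.sha256(data).hexdigest()
            blobs.setdefault(digest, data)   # stored once, however many referrers
            listing.append((os.path.relpath(full, store_path), digest))
    return listing

# A NAR can still be rendered on demand by walking such a listing and
# emitting the blobs in NAR serialization order.
```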

We are already using all that logic in other places, but I also have some code wiring this up to an HTTP server handler, so Nix clients would still be able to “download NAR files” even though we don’t need to store them as such.

If there’s interest in exploring this idea further (maybe start ingesting some store paths to see how much dedup we get, and potentially write some fetch-through migration tool), I’d be happy to discuss it further.

38 Likes

I know, one thing at a time, etc.
However, what do we do when GitHub no longer wants to provide free services to us?
Is there maybe another thread where we can have this discussion?

7 Likes

We already discussed alternatives for migrating away from GitHub, e.g. to Codeberg; IIRC @davidak explored it, and it didn’t scale.

When we get to that situation, we will cross that bridge, I believe. For now, I imagine that GitHub has an incentive to provide us free services in exchange for a user base.

A bigger disaster would be Fastly :stuck_out_tongue:.

7 Likes

Small update: we are continuing to work on multiple communication threads with AWS to figure out whether there is a possibility of receiving further support on the topic.
I’ve submitted us for the Cloudflare OSS program mentioned and will aim to have a conversation with the team there as soon as possible.
Let’s also aim to have a call to brainstorm and discuss anything on the matter early next week. I’ll post details over the weekend.

Again, thank you to everyone who is getting involved! <3

23 Likes

(Originally mentioned on matrix in a discussion with @RaitoBezarius)

Another consideration for the S3 migration would be to leverage AWS Snowball (Offline Data Transfer Device, Petabyte - AWS Snowball - AWS). It seems like it would help with the egress cost (~$0.03/GB instead of ~$0.09/GB) and would not require the destination to have a large network pipe. It’s essentially the modern version of a station wagon full of tapes.
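A very rough back-of-the-envelope comparison, taking the ~425 TiB figure mentioned later in the thread and the per-GB rates above at face value (this ignores request fees, Snowball device/job fees, and pricing tiers):

```python
# Rough egress cost comparison for ~425 TiB at the quoted per-GB rates.
total_gib = 425 * 1024                 # ~425 TiB expressed in GiB
internet_egress = total_gib * 0.09     # ~$0.09/GB over the internet
snowball_egress = total_gib * 0.03     # ~$0.03/GB onto a Snowball
print(f"internet egress ≈ ${internet_egress:,.0f}")   # ≈ $39,168
print(f"snowball egress ≈ ${snowball_egress:,.0f}")   # ≈ $13,056
```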

Specs of the storage-optimized snowball (AWS Snowball Edge Device Specifications - AWS Snowball Edge Developer Guide) which we would need 2x of:

Storage specifications:
NVMe storage capacity: 210 TB usable (for object and NFS data transfer)

Dimensions and weight specifications:
Weight: 49.7 pounds (22.45 kg)
Height: 15.5 inches (394 mm)
Width: 10.6 inches (265 mm)
Length: 28.3 inches (718 mm)

5 Likes

I like the confluence of the two points around garbage collection and deduplication. Both have a chance to recover a (potentially rather large) amount of total storage space. Staging builds (and older builds from unstable, and a progressive list of others) are unlikely to ever be used, and could be garbage-collected. Lots of things deduplicate (and compress) well in an expanded store, as my own zfs-based instances demonstrate.

Storage providers like AWS undoubtedly use this for their own cost advantage; we should make sure we take our own advantage of the data we understand best.

Doing either can get us some space back, and each can reduce the work needed for (or benefit available from) the other: stuff that deduplicates from a staging build was unchanged when that staging landed, for example.

The nice thing here is that despite this interaction, they’re not really in competition with each other, except perhaps in some ivory-tower view of wanting a perfect single solution. They can operate on different time-scales to provide practical benefit and shrink the problem to more manageable levels as things progress; we could collect some garbage now to reduce immediate storage costs and potential transfer/migration costs (and time) while more extensive storage format changes supporting dedup are developed and finalised. I can even imagine an approach where some of this historical data is archived off elsewhere for a while, gc’d from the expensive cache, and maybe reinjected again later.

Choosing which garbage to collect might be helped with some better data. We have a split of warm vs. cold storage already; I assume that’s based on S3’s automatic migration, and it holds some clues (but there are caches in front, so regularly-used items might not generate S3 activity). Do we have stats on which items are hit from Fastly, and a way to turn that into a view of which closures are pulling things in?
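As a sketch of a first step, assuming we can get Fastly request logs in some line-per-request form (the log format here is purely hypothetical):

```python
# Hypothetical: count narinfo hits per store-path hash from a Fastly log dump,
# as a first step toward "which closures are actually being pulled".
# Assumes each log line contains a request path like "/<hash>.narinfo";
# the real log format would need adapting.
import re
import sys
from collections import Counter

NARINFO = re.compile(r"/([0-9a-z]{32})\.narinfo")

hits = Counter()
for line in sys.stdin:
    match = NARINFO.search(line)
    if match:
        hits[match.group(1)] += 1

for store_hash, count in hits.most_common(20):
    print(count, store_hash)

# Mapping these hashes back to closures would additionally need the References
# field from each narinfo (or `nix path-info -r` on a machine that can
# substitute those paths).
```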

What is the actual value of historical builds, in the abstract (assuming we can identify and exclude particular items that are in current use for various accidental or deliberate reasons)?

There are a lot of more extensive changes along these lines that could benefit everyone, applying similar benefits to local stores and network transfers, as has been discussed before. Because of that, though, they will take longer, even if this situation gives some impetus to revive the effort. What happened to the content-addressable store work, for example?

6 Likes

Hey friends! Nix user and Chief Strategy Officer of Storj here. We’re somewhat like BitTorrent meets S3.

Storj is S3-compatible, at $4/TB/mo for storage and $7/TB for egress. This isn’t $0/TB egress of course, but it might be a long-term sustainable option.

Migrations from S3 are free (via our partnership with https://www.cloudflyer.io/, see their site for details). Even though we are decentralized storage, there isn’t any funny business with new protocols or challenging technical adoption. We work with any S3 or HTTP clients.

We have 20,000 global points of presence, so you don’t have to worry about multiregion replication or content delivery. We handle that for you as part of the base product.

We are open source and would love to help support the Nix community. Could we help here?

Edit - Please feel free to hit me up directly @ jt@storj.io. Maybe we can find an arrangement that works for you!

22 Likes

Telnyx has an S3-compatible service. No egress fees, and storage costs look to be right at $1k a month for 425 TB (ignoring the “contact us for better pricing” button).

Caveat: I know very little about the details of it. We have had great luck using them for VoIP service, but I have never looked into their storage offering.

2 Likes

I think the less likely but more sustainable solution for the future is distributing the cache.
I bet that something like 95% of users need only a tiny fraction of the cache: the most recent derivations. If Nix had a built-in file sharing mechanism where you could “seed” your store to other machines on your LAN or even over the internet (with some security considerations), this would substantially reduce the bandwidth needed from the binary cache.
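As a sketch of the lookup order such a mechanism could use (the peer addresses are placeholders, and exposing a peer’s store over HTTP, e.g. with nix-serve, is an assumption; signatures would still need to be verified):

```python
# Sketch: prefer LAN peers that expose their store as a binary cache and only
# fall back to cache.nixos.org on a miss. Peer addresses are placeholders;
# a real mechanism would also verify narinfo signatures.
import urllib.error
import urllib.request

PEERS = ["http://192.168.1.10:5000", "http://192.168.1.11:5000"]  # hypothetical
FALLBACK = "https://cache.nixos.org"

def find_substituter(store_hash: str) -> str | None:
    """Return the first cache that has a narinfo for this store path hash."""
    for base in PEERS + [FALLBACK]:
        req = urllib.request.Request(f"{base}/{store_hash}.narinfo", method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=2):
                return base
        except urllib.error.URLError:
            continue  # peer unreachable or path not present; try the next one
    return None
```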

22 Likes

I have a suggestion with respect to how to get the data out, although it has the potential risk of costing even more money in the process: can we deduplicate the data by stuffing it into a content-addressed store on top of S3 before pulling that deduplicated content-addressed store out? It might be worthwhile to look at a random sample of the store paths (also, how large a sample would be necessary?) to see how much that would actually save before trying it at full scale, however.
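A sketch of what such a sampling experiment could look like over locally realized store paths (sampling the bucket directly would mean unpacking the NARs first; the store location and sample size are arbitrary):

```python
# Estimate potential dedup savings from a random sample of store paths by
# comparing total file bytes against bytes of unique (sha256-distinct) content.
import hashlib
import os
import random

def sample_store_paths(store: str = "/nix/store", k: int = 200) -> list[str]:
    entries = [os.path.join(store, p) for p in os.listdir(store)
               if not p.endswith(".lock")]
    return random.sample(entries, min(k, len(entries)))

def dedup_savings(paths: list[str]) -> tuple[int, int]:
    total = 0
    unique: dict[str, int] = {}          # content digest -> size
    for path in paths:
        for root, _dirs, files in os.walk(path):
            for name in files:
                full = os.path.join(root, name)
                if os.path.islink(full):
                    continue
                data = open(full, "rb").read()
                total += len(data)
                unique.setdefault(hashlib.sha256(data).hexdigest(), len(data))
    return total, total - sum(unique.values())

total, saved = dedup_savings(sample_store_paths())
print(f"{saved / total:.1%} of sampled bytes are duplicate content")
```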

4 Likes

Free solutions are Cool and Good; however, I don’t have a good sense of the scale of Nix relative to cloud users in general, but it would probably be necessary to discuss this with any platform we migrate to, especially if they aren’t huge, because providers do in fact have costs, and free tiers probably aren’t meant for a sudden, ongoing ingress of 150 TB and growing. :stuck_out_tongue:

3 Likes

I specifically remember seeing some EU-related open science / reproducible research information, which may be a string worth pulling on in the long term. I don’t know what to expect; maybe these solutions will not be a good fit for our requirements. Perhaps the NLNet connection can yield some information here?

A cursory search yielded this: https://open-research-europe.ec.europa.eu/for-authors/data-guidelines#approvedrepositories , which has some lists of recommended data providers. Though I expect most scientists won’t be using hundreds of terabytes either. :stuck_out_tongue:

These look superficially interesting

These also look superficially interesting from policy / community perspective:

Unrelatedly, I also found About RDA | RDA (might be a global thing).

6 Likes

Yes. Providers that will credibly belong to the Bandwidth Alliance for the foreseeable future can be considered.

Scaleway also has an open source sponsorship program, though its budget may be limited (presumably it’s possible to ask for more than the base credit of 2400 €, but I’m not sure). Their object storage costs 12 €/TB-month, which would seem to be significantly less expensive than AWS and R2.
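Back-of-the-envelope, taking the ~425 TiB figure from later in the thread at face value and assuming the 12 €/TB-month rate applies to decimal terabytes (ignoring egress, requests, and any sponsorship discount):

```python
# Rough monthly storage cost at the listed Scaleway object storage rate.
tib = 425
tb = tib * 1.099511627776          # TiB -> decimal TB
print(f"≈ €{tb * 12:,.0f} / month at 12 €/TB-month")   # ≈ €5,608
```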

Is a mix of the three possible? I mean, using the new storage for the new objects and the most frequently accessed objects (presumably the 27 or 91 TiB of store paths mentioned above), while keeping AWS S3 for the rest until the egress costs for it can be handled. (This doesn’t help much if the storage alone costs more than the egress.)

One provider that on paper offers such a service is BunnyCDN with its perma-cache. Put it between Fastly and AWS S3, let it get populated gradually as objects are requested (meaning no egress costs beyond what you would have had anyway, if people behave the same), then at some point use it as the source for the migration to the destination storage.

2 Likes

I personally can’t give you advice about the migration process to something cheaper. But I encourage people to think about the size of the repository. It’s currently 425 TiB of data that needs to be served with high availability.

If you look at the nixpkgs commit graph (Contributors to NixOS/nixpkgs · GitHub), it’s easy to see that activity has increased drastically over the last few years. Even the nixpkgs tarball size went from 25 MiB to 36 MiB in just one year!

In my opinion, the project must consider pruning some data from the archives. It would be interesting to look at the size of the biggest unused closures, rather than pruning everything that is past a certain date or hasn’t been accessed for a while.

The current storage growth isn’t sustainable in the long run, unless you want to throw money away. This also has ecological implications: the more you store, the more hard drives must keep spinning (and it’s not linear, due to redundancy).

29 Likes