The NixOS Foundation's Call to Action: S3 Costs Require Community Support

Yes, I am a fan of hashing both inputs and outputs. I think a centralized trust DB is just what we need: an API-backed validator for all publicly shared builds, so that the trust DB can be queried by any Nix system wanting to confirm a build’s validity.
A hash repository shouldn’t take too much storage/bandwidth.
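A minimal sketch of what such a trust DB could look like, assuming a simple quorum rule. Everything here (the class, the `report`/`validate` API, the quorum policy) is my own invention for illustration, not an existing Nix interface:

```python
import hashlib


class TrustDB:
    """Hypothetical centralized trust database: maps a build's input hash
    to the output hashes reported by independent builders."""

    def __init__(self):
        self.records = {}  # input_hash -> {output_hash: reporter count}

    def report(self, input_hash: str, output_hash: str):
        """A builder reports the output hash it produced for these inputs."""
        outputs = self.records.setdefault(input_hash, {})
        outputs[output_hash] = outputs.get(output_hash, 0) + 1

    def validate(self, input_hash: str, output_hash: str, quorum: int = 2) -> bool:
        """A client asks: have at least `quorum` builders seen this output?"""
        return self.records.get(input_hash, {}).get(output_hash, 0) >= quorum


def hash_bytes(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Example: two independent builders agree on an output, a third disagrees.
drv = hash_bytes(b"example-derivation-inputs")
good = hash_bytes(b"reproducible-output")
bad = hash_bytes(b"tampered-output")

db = TrustDB()
db.report(drv, good)
db.report(drv, good)
db.report(drv, bad)

print(db.validate(drv, good))  # meets the 2-builder quorum
print(db.validate(drv, bad))   # only one reporter, rejected
```

Since only fixed-size digests and counters are stored, the storage and bandwidth footprint stays small regardless of build size.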

1 Like

Using AWS at the beginning was a mistake. There are much cheaper options out there.

1 Like

AWS at the beginning was literally free (for us). It’s as cheap as mathematically possible.

5 Likes

I read a few people suggesting IPFS; if there is any path for it, Scaleway just opened an IPFS pinning service: /en/ipfs-pinning/

1 Like

Hey Adam,

If you require an official working solution, rather than a P2P system, to reduce bandwidth costs, then Protocol Labs can help you. I am working on the decentralized CDN called Saturn, which has been built by Protocol Labs. We can help you reduce your retrieval cost by around 60% with a very powerful community-run CDN (>2500 nodes with 10 Gbit/s connections globally). As long as your origin files are content-addressed and have a CID, they are automatically available via Saturn if announced on IPNI or the DHT using IPFS.
As it seems that your bandwidth costs are much higher than the storage costs, a first step could be to make the origin data content-addressable (you could even keep it on AWS for now) and then retrievable via Saturn.

4 Likes

Absolutely. If you use IPFS and store a build in a CAR file, then all the blocks will be deduplicated and can be automatically served by the Saturn CDN, which falls back on cache misses to any IPFS or Filecoin server.
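To make the deduplication claim concrete, here is a toy sketch of content addressing with plain SHA-256 over fixed-size chunks. Real IPFS uses multihash-based CIDs and a Merkle DAG rather than bare SHA-256, so this only illustrates the principle:

```python
import hashlib


def chunk_and_address(data: bytes, chunk_size: int = 256 * 1024):
    """Address each fixed-size chunk by its SHA-256 digest. Identical
    chunks across builds collapse to a single stored block, which is the
    deduplication the CAR/IPFS approach relies on."""
    store = {}     # digest -> chunk bytes (deduplicated block store)
    manifest = []  # ordered digests needed to reassemble this file
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store[digest] = chunk
        manifest.append(digest)
    return manifest, store


def reassemble(manifest, store) -> bytes:
    return b"".join(store[d] for d in manifest)


# Two "builds" that share most of their content store the shared
# chunks only once.
build1 = b"A" * 600_000 + b"unique-to-build-1"
build2 = b"A" * 600_000 + b"unique-to-build-2"
m1, s1 = chunk_and_address(build1)
m2, s2 = chunk_and_address(build2)
combined = {**s1, **s2}
print(len(s1) + len(s2), "blocks without dedup,", len(combined), "with dedup")
```

A CDN like Saturn then only needs to serve the blocks in `combined`, while each client reassembles its build from the manifest.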

2 Likes

Hmmmm. I think paying the AWS bill sounds better than using a Ponzi network.

3 Likes

I am for Storj as an option. Could it be possible for people to run Storj nodes to help fund NixOS as well? Say people have hardware they could host on but can’t donate money directly? We could even make a NixOS module that enables this and sets a disk-size and bandwidth limit. This would lower the barrier to contributing.

Amendment: for the record, I’m merely suggesting ways this could be made an even more feasible option.
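The module idea could look something like the sketch below. To be clear, this is hypothetical: no such module exists today, and every option name here is illustrative rather than a real NixOS option:

```nix
# Hypothetical NixOS module sketch -- all option names are illustrative.
{
  services.storj-donation = {
    enable = true;

    # Cap how much disk the node may dedicate to the network.
    maxDiskSize = "500G";

    # Throttle upload so donating doesn't saturate a home connection.
    maxUploadBandwidth = "20Mbit";

    # Payouts directed to the NixOS Foundation rather than the operator.
    payoutAddress = "<foundation-payout-address>";
  };
}
```

Declarative options like these would let volunteers contribute spare disk and bandwidth with a few lines of configuration.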

3 Likes

It could actually be interesting in more ways than one, though given the relative urgency of the matter, the “direct S3-compatible replacement” mode would IMHO be feasible first, especially if Storj were to offer favourable conditions.

For the far(ther) future, the Nix Foundation could technically host its own satellites (the coordinators of a Storj network), and each Nix user could, on a voluntary basis, run a modified storagenode that would serve part of their /nix/store (i.e. the paths that are “commonly indexed”), plus storage space the volunteer contributes, to which other peers would upload more “historical” store contents.

Of course the latter idea would require significant implementation effort, but it is extremely fitting that many participants would by default hold a relatively “modern” Nix store, since the majority of traffic is likely generated by upgrades and maintenance, and that part would be peer-to-peer by default when using Storj (or a derivative). The system would also essentially be “self-seeding”: the first users of a new package get a cache miss and build it; after, say, some tens of people have done so, the package’s availability at those peers becomes known to the satellite, and subsequent requests get a cache hit and download the package in parallel, in shards, from those peers.

For (very) old nixpkgs, the volunteered extra space would come in: the nixpkgs cache maintainers would (automatically) push those contents via the Storj uplink to said peers, where they would be accessible in the same way as in the example above.

The modification of storagenode would be necessary to ensure the local /nix/store is published to the satellite in the same way as the storagenode’s own storage (where other peers store data).

A further precondition for this whole concept to work is that the volunteer’s system has to be permanently online and have flat-rate, or at least significant, monthly bandwidth.

3 Likes

And hopefully symmetric bandwidth. Lots of people (like me) have, say, 400Mbps down and 100Mbps up (which, of course, would be serving the files).

1 Like

Hopefully, sure, but that’s just a nice-to-have, and I think a large majority of (home) users don’t have symmetric connections. I’m a pretty heavy-duty user (small-team dev infrastructure supporting remote work) on an asymmetric line (250/40!) and it all works decently. But indeed, the proposed concept would work better with a significant number of users (thousands), which should spread the upload demand well enough.

2 Likes

Perhaps tailscale and/or tor could be leveraged as well.

Approach governments. The French have rolled out NixOS in some educational establishments, and Russia has banned Windows. There must be state players and educational institutions who see the importance of NixOS. You could offer teaching resources in exchange for support. Try Nigeria or somewhere. (I am just throwing it out there.)

5 Likes

I was just fed an ad for this service. Might be an option. No egress fees. $0.01/GiB/month for 100TiB to <1PiB.
Not sure about public access, etc.
https://www.rsync.net/pricing.html

1 Like

Not sure about a plan for the egress costs to move the data, but Wasabi is a great S3 storage solution for hosting it. It’s $6.99 per TB per month, which at 400 TB would be roughly $2,800 a month.

Wasabi doesn’t charge for egress or API calls.

For the hosting side, F4 Network and ByteHosting both offer better hosting prices than Linode, DigitalOcean, etc.:

https://store.f4.network

Hope this information helps.

1 Like

I haven’t seen much talk in this thread about growth. What can be done to make less data in the future?

Could there be weekly or monthly archived releases of the packages in the rolling channels, with everything in between being thrown away?

I wonder if there’s any deduplication possibilities left with the archival data. Maybe some new file format could be created that’s just a bunch of pointers to binary chunks, that the client can reassemble, and old stuff could be converted over time.
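One common technique behind that “bunch of pointers to binary chunks” idea is content-defined chunking, where chunk boundaries depend on the data itself rather than fixed offsets. A toy sketch (my own simplified rolling checksum, not any real tool’s algorithm):

```python
import hashlib
import random


def content_defined_chunks(data: bytes, mask: int = 0x3FF, min_len: int = 16):
    """Toy content-defined chunking: cut wherever a rolling checksum of
    (roughly) the last 32 bytes matches a boundary pattern. Unlike
    fixed-size chunks, inserting a byte early in a file only disturbs
    nearby boundaries, so most later chunks still deduplicate by hash."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        if i - start >= min_len and (rolling & mask) == mask:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


# A 1-byte insertion at the front of a 20 kB file: with fixed-size chunks
# every chunk would change; here most chunks survive intact.
random.seed(0)
base = bytes(random.randrange(256) for _ in range(20_000))
shifted = b"!" + base
c1 = content_defined_chunks(base)
c2 = content_defined_chunks(shifted)
digests1 = {hashlib.sha256(c).digest() for c in c1}
digests2 = {hashlib.sha256(c).digest() for c in c2}
print(f"{len(digests1 & digests2)} of {len(c1)} chunks reused after the insertion")
```

An archival format along these lines would store each chunk once, keyed by its hash, and keep per-file manifests of chunk pointers for clients to reassemble.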

I am a big fan of BitTorrent in general, and it could solve some user side issues even if it doesn’t address storage.

Do you have any reference to the above?

Hi Jos,

Well I saw an article on it; I’ll have to search a bit for it though.
This might stir some interest though:

https://lafrenchtech.com/en/

1 Like

Why not just use Backblaze B2 for object storage? They have an S3-compatible API and their pricing is insanely good: Cloud Storage Pricing Comparison: Amazon S3 vs Azure vs B2 (backblaze.com).

1 Like

For the amount of storage Nix uses, Wasabi S3 would be better suited, as their pricing is about $72 per TB per year compared to Backblaze’s $180 per TB per year, and Wasabi doesn’t charge for egress or API usage.

So, with the amount of storage Nix currently has @ 425 TiB:
Wasabi - $30,549/yr
BB - $76,500/yr

That large amount adds up significantly. This reply isn’t meant to bash anything; it merely shows the advantage Wasabi would have as their S3 storage, and for anyone else needing S3 storage.
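For transparency, the comparison can be reproduced from the per-TB rates quoted in this thread (treating the 425 TiB figure as 425 TB for simplicity, which is why the Wasabi result lands slightly above the $30,549 quoted):

```python
# Per-TB-per-year rates as quoted in this thread (approximate).
WASABI_PER_TB_YEAR = 72      # ~$6/TB/month
BACKBLAZE_PER_TB_YEAR = 180  # ~$15/TB/month as quoted above

storage_tb = 425  # current cache size, TiB treated as TB

wasabi = storage_tb * WASABI_PER_TB_YEAR
backblaze = storage_tb * BACKBLAZE_PER_TB_YEAR
print(f"Wasabi:    ${wasabi:,}/yr")
print(f"Backblaze: ${backblaze:,}/yr")
print(f"Savings:   ${backblaze - wasabi:,}/yr")
```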