Dear NixOS Community,
We write to share important news regarding an upcoming garbage collection on cache.nixos.org, scheduled for the end of February.
This initiative is driven by our commitment to keeping the cache healthy and performant, and above all by the substantial cost of storing our build artifacts. Garbage collection is one of the measures we are taking to bring that cost down.
In this context, garbage collection means deleting store paths from cache.nixos.org. Throughout the cache's history, we have always tried to preserve as much as possible: every store path ever built was kept. (See below for details and implications.)
Context: Navigating the Challenges of the Binary Cache
The NixOS Cache Working Group was formed in the fall of last year with a bold idea: what if we could use chunk-level deduplication to reduce our storage costs? Depending on the amount of deduplication achievable, this would reduce the ongoing costs while keeping the whole build history. Compressing and deduplicating in place would also make it cheaper to egress from AWS in case we ever needed to change hosting providers (for reference, the current extraction cost would be about $25k).
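To illustrate the idea (a toy sketch, not the working group's actual design): splitting NARs into chunks and storing each distinct chunk only once means that similar artifacts, such as two builds of the same package, share most of their storage.

```python
# Toy illustration of chunk-level deduplication. Fixed-size chunks are
# used for simplicity; real systems use content-defined chunking.
import hashlib

def dedup_store(blobs: list[bytes], chunk_size: int = 64 * 1024) -> dict[str, bytes]:
    """Split each blob into chunks and store every distinct chunk once."""
    store: dict[str, bytes] = {}
    for blob in blobs:
        for i in range(0, len(blob), chunk_size):
            chunk = blob[i:i + chunk_size]
            store[hashlib.sha256(chunk).hexdigest()] = chunk
    return store

# Two NARs that share most of their content collapse into nearly one
# set of chunks, so the deduplicated store is much smaller than the sum.
```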
The alternative was always that, if the savings were not significant enough, we would have to resort to removing old build artifacts. We wanted to avoid this, because being able to check out a 10-year-old nixpkgs and still run binaries from it without rebuilding is a valuable property, especially for scenarios such as reproducible science.
The efforts so far have produced a lot of valuable tooling, such as a Parquet index of all the NARInfo files, a bucket-log processing pipeline, and better tooling to understand dependencies between store paths in the binary cache. All these tools are open source, and we plan to make the data more widely available in the near future too.
On the cost-saving side, while deduplication would reduce the amount of data stored, we unfortunately didn't anticipate one line item in the AWS pricing: the cost of the PUT requests needed to re-write the data in chunked form.
$0.005 per 1,000 requests is cheap, until you have 800M objects. If you also use per-blob chunking to deduplicate similar files, that number can easily be multiplied by 4-8.
Factoring in these request costs, deduplicating in place alone would cost more than extracting the data outright, greatly diminishing the immediate benefits.
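For a sense of scale, the back-of-the-envelope arithmetic looks like this (using only the figures quoted above):

```python
# Back-of-the-envelope PUT-cost estimate, using the figures quoted above.
put_price = 0.005 / 1000       # dollars per single PUT request
objects = 800_000_000          # objects currently in the bucket

rewrite_once = objects * put_price
print(f"re-writing every object once: ${rewrite_once:,.0f}")   # $4,000
for factor in (4, 8):          # per-blob chunking multiplies the count
    print(f"{factor} chunks per object: ${rewrite_once * factor:,.0f}")
# -> roughly $16,000 to $32,000, comparable to the ~$25k extraction cost
```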
While this could have been worked around with yet more engineering effort ("packing" chunks together and providing an index), we decided against that for now and chose to proceed with plan B instead. This is mostly to relieve some pressure: the Foundation currently has to pay an additional $6,000 per month, and rising, on top of the $9,000 offset by Amazon. Garbage collection should buy some headroom for further cache-compaction mechanisms.
Garbage Collection policy and implications
The goal is to remove enough objects to reduce costs to a sustainable level, while minimizing the impact on people using supported NixOS releases. We will publish a more precise policy down the line.
As part of this work, we want to preserve all fixed-output derivations, as far as we can identify them. That means old nixpkgs checkouts might need to rebuild, but the source code they need will still be available from the cache. This is an important detail, as the upstream sources of old projects often disappear.
We also want to keep the store paths from current NixOS release channels (but we might prune store paths that only appear in older channel bumps of ancient NixOS releases).
If you are not using ancient store paths, you should not be affected by these operations.
If you are using ancient store paths, you might have to rebuild them from scratch. Even if the upstream sources have since disappeared, you should still be able to rebuild, because the fixed-output derivations remain in the cache.
Technical details of the process
We are currently working on a script and policy for what to delete and what to keep, and we will adjust its parameters depending on the storage (and cost) savings achieved.
As described above, we want to preserve all store paths that have no references (in the narinfo jargon), a set which covers almost all fixed-output derivations.
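As an illustration (a minimal sketch, not our actual tooling), a path's references are listed in its .narinfo file on the cache, so checking whether a path is reference-free looks roughly like this:

```python
# Sketch: check whether a store path on cache.nixos.org has references,
# by reading the References field of its .narinfo file.
import urllib.request

def narinfo_references(store_hash: str) -> list[str]:
    """store_hash is the 32-character hash prefix of the /nix/store entry."""
    url = f"https://cache.nixos.org/{store_hash}.narinfo"
    with urllib.request.urlopen(url) as resp:
        for line in resp.read().decode().splitlines():
            if line.startswith("References:"):
                return line.removeprefix("References:").split()
    return []  # no References line: the path references nothing

# Reference-free paths (typically fetched sources) return an empty list.
```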
We also want to trace the list of NixOS channel bumps and the store paths contained in each, keeping those paths for active releases (and, for each EOL'ed release, keeping the store paths of its last channel bump).
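Each channel bump published on releases.nixos.org ships a store-paths.xz file listing the store paths that make up that release, so collecting the keep-set could look roughly like this (a sketch; the release URL below is a made-up example):

```python
# Sketch: collect the store paths of one channel bump from its
# store-paths.xz listing on releases.nixos.org.
import lzma, urllib.request

# Hypothetical channel-bump URL; real ones follow this general shape.
bump = "https://releases.nixos.org/nixos/23.11/nixos-23.11.1234.abcd123"

with urllib.request.urlopen(f"{bump}/store-paths.xz") as resp:
    keep = set(lzma.decompress(resp.read()).decode().splitlines())

print(f"{len(keep)} store paths to keep for this bump")
```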
Call to Action: Save Your Data Responsibly
- If there is specific data on cache.nixos.org that you wish to see preserved, please reach out to us directly at foundation@nixos.org. We are more than willing to help you safeguard the data without putting unnecessary strain on the cache. Pulling data from the CDN without coordination can lead to cache thrashing, degraded quality of service for everyone, and increased operational costs for us, all of which we aim to avoid.
- If cache availability is important to you, please also consider donating to https://opencollective.com/nixos/projects/s3-cache-long-term.
We want to thank all the individuals and companies that have already donated.
We also want to thank Amazon, who are working with us to offset the invoices by $9,000 every month.
Stay Informed
We will provide regular updates on the progress of the garbage collection in the Announcements category on Discourse.
Thank you for being an integral part of the NixOS community.