Quick Recap:
To expedite our timeline for optimizing our S3 cache, the NixOS Foundation is supporting a “Phase 1” effort with initial funding of 10,000 EUR.
The total estimated project budget is 30,000 EUR, and we are also announcing the Open Collective project for those who wish to donate and support the effort!
We hope that this effort also helps us learn how to go about allocating funds to the community. If you’re interested in partaking in the wider community discussion about that, please do reach out and/or visit NixCon Governance Workshop - Announcements - NixOS Discourse.
Initial Background & Prior Recaps
After we reached a short term resolution for our S3 binary cache situation (NixOS S3 Short Term Resolution! - Announcements - NixOS Discourse), a number of awesome folks across the community stepped up to begin researching a long term solution.
The community members involved have been sharing updates on their progress and discussions.
About a month ago, an ad-hoc working group was formed following the creation of the issue Garbage-collect cache.nixos.org · Issue #282 · NixOS/infra · GitHub. The first meeting happened on 24 October (notes and further notes can be found in 2023-10-24 re: Long-term S3 cache solutions meeting minutes #1).
A team composed of @zimbatm, @edolstra, @RaitoBezarius, @flokli and @edef quickly formed to answer several questions, notably how fast the cache’s bucket grows per year and how we can reduce our cache footprint in AWS.
From this team’s activities, two solutions emerged:
- Garbage collection of historical data
- Deduplication of historical data
The team found that the cache bucket’s growth per year was itself increasing, implying that a brutal garbage collection would only be a short-term solution: it would cause us to lose historical data and buy us an unknown amount of time. Our estimates suggest it would only buy us 6 months to 2 years, most likely around 1 year, depending on how Nixpkgs’ needs evolve.
After this, there was an initial agreement to prioritize the deduplication solution and only perform the garbage collection as a last-resort measure, if needed at all.
During the following month, @flokli and @edef, in charge of the deduplication solution, worked in their free time to build a bespoke set of tools to analyze cache.nixos.org. For example, they created a fast .narinfo parser that imports all of the data into ClickHouse to perform various analytics and guide the solution. Many more examples of the team’s tooling can be found in the detailed notes.
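To give a sense of what such a parser deals with: each .narinfo file is a small text document of “Key: Value” lines describing one store path. Below is a minimal sketch of parsing that format, not the team’s actual tool; the example values are made up, and error handling is deliberately simplified.

```rust
use std::collections::HashMap;

// Parse the textual .narinfo format served by cache.nixos.org:
// one "Key: Value" pair per line.
fn parse_narinfo(input: &str) -> HashMap<&str, &str> {
    input
        .lines()
        .filter_map(|line| line.split_once(": "))
        .collect()
}

fn main() {
    // Hypothetical example; real files also carry NarHash, References, Sig, etc.
    let example = "\
StorePath: /nix/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-hello-2.12
URL: nar/bbbbbbbb.nar.xz
Compression: xz
FileSize: 50264
NarSize: 226560";
    let fields = parse_narinfo(example);
    // FileSize is the compressed size stored in S3; NarSize is the uncompressed
    // NAR size -- the kind of columns one would load into ClickHouse.
    println!("{} -> {} bytes compressed", fields["StorePath"], fields["FileSize"]);
}
```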
The team looked at:
- Fastly logs
- S3 bucket logs
- A SQLite database provided by @edolstra that maps channel bumps to store paths (and their sizes)
Data analysis is still ongoing and is focusing on questions like “what would be the request rate to cold paths that have been deduplicated?”, which might shed light on the scalability of the reassembly component, i.e. the piece of software responsible for reassembling the deduplicated chunks into a full NAR that gets cached by the Fastly CDN.
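To make the reassembly component’s job concrete, here is a minimal sketch under assumed data structures (an in-memory map standing in for the content-addressed chunk store, which in practice would be an S3 bucket): fetch each chunk of a NAR in order, concatenate, and verify the result against the expected NAR hash before serving.

```rust
use sha2::{Digest, Sha256};
use std::collections::HashMap;

// Reassemble a NAR from its ordered list of chunk digests and verify the
// result against the expected sha256 (hex). `store` is a stand-in for the
// real content-addressed chunk store (e.g. an S3 bucket keyed by digest).
fn reassemble_nar(
    store: &HashMap<String, Vec<u8>>,
    chunk_digests: &[String],
    expected_sha256: &str,
) -> Result<Vec<u8>, String> {
    let mut nar = Vec::new();
    for digest in chunk_digests {
        let chunk = store
            .get(digest)
            .ok_or_else(|| format!("missing chunk {digest}"))?;
        nar.extend_from_slice(chunk);
    }
    let actual = format!("{:x}", Sha256::digest(&nar));
    if actual == expected_sha256 {
        Ok(nar)
    } else {
        Err(format!("NAR hash mismatch: got {actual}"))
    }
}
```

The open scalability question above is about exactly this path: every request for a cold NAR triggers one fetch per chunk, so the request rate to cold paths bounds how much work the reassembler must do in front of the CDN.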
In the meantime, we discussed the potential gains we might see with deduplication, and this is a hard question to answer. In the past, projects like nix-casync achieved deduplication down to 20% of the original uncompressed size.
In the past weeks, @flokli and @edef have started tuning the parameters of the fast content-defined chunker to determine the optimal values with respect to the cache data.
To do so, they took multiple channel bumps and ingested them on a server in the same AWS region as our bucket: they read through all NARs of a given channel bump, uncompressed and decomposed the NARs, fed all blobs into a content-defined chunker, and recorded the deduplication metadata: chunk length (compressed and uncompressed) and digest. The actual chunk data itself was not stored, as this process was mostly there to find good parameters balancing chunk size against compression potential (when chunks are smaller, the compression context window is smaller and compression performs worse, but we get a better chance to deduplicate).
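As a rough illustration of that recording step (not the team’s actual tooling), the sketch below runs a decompressed blob through a FastCDC-style chunker via the `fastcdc` Rust crate and keeps only each chunk’s offset, length, and digest; the min/average/max chunk sizes here are placeholders, not the tuned parameters.

```rust
use fastcdc::v2020::FastCDC;
use sha2::{Digest, Sha256};

// Chunk a decompressed blob with content-defined chunking and record only
// the deduplication metadata; the chunk bytes themselves are discarded.
// The size parameters (min, average, max) are placeholder values.
fn record_chunks(blob: &[u8]) -> Vec<(usize, usize, String)> {
    FastCDC::new(blob, 16 * 1024, 64 * 1024, 256 * 1024)
        .map(|chunk| {
            let data = &blob[chunk.offset..chunk.offset + chunk.length];
            (chunk.offset, chunk.length, format!("{:x}", Sha256::digest(data)))
        })
        .collect()
}
```

Exploring the parameter space then amounts to re-running this over the same data with different size triples and comparing the resulting deduplicated totals.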
During that process, they uncovered that xz decompression is a big bottleneck when ingesting the existing NARs. After ingesting 3 channel bumps, each separated by months, the recorded total size of the data was 71% of the original compressed size (including metadata). Note that these numbers come from a small sample of the entire dataset, using a single chunk size and store paths further apart in time than we usually have between channel bumps, which constructs a bad-case scenario. Adding another two channel bumps, each two weeks apart, brought the recorded size down to 65% of the original compressed size. One of the next steps is to further explore the parameter space to figure out which parameters make sense for cache.nixos.org as a whole.
Next Steps & Initial Plan
As we approach the halfway mark of our 12-month AWS funding (9,000 USD/month), the urgency of finding a sustainable solution is growing, especially given the significant growth in our cache expenses: the November charge was 13,728 USD, primarily split between S3 storage (8,696.88 USD) and data transfer (4,776.67 USD).
We’re excited to announce a major step forward: an initiative led by @flokli and @edef, targeting an accelerated milestone for the long-term resolution plan:
- Cache analytics: support technical decision-making on where to deploy things with respect to our needs for performance (latency, parallel requests, etc.)
- Deduplication analytics: explore the parameter space of the chunker and data structures for metadata.
- NAR reassembly: extend “nar-bridge” to run where NAR reassembly would happen (AWS Lambda or Fastly Compute@Edge) and to support the storage model used
- Enablement: extend the Fastly 404 handler to reroute requests for historical data to this new S3 store and delete the old data from the main S3 binary cache (see the sketch after this list)
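To make that last item concrete, here is a minimal sketch of what such a fallback could look like on Fastly Compute@Edge with the `fastly` Rust SDK; the backend names `primary_cache` and `dedup_store` are hypothetical placeholders, and this illustrates the rerouting idea rather than the handler that will actually ship.

```rust
use fastly::http::StatusCode;
use fastly::{Error, Request, Response};

// On a 404 from the primary binary cache, retry the same request against
// the deduplicated store instead of returning the miss to the client.
// Backend names are hypothetical placeholders.
#[fastly::main]
fn main(req: Request) -> Result<Response, Error> {
    let fallback = req.clone_without_body();
    let resp = req.send("primary_cache")?;
    if resp.get_status() == StatusCode::NOT_FOUND {
        return Ok(fallback.send("dedup_store")?);
    }
    Ok(resp)
}
```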
As timing is crucial for this project, the NixOS Foundation will fund the first 10,000 EUR to enable us to get going. In parallel, we plan to open up an Open Collective project to raise the remaining roughly 20,000 EUR from those who want to take part in helping make progress on our S3 cache needs.
If you want to support the project please visit the Open Collective project page or reach out to us at the foundation (foundation@nixos.org).
Background on Funding/AWS and Additional Context
As cache.nixos.org serves as a critical resource for the community, it is essential for the Foundation to empower active contributors to expedite work on critical areas via funding. In this case, deduplication efforts can significantly enhance the efficiency of cache.nixos.org, which in turn can unlock additional benefits, such as letting contributors rely more on the cache, e.g. for debug symbols. This task is difficult: it requires prior expertise with how Nix has stored data in the past, and knowledge of state-of-the-art solutions that store data more efficiently without compromising on performance. Furthermore, all of this has to be done on a tight timeline without compromising the integrity of the data being manipulated.
In this instance, the Foundation will be investing in the deduplication group to provide a sustainable solution to the cache’s size. We considered multiple alternatives, such as:
- Performing garbage collection
- Removing staging data more aggressively
- Removing the copies of NixOS ISOs published on releases.nixos.org
- Using off-the-shelf community software such as GitHub - zhaofengli/attic: Multi-tenant Nix Binary Cache
As mentioned above, some of those solutions would only buy us a short amount of time and would require another intervention later. Other solutions could significantly degrade the contributing experience in Nixpkgs, since active contributors rely on the presence of the staging data to perform large-scale bisections and root-cause analysis across the whole ecosystem (https://git.qyliss.net/hydrasect/tree/README is an example of such a tool). Finally, with regard to existing tooling, we found that such tools would not necessarily perform optimally out of the box, and we have not dug into them enough to be familiar with all the failure modes they could exhibit.
Why does this matter? A wider recap can be found in the NixOS Discourse (The NixOS Foundation’s Call to Action: S3 Costs Require Community Support - Announcements - NixOS Discourse)
- Cost Efficiency: Reducing our cache size directly impacts our ongoing expenses, making our operations more economical.
- Future-Proofing: As Nix keeps growing, we need to deploy long-term strategies that keep things as sustainable as possible. This will also demonstrate our commitment to sustainability, which strengthens our case for continued AWS support and other potential partnerships.
- Community Involvement: We’re inviting more hands and minds to join in. Your participation, be it through volunteering, funding, or sharing ideas, is critical.
We want to thank everyone again for rallying around the S3 cache needs. Please reach out, participate, and share feedback whenever you can: https://matrix.to/#/#binary-cache-selfhosting:nixos.org
We’d like to extend a heartfelt thanks to the Open Source teams at AWS for their support and funding. Their assistance has provided us with the essential time and space needed to thoroughly explore and address these challenges at a manageable pace.
This announcement was written by @edolstra @ron and @zimbatm with the help of @RaitoBezarius and @fricklerhandwerk.