I built a Nix binary cache backed by Git (82% storage reduction)

I recently explored the structural similarities between Nix and Git. This led me to build Gachix, a decentralized binary cache that uses Git internals as the backend.

I wrote a blog post detailing the design, the mapping of Nix stores to Git objects, and benchmarks against tools like harmonia and nix-serve.

https://www.ephraimsiegfried.ch/posts/nix-binary-cache-backed-by-git

Some key results:

  • Storage: Achieved an ~82% reduction in size compared to a standard Nix store due to Git’s deduplication and compression.
  • Latency: Achieved the lowest median retrieval latency of the caches benchmarked, though the average latency lags behind because of a few outliers on large files.
  • Decentralization: Because it’s Git, you get a replication protocol for free.
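
To make the deduplication point concrete, here is a minimal sketch of how plain Git identifies file contents (SHA-1 object format). It is not taken from Gachix; the store paths and file contents are made up for illustration. Identical contents hash to the same blob id, so they only need to be stored once.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object id Git assigns to a blob with this content (SHA-1 format)."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical store paths and contents, for illustration only.
files = {
    "/nix/store/aaa-foo-1.0/share/LICENSE": b"MIT License\n",
    "/nix/store/bbb-foo-1.1/share/LICENSE": b"MIT License\n",
    "/nix/store/bbb-foo-1.1/bin/foo": b"\x7fELF placeholder binary",
}

# Content-addressed object store: one entry per distinct content.
objects = {}
for path, content in files.items():
    objects.setdefault(git_blob_id(content), content)

logical = sum(len(c) for c in files.values())   # bytes as seen in the store paths
stored = sum(len(c) for c in objects.values())  # bytes after deduplication

print(f"{len(files)} files, {len(objects)} blobs, {logical} -> {stored} bytes")
```

The two LICENSE files collapse into a single blob. Git additionally zlib-compresses objects, which this sketch leaves out.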

I’d love to hear your thoughts on this!

Interesting idea!

If I understand correctly, gachix serves .narinfo files. Does it also support serving debuginfo/{buildid} files, like binary caches created with the ?index-debug-info=true option?

Also, did you compare the size to a “live” Nix store or to a compressed file:// binary cache? The latter is compressed with xz, which I expect would offset the size benefits you see.

Is the reduction due to compression? You could compare against a compressed binary cache store (xz, zstd, various levels, etc.) rather than a local store fully expanded into usable form.

That would help distinguish the benefits between compression and deduplication.
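
A rough sketch of what such a breakdown could look like, assuming per-file deduplication by content hash and zlib as a stand-in compressor (a real comparison would use xz or zstd on the actual cache, and the sample data here is synthetic):

```python
import hashlib
import zlib


def size_breakdown(files):
    """Compare total sizes under dedup only, compression only, and both."""
    # Keep one copy per distinct content (per-file deduplication).
    unique = {}
    for data in files:
        unique.setdefault(hashlib.sha256(data).digest(), data)

    return {
        "raw": sum(len(d) for d in files),
        "dedup_only": sum(len(d) for d in unique.values()),
        "compress_only": sum(len(zlib.compress(d)) for d in files),
        "dedup_and_compress": sum(len(zlib.compress(d)) for d in unique.values()),
    }


# Synthetic sample: two identical files plus one distinct, compressible file.
sample = [b"A" * 10_000, b"A" * 10_000, bytes(range(256)) * 40]
sizes = size_breakdown(sample)

for name, size in sizes.items():
    print(f"{name:20s} {size:6d} bytes  ({1 - size / sizes['raw']:.1%} smaller than raw)")
```

Reporting all four numbers side by side shows how much of the saving survives once the baseline is already compressed.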

It would also be worth knowing how this compares with a local store that has already had the hard-linking optimization applied.

That ratio is better than compression alone. I compress my stores with zstd:3 and get ratios of ~2.25-2.5, which is about half of what he achieved. I’m curious about this one as well, and interested to see more.
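
For anyone checking the arithmetic, a quick conversion between "percent reduction" and "compression ratio", using only the figures quoted in this thread (the helper names are just for illustration):

```python
def ratio_from_reduction(reduction: float) -> float:
    """Size ratio (original / stored) implied by a fractional size reduction."""
    return 1.0 / (1.0 - reduction)


def reduction_from_ratio(ratio: float) -> float:
    """Fractional size reduction implied by a size ratio (original / stored)."""
    return 1.0 - 1.0 / ratio


gachix_ratio = ratio_from_reduction(0.82)   # ~82% reduction reported for Gachix
zstd_low = reduction_from_ratio(2.25)       # zstd:3 at ratio 2.25
zstd_high = reduction_from_ratio(2.5)       # zstd:3 at ratio 2.5

print(f"82% reduction  -> ratio of about {gachix_ratio:.2f}")
print(f"ratio 2.25-2.5 -> reduction of about {zstd_low:.1%} to {zstd_high:.1%}")
```

So an 82% reduction corresponds to a ratio of roughly 5.6, against which 2.25-2.5 is a bit under half.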
