Niks3: 1.5 release

Niks3 is a binary cache server for Nix that uses S3 as its backing store.

Highlights of this release

Thanks to @PhilipTaron we now have support for a background daemon that uses the post-build-hook in Nix to automatically upload derivations (Auto Upload · Mic92/niks3 Wiki · GitHub, @philiptaron)
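
For readers unfamiliar with the mechanism: Nix runs the program configured as `post-build-hook` after each build and passes the built outputs in the `OUT_PATHS` environment variable, separated by spaces. The sketch below is not the niks3 implementation; it only illustrates how a hook could hand paths to a background uploader by appending them to a spool file. The spool location and the function names are hypothetical.

```python
import os
from pathlib import Path

# Hypothetical spool file that a background uploader would watch;
# this is an assumption for illustration, not a niks3 path.
SPOOL = Path("/tmp/upload-queue.txt")


def parse_out_paths(env):
    """Return the store paths Nix passes to a post-build hook.

    Nix sets OUT_PATHS to the derivation outputs, separated by spaces.
    """
    return env.get("OUT_PATHS", "").split()


def enqueue(paths, spool=SPOOL):
    """Append paths to the spool file, one per line, and return the count."""
    if not paths:
        return 0
    with open(spool, "a", encoding="utf-8") as fh:
        for path in paths:
            fh.write(path + "\n")
    return len(paths)


if __name__ == "__main__":
    # The hook should return quickly because Nix waits for it to finish;
    # the actual upload is left to the background process.
    enqueue(parse_out_paths(os.environ))
```

The point of the split is that the hook itself does almost no work, so builds are not held up by network transfers.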

Thanks to @lovesegfault we now have improved GitHub Actions integration. The integration is now on the same level as other actions such as the cachix action: it automatically uploads all built derivations in the background and flushes when the GitHub Action is done. See GitHub - Mic92/niks3-action: GitHub Action for pushing Nix build outputs to a niks3 binary cache · GitHub

Thanks to mannp we now also have support for mTLS as an alternative to token authentication.

The whole changelog is available here: Releases · Mic92/niks3 · GitHub


Any plans for deduplication like Attic has?

The goal of the project is for the cache to be directly readable by Nix without having to rely on the server, because highly available S3 storage providers are a commodity, whereas setting up a Postgres cluster is not so trivial. I also still see reports of very slow uploads with Attic: Celler: An Attic Fork - #19 by greg-hellings. This matches my own experience from when I tested it. I would rather take performance, high availability, and stability over storage space. S3 storage has been dirt cheap for the projects I have used it for so far, provided you have garbage collection so that old paths are eventually cleaned up. Download speeds of 10-20mb/s don’t sound great. With Cloudflare R2 I can easily get gigabit speeds and more.


I’m curious how much deduplication happens and the cost of the excess storage.

I think “ease of installation” is easily the biggest driver for binary caches right now.

I look forward to trying it out, thanks a lot for all this great software. Joining the Nix sphere truly feels like my brain was already uploaded and has been doing independent work for a number of years before my consciousness caught up.


dedupstats.md · GitHub I ran this at the NAR level some time ago on a few datasets, and it didn’t produce a lot of savings. But I have heard you get better numbers if you only do it on file contents.
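
To make the difference between the two granularities concrete, here is a small sketch that compares deduplication at the archive level with deduplication at the file-content level. It is not the script from the gist; the toy data and all function names are made up for illustration, and a real measurement would hash actual NARs and their unpacked files.

```python
import hashlib


def unique_bytes(blobs):
    """Total size of the blobs after keeping one copy per SHA-256 digest."""
    seen = {}
    for blob in blobs:
        seen[hashlib.sha256(blob).hexdigest()] = len(blob)
    return sum(seen.values())


def savings(archives):
    """Return (archive_level, file_level) fraction of bytes saved.

    `archives` maps an archive name to a dict of {file name: content}.
    An archive is approximated here as the concatenation of its sorted
    entries, which is a simplification of the real NAR format.
    """
    whole = [
        b"".join(name.encode() + b"\0" + data for name, data in sorted(entries.items()))
        for entries in archives.values()
    ]
    files = [data for entries in archives.values() for data in entries.values()]
    total_whole = sum(len(b) for b in whole)
    total_files = sum(len(b) for b in files)
    return (
        1 - unique_bytes(whole) / total_whole,
        1 - unique_bytes(files) / total_files,
    )


# Two versions of a package that differ in one small file only.
archives = {
    "pkg-1.0": {"bin/tool": b"x" * 1000, "share/doc": b"v1"},
    "pkg-1.1": {"bin/tool": b"x" * 1000, "share/doc": b"v2"},
}
nar_level, file_level = savings(archives)
print(f"archive level: {nar_level:.0%}, file level: {file_level:.0%}")
```

In this toy case a single changed file makes the two archives hash differently, so archive-level deduplication saves nothing while file-level deduplication still shares the large unchanged file.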


Since I already have S3-compatible storage up and running in my infrastructure, this seems like a no-brainer to switch over. I’m curious if you or anyone has a straightforward script to help with the switchover. I’m not eager to possibly lose the history in my Attic cache. Once I know how to pull a list of all items in a cache, it should be easy enough with a `nix copy` command, I imagine.

I mainly use niks3 because I can make S3 someone else’s problem, and downtime often sucks a bit once people actually rely on it. Not judging in any way, but I am curious what makes you want to switch, given you already invested time in setting up the current solution.

Back to your question: I had to switch S3 providers within niks3 in the past, and it can take a while to copy all those small narinfo files. If I had to do this for myself, I would probably not bother and just push to both caches for a little while and then burn the old one down. Maybe the project will have a migration plan eventually; I have some businesses that might want to switch over. `nix copy` could potentially be a bit slow, although newer Nix versions have better code for this. I already made the niks3 code as parallel as possible. For bigger caches it might need some deeper integration to get reasonable performance.
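
If someone does want to try the `nix copy` route, one way to keep it manageable is to copy in batches rather than passing every path at once. The sketch below only builds the command lines; the cache URL and store paths are placeholders, it assumes the `nix` command-line interface is enabled, and pushing the copied paths on to niks3 is deliberately left out because it depends on the niks3 client setup.

```python
import subprocess


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def copy_commands(store_paths, source_cache, batch_size=100):
    """Build one `nix copy --from` invocation per batch of store paths.

    Without `--to`, `nix copy` places the paths in the local store.
    """
    return [
        ["nix", "copy", "--from", source_cache, *batch]
        for batch in batched(store_paths, batch_size)
    ]


def run_all(commands):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Placeholder inputs: neither the URL nor the paths are real.
paths = [f"/nix/store/{i:032d}-example" for i in range(250)]
cmds = copy_commands(paths, "https://attic.example.org/mycache")
print(len(cmds), "batches; first command starts with", cmds[0][:4])
```

Calling `run_all(cmds)` would execute the batches sequentially; a failed batch can then be retried without starting over.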

I’m the only person using my setup - it’s just for my home lab. But I have 7 NixOS hosts, a half dozen container images that I use, two nix-darwin configurations, and I’m about to add a dozen more NixOS hosts. Since MinIO is already in use in my infrastructure for Git LFS, caching ISOs that I use for VM builds, caching Git actions, and doing several backup targets, the S3 is already available for me. And I don’t have anyone else relying on it who will get mad if it is inaccessible.

The Attic infrastructure is just running a container image on my NAS, then enabling the systemd unit on my build systems. So there wasn’t a whole lot of setup involved. The most effort went into patching the Attic client app to handle multi-gig file uploads.

As for the pros to niks3 that I see: downloads from Attic are about a 50 to 1 ratio for me. Downloading at 100Mbps speeds means that many of my builds fail due to nix-buildbot timeouts when downloading the large Zim files I mentioned in that Celler/Attic thread. If I could get closer to line speed on those from the MinIO instance, it would be a minimum of 10x speed increase, 25x on most of my hosts, and 100x on the newer hosts I’m installing.

Attic is actually running slower than my WAN connection (1Gbps), meaning that my effort to avoid pulling from the public infrastructure more than necessary is giving me a 90% speed decrease compared to pulling from cache.nixos.org every time.
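
The ratios in this post follow directly from link speeds. As a quick worked check, assuming Attic delivers roughly 100 Mbps and that the hosts have 1, 2.5, and 10 Gbps links (the latter two are inferred from the 25x and 100x figures, not stated in the post):

```python
def speedup(new_mbps, old_mbps):
    """How many times faster the new link is than the old one."""
    return new_mbps / old_mbps


def percent_decrease(fast_mbps, slow_mbps):
    """Relative slowdown when going from the fast to the slow link, in percent."""
    return (fast_mbps - slow_mbps) * 100 / fast_mbps


ATTIC_MBPS = 100  # Attic download speed mentioned in the post
for link_mbps in (1000, 2500, 10000):  # assumed link speeds of the hosts
    print(f"{link_mbps} Mbps link: {speedup(link_mbps, ATTIC_MBPS):g}x")

# Attic at 100 Mbps versus a 1 Gbps WAN connection to cache.nixos.org
print(f"{percent_decrease(1000, ATTIC_MBPS):g}% slower than the WAN")
```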

Sorry if my response was a bit rambling. Basically: S3 is already inside my firewall, no one else is depending on this, and my nix-buildbot runs will probably see a 25-100x speedup by moving off of Attic as the cache. So, for me, it seems like a no-brainer to move to niks3 since Attic has become abandonware.

It doesn’t seem impossible to build a CA store with good throughput. I benchmarked @lovesegfault’s rio-build yesterday and was actually surprised that it was faster than MinIO, even though it was using MinIO as a backing store…: feat(rio-store): end-to-end ingest/upload throughput benchmark by Mic92 · Pull Request #27 · lovesegfault/rio-build · GitHub


Would it make sense to be able to do something like this (with a private R2 bucket):

  • when I’m home, my laptop fetches from my niks3 home server (with nginx cache)
  • when I’m not home, my laptop fetches from my niks3 home server but gets pre-signed URLs and downloads the big files directly from R2

I’m not sure if it would cause problems with the local nix cache.
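
The split described above could be sketched as a small routing decision on the home server: clients on the home network are served directly, everyone else is redirected to a pre-signed URL for big files. This is only an illustration of the idea, not an existing niks3 feature; the network range, the size threshold, and the `presign` callback are all hypothetical.

```python
import ipaddress

# Assumed home network range; adjust to the actual LAN.
HOME_NET = ipaddress.ip_network("192.168.1.0/24")
# Assumed cutoff above which a file counts as "big".
BIG_FILE_BYTES = 1024 * 1024


def route(client_ip, key, size, presign):
    """Decide how to answer a request for the object `key`.

    Returns ("serve", key) when the home server should answer itself,
    or ("redirect", url) when the client should download from the bucket.
    `presign` is a caller-supplied function that turns a key into a
    time-limited URL; how it is produced is outside this sketch.
    """
    at_home = ipaddress.ip_address(client_ip) in HOME_NET
    if at_home or size < BIG_FILE_BYTES:
        return ("serve", key)
    return ("redirect", presign(key))


def fake_presign(key):
    # Stand-in for a real pre-signing call against the R2 bucket.
    return f"https://bucket.example.invalid/{key}?signature=placeholder"


print(route("192.168.1.20", "nar/abc.nar.zst", 50_000_000, fake_presign))
print(route("203.0.113.7", "nar/abc.nar.zst", 50_000_000, fake_presign))
print(route("203.0.113.7", "abc.narinfo", 400, fake_presign))
```

Small files such as narinfo stay on the home server in both cases, so only the large downloads leave the private path.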


Maybe we want local network discovery for caches, the same way Guix does it?
