Hi Nix community, I would like to show you something that I’ve been working on over the holidays:
Attic is a self-hostable Nix Binary Cache server backed by an S3-compatible storage provider. It has support for global deduplication and garbage collection. It’s easy to set up and scale, from one user (you!) to hundreds.
⚙️ Pushing 5 paths to "demo" on "local" (566 already cached, 2001 in upstream)...
✅ gnvi1x7r8kl3clzx0d266wi82fgyzidv-steam-run-fhs (29.69 MiB/s)
✅ rw7bx7ak2p02ljm3z4hhpkjlr8rzg6xz-steam-fhs (30.56 MiB/s)
✅ y92f9y7qhkpcvrqhzvf6k40j6iaxddq8-0p36ammvgyr55q9w75845kw4fw1c65ln-source (19.96 MiB/s)
🕒 vscode-1.74.2 ███████████████████████████████████████ 345.66 MiB (41.32 MiB/s)
🕓 zoom-5.12.9.367 ███████████████████████████ 329.36 MiB (39.47 MiB/s)
But why?
Currently, the experience of having a private Nix binary cache is less than ideal:
- Plain `s3://` store: Any machine interacting with the private cache needs its own S3 Access Key, and machines that push also need the private signing key. Furthermore, there is no simple way to clean up an S3 cache based on access recency.
- Plain `ssh://` store: Pushing and pulling both require SSH access to the machine. For pushing, the private-key requirement still exists, and the user also needs to be trusted by `nix-daemon`. As a single-machine setup, it does not scale.
- nix-serve, eris, harmonia: These tools can sign on the fly, but they still serve from the local `/nix/store`. Pushing still requires SSH access as a trusted user, and the setup is still single-machine and single-tenant.
- Cachix: Provides a much better user experience than the others, with a sleek CLI client that lets users push with a token. Signing can be managed centrally and done on the fly. However, it’s a SaaS service and cannot be self-hosted without a custom “Contact Us” arrangement.
I’ll elaborate later in this post. Before I lose your attention, however, allow me to convince you with a demo that you can try yourself.
Try it out (15 minutes)
Let’s spin up Attic in just 15 minutes (yes, it works on macOS too!):
nix-shell https://github.com/zhaofengli/attic/tarball/main -A demo
Simply run `atticd` to start the server in monolithic mode with a SQLite database and local storage:
$ atticd
Attic Server 0.1.0 (release)
-----------------
Welcome to Attic!
A simple setup using SQLite and local storage has been configured for you in:
/home/zhaofeng/.config/attic/server.toml
Run the following command to log into this server:
attic login local http://localhost:8080 eyJ...
Documentations and guides:
https://docs.attic.rs
Enjoy!
-----------------
Running migrations...
Starting API server...
Listening on [::]:8080...
Cache Creation
`atticd` is the server, and `attic` is the client. We can now log in and create a cache:
# Copy and paste from the atticd output
$ attic login local http://localhost:8080 eyJ...
✍️ Configuring server "local"
$ attic cache create hello
✨ Created cache "hello" on "local"
Pushing
Let’s push `attic` itself to the cache:
$ attic push hello $(which attic)
⚙️ Pushing 1 paths to "hello" on "local" (0 already cached, 45 in upstream)...
✅ r5d7217c0rjd5iiz1g2nhvd15frck9x2-attic-0.1.0 (52.89 MiB/s)
The interesting thing is that `attic` automatically skipped over store paths already cached by cache.nixos.org! This behavior can be configured on a per-cache basis.
Note that Attic performs content-addressed global deduplication, so when you upload the same store path to another cache, the underlying NAR is only stored once. Each cache is essentially a restricted view of the global cache.
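To make that model concrete, here is a minimal sketch of caches as restricted views over a single content-addressed NAR store. The data structures are hypothetical and much simpler than Attic’s actual schema; only the shape of the idea carries over.

```python
import hashlib

# Hypothetical model: one global content-addressed NAR store, plus a
# per-cache mapping that grants each cache access to specific NARs.
global_nar_store = {}                                 # nar_hash -> NAR bytes
cache_mappings = {"hello": {}, "alice-dev": {}}       # cache -> store path -> nar_hash

def push(cache, store_path, nar_bytes):
    nar_hash = hashlib.sha256(nar_bytes).hexdigest()
    global_nar_store.setdefault(nar_hash, nar_bytes)  # stored once, globally
    cache_mappings[cache][store_path] = nar_hash      # this cache can now see it

# Uploading the same path to two caches stores the NAR only once
push("hello", "/nix/store/aaaa-demo", b"nar contents")
push("alice-dev", "/nix/store/aaaa-demo", b"nar contents")
assert len(global_nar_store) == 1
```

Deleting a cache in this model would only remove its mapping; the NAR itself survives as long as any cache references it, which is why garbage collection needs a separate orphan-NAR pass.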
Pulling
Now, let’s pull it back from the cache. For demonstration purposes, let’s use `--store` to make Nix download to another directory, because Attic already exists in `/nix/store`:
# Automatically configures ~/.config/nix/nix.conf for you
$ attic use hello
Configuring Nix to use "hello" on "local":
+ Substituter: http://localhost:8080/hello
+ Trusted Public Key: hello:vlsd7ZHIXNnKXEQShVnd7erE8zcuSKrBWRpV6zTibnA=
+ Access Token
$ nix-store --store $PWD/nix-demo -r $(which attic)
[snip]
copying path '/nix/store/r5d7217c0rjd5iiz1g2nhvd15frck9x2-attic-0.1.0' from 'http://localhost:8080/hello'...
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
/nix/store/r5d7217c0rjd5iiz1g2nhvd15frck9x2-attic-0.1.0
$ ls nix-demo/nix/store/r5d7217c0rjd5iiz1g2nhvd15frck9x2-attic-0.1.0/bin/attic
nix-demo/nix/store/r5d7217c0rjd5iiz1g2nhvd15frck9x2-attic-0.1.0/bin/attic
That was easy!
Access Control
Attic performs stateless authentication using signed JWTs that contain the allowed permissions. The root token printed by `atticd` is all-powerful and should not be shared.
Let’s create another token that can only access the `hello` cache:
$ atticadm make-token --sub alice --validity '3 months' --pull hello --push hello
eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJhbGljZSIsImV4cCI6MTY4MDI5MzMzOSwiaHR0cHM6Ly9qd3QuYXR0aWMucnMvdjEiOnsiY2FjaGVzIjp7ImhlbGxvIjp7InIiOjEsInciOjF9fX19.XJsaVfjrX5l7p9z76836KXP6Vixn41QJUfxjiK7D-LM
Let’s say Alice wants to have her own caches. Instead of creating caches for her, we can let her do it herself:
$ atticadm make-token --sub alice --validity '3 months' --pull 'alice-*' --push 'alice-*' --create-cache 'alice-*'
eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJhbGljZSIsImV4cCI6MTY4MDI5MzQyNSwiaHR0cHM6Ly9qd3QuYXR0aWMucnMvdjEiOnsiY2FjaGVzIjp7ImFsaWNlLSoiOnsiciI6MSwidyI6MSwiY2MiOjF9fX19.MkSnK6yGDWYUVnYiJF3tQgdTlqstfWlbziFWUr-lKUk
Now Alice can use this token to create any cache whose name begins with `alice-` and push to them. Try passing `--dump-claims` to show the JWT claims without encoding the token and see what’s going on.
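Since the tokens are plain JWTs, you can also inspect the claims with nothing but base64 decoding. The sketch below (using the alice token generated above) approximates what `--dump-claims` shows:

```python
import base64, json

# The second token from the atticadm example above
token = ("eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9."
         "eyJzdWIiOiJhbGljZSIsImV4cCI6MTY4MDI5MzQyNSwiaHR0cHM6Ly9qd3QuYXR0aWMucnMv"
         "djEiOnsiY2FjaGVzIjp7ImFsaWNlLSoiOnsiciI6MSwidyI6MSwiY2MiOjF9fX19."
         "MkSnK6yGDWYUVnYiJF3tQgdTlqstfWlbziFWUr-lKUk")

def jwt_claims(token):
    # A JWT's payload is the middle dot-separated segment, base64url-encoded
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)   # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))

claims = jwt_claims(token)
print(claims["sub"])                            # alice
print(claims["https://jwt.attic.rs/v1"]["caches"])
```

Note the `r`/`w`/`cc` flags under the `alice-*` cache pattern: pull, push, and create-cache, matching the `--pull`, `--push`, and `--create-cache` options.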
Going Public
Let’s make the cache public. Making it public gives unauthenticated users pull access:
$ attic cache configure hello --public
✅ Configured "hello" on "local"
# Now we can query the cache without being authenticated
$ curl http://localhost:8080/hello/nix-cache-info
WantMassQuery: 1
StoreDir: /nix/store
Priority: 41
Garbage Collection
It’s a bad idea to let binary caches grow unbounded. Let’s configure garbage collection on the cache to automatically delete objects that haven’t been accessed in a while:
$ attic cache configure hello --retention-period '1s'
✅ Configured "hello" on "local"
Now the retention period is only one second. Instead of waiting for periodic garbage collection to occur (see `server.toml`), let’s trigger it manually:
$ atticd --mode garbage-collector-once
Now the store path doesn’t exist on the cache anymore!
$ nix-store --store $PWD/nix-demo-2 -r $(which attic)
don't know how to build these paths:
/nix/store/v660wl07i1lcrrgpr1yspn2va5d1xgjr-attic-0.1.0
error: build of '/nix/store/v660wl07i1lcrrgpr1yspn2va5d1xgjr-attic-0.1.0' failed
$ curl http://localhost:8080/hello/v660wl07i1lcrrgpr1yspn2va5d1xgjr.narinfo
{"code":404,"error":"NoSuchObject","message":"The requested object does not exist."}
Let’s reset it back to the default, which is to not garbage collect (configurable in `server.toml`):
$ attic cache configure hello --reset-retention-period
✅ Configured "hello" on "local"
$ attic cache info hello
Public: true
Public Key: hello:vlsd7ZHIXNnKXEQShVnd7erE8zcuSKrBWRpV6zTibnA=
Binary Cache Endpoint: http://localhost:8080/hello
API Endpoint: http://localhost:8080/
Store Directory: /nix/store
Priority: 41
Upstream Cache Keys: ["cache.nixos.org-1"]
Retention Period: Global Default
Garbage collection happens on three levels:
- Local Cache: When an object is garbage-collected, only the mapping between the metadata in the local cache and the NAR in the global cache is deleted. The local cache loses access to the NAR, but no storage is freed.
- Global NAR Store: Orphan NARs not referenced by any local cache then become eligible for deletion.
- Global Chunk Store: Finally, orphan chunks not referenced by any NAR become eligible for deletion. Only at this point is storage space actually freed, and subsequent uploads of the same chunk will actually trigger an upload to the storage backend.
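A toy model of the three levels (again with invented data structures, not Attic’s real schema) shows why only the last level frees space:

```python
# cache -> store path -> NAR; NAR -> ordered chunk list; chunk -> bytes
cache_mappings = {"hello": {"pathA": "nar1"},
                  "team":  {"pathB": "nar1", "pathC": "nar2"}}
nar_chunks  = {"nar1": ["c1", "c2"], "nar2": ["c2", "c3"]}
chunk_store = {"c1": b"one", "c2": b"two", "c3": b"three"}

def collect(cache, path):
    del cache_mappings[cache][path]                   # level 1: drop the mapping
    live_nars = {h for m in cache_mappings.values() for h in m.values()}
    for nar in set(nar_chunks) - live_nars:           # level 2: orphan NARs
        del nar_chunks[nar]
    live_chunks = {c for cs in nar_chunks.values() for c in cs}
    for c in set(chunk_store) - live_chunks:          # level 3: orphan chunks
        del chunk_store[c]                            # only this frees storage

collect("hello", "pathA")   # nar1 still referenced by "team": nothing freed
collect("team", "pathC")    # nar2 orphaned; c3 freed, c2 kept (shared with nar1)
assert set(chunk_store) == {"c1", "c2"}
```

Chunk `c2` survives even after `nar2` is deleted because it is shared with `nar1`; deduplicated data is only freed once its last referrer is gone.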
What just happened?
In just a few commands, we have:
- Set up a new Attic server and a binary cache
- Pushed store paths to it
- Configured Nix to use the new binary cache
- Generated access tokens that provide restricted access
- Made the cache public
- Performed garbage collection
Goals
With the quick demo out of the way, let’s talk about what I’m aiming to achieve with Attic:
- Multi-Tenancy : Create a private cache for yourself, and one for friends and co-workers. Tenants are mutually untrusting and cannot pollute the views of other caches.
- Global Deduplication : Individual caches (tenants) are simply restricted views of the content-addressed NAR Store and Chunk Store. When paths are uploaded, a mapping is created to grant the local cache access to the global NAR.
- Managed Signing : Signing is done on-the-fly by the server when store paths are fetched. The user pushing store paths does not have access to the signing key.
- Scalability : Attic can be easily replicated. It’s designed to be deployed to serverless platforms like fly.io but also works nicely in a single-machine setup.
- Ease of Use : Newcomers and casual Nix users (think of the teammates you’re trying to sell on the `shell.nix` you just added to the repo) shouldn’t need to wrangle with configuration files.
- Garbage Collection : Unused store paths can be garbage-collected in an LRU manner.
Next steps (if you followed the demo)
Note: Attic is an early prototype and everything is subject to change! It may be full of holes, and APIs may change without backward compatibility. You might even be required to reset the entire database. I would love for people to give it a try, but please keep this in mind.
For a less temporary setup, you can set up `atticd` with PostgreSQL and S3. You should also place it behind a load balancer like NGINX to provide HTTPS. Take a look at `~/.config/attic/server.toml` to see what you can configure!
While it’s easy to get started by running `atticd` in monolithic mode, for production use it’s best to run the different components of `atticd` separately with `--mode`:
- `api-server`: Stateless and can be replicated.
- `garbage-collector`: Performs periodic garbage collection. Cannot be replicated.
Coming soon
As an early prototype, what Attic can do is fairly limited. Here are a few things that are on the way:
- Better error reporting
- Metrics
- A lot more tests
FAQs
Does it replace Cachix?
No, it does not. Cachix is an awesome product and the direct inspiration for the user experience of Attic. It works at a much larger scale than Attic and is a proven solution. Numerous open-source projects in the Nix community (including mine!) use Cachix to share publicly-available binaries.
Attic can be thought of as providing a similar user experience at a much smaller scale (personal or team use).
What happens if a user uploads a path that is already in the global cache?
The user will still fully upload the path to the server because they have to prove possession of the file. The difference is that instead of having the upload streamed to the storage backend (e.g., S3), it’s only run through a hash function and discarded. Once the NAR hash is confirmed, a mapping is created to grant the local cache access to the global NAR. The global deduplication behavior is transparent to the client.
In the future, schemes to prove data possession without fully uploading the file may be supported.
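A sketch of that upload path, simplified: I’m assuming here that the client declares the NAR hash upfront, which lets the server decide before any bytes arrive whether to persist them or merely verify and discard them. The function and data structures are illustrative, not Attic’s actual API.

```python
import hashlib

def receive_upload(claimed_hash, chunk_iter, known_nars, storage):
    """Verify possession of the NAR; persist it only if it's new globally."""
    h = hashlib.sha256()
    duplicate = claimed_hash in known_nars
    buf = []
    for chunk in chunk_iter:          # the client always uploads the full NAR
        h.update(chunk)
        if not duplicate:
            buf.append(chunk)         # duplicates are hashed and discarded
    if h.hexdigest() != claimed_hash:
        raise ValueError("hash mismatch: possession not proven")
    if not duplicate:
        storage[claimed_hash] = b"".join(buf)
        known_nars.add(claimed_hash)
    return claimed_hash               # caller now creates the cache -> NAR mapping

known, storage = set(), {}
digest = hashlib.sha256(b"hello world").hexdigest()
receive_upload(digest, iter([b"hello ", b"world"]), known, storage)  # stored
receive_upload(digest, iter([b"hello ", b"world"]), known, storage)  # discarded
assert len(storage) == 1
```

Either way, the full stream passes through the hash function, so a client cannot gain access to a NAR it does not actually possess.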
What happens if a user uploads a path with incorrect/malicious metadata?
They will only pollute their own cache. Path metadata (store path, references, deriver, etc.) are associated with the local cache and the global cache only contains content-addressed NARs that are “context-free.”
How is authentication handled?
Authentication is done via signed JWTs containing the allowed permissions. Each instance of `atticd --mode api-server` is stateless. This design may be revisited later, with the option of a more stateful authentication method.
On what granularity is deduplication done?
Global deduplication is done on two levels: NAR files and chunks. During an upload, the NAR file is split into chunks using the FastCDC algorithm. Identical chunks are only stored once in the storage backend. If an identical NAR exists in the Global NAR Store, chunking is skipped and the NAR is directly deduplicated.
During a download, `atticd` reassembles the entire NAR from its constituent chunks by streaming from the storage backend.
Data chunking is optional and can be disabled entirely or for NARs smaller than a threshold. When chunking is disabled, all new NARs are uploaded as a single chunk and NAR-level deduplication is still in effect.
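To illustrate the chunk-level model end to end, here is a toy version. The cutpoint rule below is a naive rolling hash, not the actual FastCDC gear hash, and the stores are invented dictionaries; what it demonstrates is content-defined boundaries, chunk-level dedup, and reassembly on download.

```python
import hashlib

def cdc_chunks(data, min_size=64, mask=0x3F):
    """Cut a chunk wherever a rolling hash of recent bytes matches a bit
    pattern, so boundaries depend on content rather than fixed offsets."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF
        if i - start + 1 >= min_size and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

chunk_store = {}                        # chunk hash -> bytes, stored once

def store_nar(nar):
    ids = []
    for c in cdc_chunks(nar):
        cid = hashlib.sha256(c).hexdigest()
        chunk_store.setdefault(cid, c)  # identical chunks are deduplicated
        ids.append(cid)
    return ids                          # a NAR is an ordered list of chunk ids

def fetch_nar(ids):                     # download: reassemble from chunks
    return b"".join(chunk_store[i] for i in ids)

nar = bytes(range(256)) * 8
ids = store_nar(nar)
assert fetch_nar(ids) == nar            # lossless reassembly
n = len(chunk_store)
store_nar(nar)                          # re-upload: no new chunks stored
assert len(chunk_store) == n
```

Because boundaries are content-defined, an insertion near the start of a NAR only perturbs nearby chunks instead of shifting every fixed-size block, which is what makes chunk-level dedup effective across similar NARs.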
How are you hosting your instance?
My personal instance runs on fly.io, with the database provided by neon.tech. The object storage service is Cloudflare R2, which has no egress fees.
How does the client do all the work?
I implemented an async Rust binding to the C++ `libnixstore`, which allows the client to compute closures, look up path metadata, and stream NARs directly to the Attic server. It’s a bit hairy, but hopefully with something like Tvix none of this will be necessary anymore.