Nix-casync, a more efficient way to store and substitute Nix store paths

rickynils · December 21, 2021, 7:41pm

I tested nix-casync (rev 25eb0e59e23fd580187cab3c8e7860d9c0044e0c) on a bunch of nar files from nixbuild.net. In nixbuild.net, we store all nar files in ZFS pools, using the built-in lz4 compression in ZFS (with default settings). I compared the storage efficiency of nix-casync to that, and also tried out how the zstd compression of ZFS performs (by copying all nar files into a new, zstd-enabled ZFS pool).

I didn’t look into the performance in terms of throughput and CPU usage, because I ran most things with nice/ionice and also ran multiple disk-intensive tasks at the same time as the nix-casync ingestion. I know (from the HTTP log of nix-casync) that many nar files took multiple minutes to ingest. It would probably be interesting to run a performance test under more controlled circumstances, especially against a large existing chunk store.

This was the data set I tested with (all “real-world” nar files, a subset of the nixbuild.net store):

| | |
|---|---|
| number of nar files | 671652 |
| total nar file size (MB) | 8012630 |

After ingesting this into nix-casync, I ended up with 34219994 chunks in the chunk store. I then summed up the sizes of all zstd-compressed chunks, and also the uncompressed sizes of all chunks, to get a feeling for how much deduplication matters compared to zstd compression.
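For anyone wanting to repeat the first half of that measurement, here is a minimal sketch of summing file sizes under a directory tree. It assumes the chunk store is simply a directory of chunk files; `total_file_size` is a hypothetical helper and not part of nix-casync. Getting the uncompressed total would additionally require decompressing each chunk, which is not shown here.

```python
import os


def total_file_size(root: str) -> tuple[int, int]:
    """Return (file_count, total_bytes) for all regular files under root.

    Sizes are apparent sizes (st_size), so transparent filesystem
    compression such as ZFS lz4/zstd does not affect the result.
    """
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            count += 1
            total += os.path.getsize(path)
    return count, total
```

Calling `total_file_size` on the chunk store directory would give the chunk count and the compressed total in bytes.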

| | size in MB | compression ratio |
|---|---|---|
| zfs+lz4 | 3816761 | 2.10 |
| zfs+zstd | 2978241 | 2.69 |
| nix-casync (uncompressed) | 2487089 | 3.22 |
| nix-casync (zstd) | 1223264 | 6.55 |

As you can see, nix-casync with zstd-compressed chunks managed to achieve an impressive compression ratio of 6.55, compared to 2.10 for zfs+lz4!
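The ratios are just the total nar size divided by the stored size. The sketch below recomputes them from the numbers in this post and splits the nix-casync result into a deduplication factor and a zstd factor. The variable names are mine, and the per-chunk average is only a rough figure derived from the totals above.

```python
# All sizes in MB, copied from the figures in this post.
TOTAL_NAR_MB = 8_012_630
NUM_CHUNKS = 34_219_994

stored_mb = {
    "zfs+lz4": 3_816_761,
    "zfs+zstd": 2_978_241,
    "nix-casync (uncompressed)": 2_487_089,
    "nix-casync (zstd)": 1_223_264,
}

# Compression ratio = original size / stored size.
ratios = {name: TOTAL_NAR_MB / size for name, size in stored_mb.items()}

# Deduplication alone: original size vs. unique chunks before zstd.
dedup_factor = ratios["nix-casync (uncompressed)"]

# zstd alone: uncompressed chunks vs. zstd-compressed chunks.
zstd_factor = (
    stored_mb["nix-casync (uncompressed)"] / stored_mb["nix-casync (zstd)"]
)

# Average uncompressed size of a stored chunk, in MB.
avg_chunk_mb = stored_mb["nix-casync (uncompressed)"] / NUM_CHUNKS

for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
print(f"dedup factor: {dedup_factor:.2f}, zstd factor: {zstd_factor:.2f}")
print(f"average chunk size: {avg_chunk_mb:.4f} MB")
```

So deduplication contributes a factor of about 3.22 and zstd on the chunks another factor of about 2.03, which multiply to the overall 6.55.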