I tested `nix-casync` (rev `25eb0e59e23fd580187cab3c8e7860d9c0044e0c`) on a bunch of `nar` files from nixbuild.net. In nixbuild.net, we store all `nar` files in ZFS pools, using the builtin `lz4` compression in ZFS (with default settings). I compared how `nix-casync` performs (storage-wise) to that, and also tried out how the `zstd` compression of ZFS performs (by copying all `nar` files into a new, zstd-enabled ZFS pool).
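As an aside, the ZFS figures don't need any special tooling: `zfs get compressratio` on the dataset reports the ratio directly. The following Python sketch (illustrative only, not the exact method I used; it assumes the nar files are plain files named `*.nar` under the given directory) gives roughly the same figure by comparing logical size with the space actually allocated:

```python
import os
import sys

# Illustrative sketch: estimate the effective compression ratio of nar files
# stored on a ZFS dataset by comparing their logical size (st_size) with the
# space actually allocated on disk (st_blocks * 512).
# Assumption: the nar files are plain files named *.nar under sys.argv[1].
logical = physical = 0
for root, _dirs, files in os.walk(sys.argv[1]):
    for name in files:
        if not name.endswith(".nar"):
            continue
        st = os.stat(os.path.join(root, name))
        logical += st.st_size
        physical += st.st_blocks * 512

print(f"logical size:   {logical / 1e6:.0f} MB")
print(f"allocated size: {physical / 1e6:.0f} MB")
print(f"ratio:          {logical / physical:.2f}")
```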
I didn’t look into the performance in terms of throughput and CPU usage, because I ran most things with nice/ionice and also ran multiple disk-intensive tasks at the same time as the `nix-casync` ingestion. I know (from the HTTP log of `nix-casync`) that many nar files took multiple minutes to ingest. It would probably be interesting to run a performance test under more controlled circumstances, especially against a large existing chunk store.
This was the data set I tested with (all “real-world” `nar` files, a subset of the nixbuild.net store):
| | |
|---|---|
| number of nar files | 671652 |
| total nar file size (MB) | 8012630 |
After ingesting this into `nix-casync`, I ended up with `34219994` chunks in the chunk store. I then summed up the sizes of all zstd-compressed chunks, and also the uncompressed sizes of all chunks, to get a feeling for how much deduplication vs zstd compression matters.
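For anyone who wants to reproduce the sums, here is a rough sketch of that step. It assumes the chunk store is a plain directory tree of individually zstd-compressed chunk files; the exact layout and naming of the `nix-casync` chunk store are assumptions here, not something this snippet verifies.

```python
import os
import sys
import zstandard  # pip install zstandard

# Rough sketch: sum the on-disk (compressed) sizes and the decompressed sizes
# of all chunk files under sys.argv[1], to separate deduplication savings
# from zstd compression savings.
# Assumption: every file in the tree is a single zstd-compressed chunk.
compressed = uncompressed = 0
dctx = zstandard.ZstdDecompressor()
for root, _dirs, files in os.walk(sys.argv[1]):
    for name in files:
        with open(os.path.join(root, name), "rb") as f:
            data = f.read()
        compressed += len(data)
        # Stream-decompress so frames without an embedded content size work too.
        uncompressed += len(dctx.decompressobj().decompress(data))

print(f"compressed chunks:   {compressed / 1e6:.0f} MB")
print(f"uncompressed chunks: {uncompressed / 1e6:.0f} MB")
```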
| | size in MB | compression ratio |
|---|---|---|
| zfs+lz4 | 3816761 | 2.10 |
| zfs+zstd | 2978241 | 2.69 |
| nix-casync (uncompressed) | 2487089 | 3.22 |
| nix-casync (zstd) | 1223264 | 6.55 |
As you can see, `nix-casync` managed to achieve an impressive compression ratio of 6.55!
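The ratios above are simply the total nar file size divided by the stored size; a quick check of that, using only the numbers from the tables in this post:

```python
# Sanity check of the ratios above: total nar size divided by stored size
# (all sizes in MB, copied from the tables in this post).
total_nar_mb = 8012630

stored_mb = {
    "zfs+lz4": 3816761,
    "zfs+zstd": 2978241,
    "nix-casync (uncompressed)": 2487089,
    "nix-casync (zstd)": 1223264,
}

for name, size in stored_mb.items():
    print(f"{name}: {total_nar_mb / size:.2f}")
# -> 2.10, 2.69, 3.22, 6.55
```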