The Nix store is written maybe 99% of the time from the binary cache, and my internet vs. disk speeds differ by over an order of magnitude in favor of the disk. Unless deduplication slows writes down by more than 10x, it’s going to be helpful for me.
Dedup is almost never worth it, but the Nix store is significantly different from most other use cases.
I would, however, recommend using compression. I get compressratio values around 1.6 to 2, which is definitely nice for users on spinning disks, as it essentially doubles your I/O bandwidth.
sudo zfs get all <MYPOOLNAME> | grep compressratio
sudo zpool get all <MYPOOLNAME> | grep dedupratio
I don’t know how to check ZFS RAM usage, but would you happen to know how it changed after enabling dedup? (I got the sense from skimming the linked article that a rule of thumb is roughly 1/200 of the used size of the dataset.)
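For a ballpark figure: OpenZFS material often cites something on the order of 320 bytes of in-core dedup table (DDT) per unique block, though the exact figure varies by version. A back-of-the-envelope sketch, where the store size, entry size, and average block size are all assumptions:

```shell
# Rough DDT RAM estimate. Assumptions: ~320 bytes per in-core entry,
# 128 KiB average block size (the default recordsize). A real /nix/store
# has many small files, so the true entry count (and RAM) is higher.
data_bytes=$((46 * 1024 * 1024 * 1024))   # e.g. a 46 GB store
block_bytes=$((128 * 1024))
entries=$((data_bytes / block_bytes))
ddt_mib=$((entries * 320 / 1024 / 1024))
echo "unique blocks: $entries, DDT: ~${ddt_mib} MiB"
```

With smaller average blocks the estimate scales up proportionally, which is where rules of thumb like 1/200 of the dataset size come from.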
I just migrated my old installation to a new one on ZFS.
Old: ext4. New: a ZFS pool with dedup=on and compression=on.
old:
du -sh /nix/store : 46GB
new (after copying only /nix/store, not the actual installation):
➤ sudo zfs get all tank/nix | grep compress
tank/nix  compressratio     1.84x  -
tank/nix  compression       on     local
tank/nix  refcompressratio  1.84x  -
➤ sudo zpool get all tank | grep dedup
tank  dedupditto  0      default
tank  dedupratio  1.70x  -
➤ zpool list tank
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
tank   113G  17.1G  95.9G        -         -   10%  15%  1.69x  ONLINE        -
Dedup is clearly useful here, but only if you don’t use store optimisation; if you do use store optimisation, dedup wins you nothing. That said, my computer was unusable for 3 hours just to copy 46 GB. It’s an old machine, but I regularly get around 100 MB/s copying on this HDD. Since /nix/store is read-heavy, I guess I won’t mind the occasional slowdown …
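For scale, the numbers above work out to roughly:

```shell
# 46 GB copied in 3 hours, vs. the ~100 MB/s this disk normally manages
secs=$((3 * 3600))
mib=$((46 * 1024))
echo "effective: $((mib / secs)) MB/s"   # integer MB/s
echo "slowdown: ~$((100 * secs / mib))x"
```

i.e. an effective throughput in the single-digit MB/s range, a slowdown of more than an order of magnitude.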
Note also that copying the store is probably not the right approach, since it means your sqlite db is not rebuilt, which puts every path at risk of garbage collection. I did this only for testing … (see: Rebuild sqlite db from scratch? · Issue #3091 · NixOS/nix · GitHub)
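If you do copy the store, one approach (a sketch; the dump file path is arbitrary) is to transfer the path registration along with it, so the db on the new system knows the copied paths are valid:

```shell
# On the old installation: export the Nix database registration
nix-store --dump-db > /tmp/nix-db.dump

# On the new installation, after copying /nix/store over:
sudo nix-store --load-db < /tmp/nix-db.dump
```

This avoids the everything-is-garbage situation, though whether it fits your migration depends on how closely the two stores match.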
The big issue with dedup is that it uses quite a lot of memory, and the table is scattered randomly across the disk, making it slow to read (and AFAIK the entire table has to be read just to import the pool). It can take hours just to import a large pool made of HDDs.
For the nix store… Eh, for most people I guess it’s small enough to not be such an issue. I wouldn’t count on it being all that much better than auto-optimize though.
ZFS dedup works while writing data: it calculates the checksum of each new block and then checks its table for a block with the same checksum already on disk. If one exists, instead of writing the new block, ZFS just points to the old block in the metadata. This means ZFS needs the full table of all blocks and their checksums in RAM while writing, plus a fast CPU. If the table does not fit into the ZFS ARC, ZFS will happily re-read the missing parts from disk for each write. ZFS dedup is heavily biased towards servers with loads of RAM; enabling it without calculating the required RAM and adjusting the ZFS ARC size accordingly may cause massive performance hits, either instantly or later on, when the block table becomes too large.
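The write path described above can be sketched as a toy model (this is only an illustration of the idea, not the on-disk format; the block contents and table layout are made up):

```shell
# Toy model of dedup-on-write: checksum each block, and if the checksum
# is already in the table, record a reference instead of storing again.
declare -A ddt            # "dedup table": checksum -> stored-block id
stored=0 logical=0
for block in aaaa bbbb aaaa aaaa cccc; do
  sum=$(printf '%s' "$block" | sha256sum | cut -d' ' -f1)
  if [[ -z ${ddt[$sum]:-} ]]; then
    ddt[$sum]=$stored     # unseen checksum: a new block is written
    stored=$((stored + 1))
  fi                      # seen checksum: only a block pointer is written
  logical=$((logical + 1))
done
echo "logical: $logical stored: $stored"
```

Every write requires a table lookup, which is why the whole table wants to live in RAM: a lookup that misses the ARC turns each write into an extra random read.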
Due to the way ZFS ARC size is set, you may not even see a memory increase, as ZFS happily uses ~50% RAM if it would otherwise be free for its own cache. But that cache may shrink due to the size of the block table it needs for dedup, while ARC size remains the same.
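To actually look at the numbers (Linux paths assumed; "tank" is an example pool name):

```shell
# Current ARC size in bytes is the "size" row of arcstats:
awk '$1 == "size" {printf "ARC size: %.2f GiB\n", $3 / 2^30}' \
    /proc/spl/kstat/zfs/arcstats

# Dedup table statistics for a pool, including entry count and
# on-disk vs. in-core size per entry:
sudo zpool status -D tank
```

The `zpool status -D` summary line is what lets you check whether the DDT still fits comfortably inside the ARC.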
ZFS dedup is completely transparent during read, as it’s just a block pointer. If that block is used multiple times, it simply does not matter. For that reason it also should not affect pool import times.
The good thing is that you can just disable it at any time (it only applies to new writes; already-deduplicated blocks stay deduplicated until they are rewritten).
Careful with zfs set dedup=off and zfs set atime=off: one of those two commands very likely destroyed my ZFS partition (I was on a root pool and issued them after creation and installation).
Also, the link I gave in my first post touches on RAM usage; it depends on your data size and it’s not that bad …
EDITED: Sorry my original post was unclear.
EDIT2: I meant running these commands afterwards. I otherwise have great experience with a dedup=off pool which started as such from the beginning.
I’ve been using atime=off and haven’t had any issues. Essentially it just avoids an extra write when accessing files, and since the Nix store doesn’t care about access times, it seems like a good fit.
This is with auto-optimise, I believe; at least I had it on, and I’m not aware of a way to disable it.
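For reference, the option can be set explicitly in configuration.nix (this is the option name used elsewhere in the thread; newer releases spell it nix.settings.auto-optimise-store):

```nix
{
  # false is the default; setting it explicitly makes sure Nix's
  # hard-linking of identical store files is off when relying on ZFS dedup
  nix.autoOptimiseStore = false;
}
```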
One thing to note is that my server now floats around a baseline of 100–180 GB of ZFS ARC + dedup table (out of 256 GB). However, I haven’t really suffered memory pressure, so it hasn’t affected performance too much.
Can’t believe I’m telling you … but what does nixos-option nix.autoOptimiseStore return?
I ask because 1.79 with auto-optimise on is very suspicious … There’s nothing ZFS should be able to deduplicate beyond what /nix/store optimisation already does (ZFS does block-level dedup, but saying that 79 out of every 179 blocks are duplicates while no whole files are doesn’t seem right to me).
Can you keep an eye out during your next garbage collection and see what savings it reports?
After botching up my previous ZFS install, I reinstalled NixOS on ZFS, and this time (by accident) I had both ZFS’s dedup and nix.autoOptimiseStore on. Amazingly, ZFS still managed a 1.35x dedup ratio on top of that. Substantial, but I’m not sure I’d want to pay the RAM cost.
Compression ratio went up to 1.95 (maybe just a difference in the data …?)