ZFS dedup on /nix/store -- is it worth it?

> although it sounds like it still writes out files (during build) and dedups later

I doubt that this is a significant cost at all. Files that aren't dedup'd this way are simply moved to the .links directory, and files that are dedup'd are deleted immediately -- so they likely never even made it to disk, since they probably weren't written with synchronous writes.
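For context, the hard-link deduplication that `nix-store --optimise` performs can be sketched roughly like this (a hypothetical, simplified Python version for illustration only; the real implementation is in C++, uses Nix's own content hashing, and keeps `.links` inside the store):

```python
# Simplified sketch of hard-link dedup a la `nix-store --optimise`:
# files with identical contents are collapsed into one inode, keyed by
# a content hash stored in a separate links directory.
import hashlib
import os


def optimise(store: str, links: str) -> None:
    """Replace duplicate regular files under `store` with hard links
    into `links`, keyed by the SHA-256 of their contents."""
    os.makedirs(links, exist_ok=True)
    for root, _dirs, files in os.walk(store):
        # Don't descend into the links directory itself.
        if os.path.abspath(root).startswith(os.path.abspath(links)):
            continue
        for name in files:
            path = os.path.join(root, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            link = os.path.join(links, digest)
            if not os.path.exists(link):
                # First file with this content: register it.
                os.link(path, link)
            elif not os.path.samefile(path, link):
                # Duplicate: delete it and swap in a hard link.
                os.remove(path)
                os.link(link, path)
```

Note that the duplicate is unlinked right after being hashed, which is why a file that never needed to survive (and wasn't fsync'd) may never touch the disk at all.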

Also, I ran a fio random-read test (not testing writes here; I want to know the impact on read performance from the cryptographic hashing). I sized the test so it wouldn't fit entirely in ARC on my machine, and I didn't even enable dedup, because I want to show that the cost of the hashing alone is enormous.

```
fio --name=random-read --ioengine=posixaio --rw=randread --bs=1m --size=48g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
```

It’s about half as fast with checksum=sha256 on my Threadripper 1950X + Samsung 960 Pro + 64G of memory (32G ARC limit, so even this test still benefited tremendously from ARC), and that’s without the extra overhead of dedup: the checksumming alone carries a significant performance cost. I bet this would have a fairly large effect on boot times, when the ARC is completely cold.
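To see why SHA-256 can halve read throughput, it helps to measure how fast the hash itself runs, since ZFS verifies the checksum of every block it reads from disk. Here's a rough user-space illustration using Python's hashlib (not ZFS's kernel implementation; absolute numbers depend heavily on the CPU and whether SHA hardware extensions are in use):

```python
# Rough illustration of raw SHA-256 throughput: if the hash runs slower
# than the NVMe device can deliver data, checksum verification becomes
# the bottleneck on (ARC-cold) reads.
import hashlib
import time

buf = b"\0" * (256 * 1024 * 1024)  # 256 MiB of data to hash

start = time.perf_counter()
hashlib.sha256(buf).hexdigest()
elapsed = time.perf_counter() - start

print(f"SHA-256 throughput: {len(buf) / elapsed / 1e6:.0f} MB/s")
```

If the printed rate comes out well below what a 960 Pro can sustain on sequential-ish 1M reads, that's consistent with the roughly 2x slowdown above; fletcher4 (the default checksum) is far cheaper.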