I encountered 2 issues with `pixz`, one about compression ratio, and one about using all available cores:
Does anybody know how I can get `pixz`’s compression side to use all cores? On my 6-core/12-thread machine, it uses only 400% CPU for the `chromium` store path, which drops to 300% and then to 200% over the course of the compression. The average usage according to `time` is then 270%. Passing `-p6` or `-p12` doesn’t seem to change anything about that.
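As a sanity check on that average, here is a minimal sketch of how the CPU percentage can be recomputed from the timings, assuming (as `time` normally does) that it is (user + system) divided by elapsed wall-clock time. The helper names `parse_elapsed` and `cpu_percent` are made up for illustration; the numbers are the `pixz -9` timings from the table.

```python
def parse_elapsed(text: str) -> float:
    """Convert an elapsed time such as '1:20.57' or '25.47' to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def cpu_percent(user_s: float, system_s: float, elapsed: str) -> int:
    """Average CPU utilisation in percent: (user + system) / wall-clock time.

    Assumption: the result is truncated to an integer rather than rounded.
    """
    return int(100 * (user_s + system_s) / parse_elapsed(elapsed))


# pixz -9 on the chromium tar: 216.98 s user, 1.71 s system, 1:20.57 elapsed
print(cpu_percent(216.98, 1.71, "1:20.57"))  # 271
```

On a 12-thread machine, full utilisation would show up as a value close to 1200%, so 271% corresponds to fewer than 3 busy threads on average.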
I have now added benchmarks of `pixz 1.0.7` to the table (including the above-mentioned problem).
I have also added `maxres` outputs from `command time`, showing the maximum RAM in MB needed for compression and decompression:
chromium: `tar c /nix/store/620lqprbzy4pgd2x4zkg7n19rfd59ap7-chromium-unwrapped-108.0.5359.98`, uncompressed 473Mi
Compression (note `--ultra` is given to enable zstd levels > 19), zstd v1.5.0, XZ utils v5.2.5:

| command | size | compr. user (s) | compr. system (s) | compr. CPU | compr. total (m:s) | compr. per-core throughput | compr. total throughput | compr. maxres | decompr. total (s) | decompr. throughput | decompr. maxres | comments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| `xz -9` | 102M | 216.34 | 0.54s | 99% | 3:37.07 | 2.29 MB/s | 2.29 MB/s | 691 MB | 6.227 | 79 MB/s | 67 MB | |
| `pixz -9` | 137M | 216.98 | 1.71s | 271% | 1:20.57 | 2.28 MB/s | 6.19 MB/s | 2951 MB | 2.551 | 194 MB/s | 657 MB | did not use all cores consistently, for both compression and decompression |
| `zstd -19` | 113M | 176.42 | 0.56s | 100% | 2:56.66 | 2.81 MB/s | 2.81 MB/s | 241 MB | 0.624 | 794 MB/s | 10 MB | |
| `zstd -19 --long` | 111M | 200.84 | 0.52s | 100% | 3:21.07 | 2.46 MB/s | 2.46 MB/s | 454 MB | 0.686 | 722 MB/s | 133 MB | |
| `zstd -22` | 108M | 210.77 | 0.74s | 100% | 3:31.44 | 2.35 MB/s | 2.35 MB/s | 1263 MB | 0.716 | 692 MB/s | 133 MB | |
| `zstd -22 --long` | 108M | 214.96 | 0.64s | 100% | 3:35.53 | 2.30 MB/s | 2.30 MB/s | 1263 MB | 0.716 | 692 MB/s | 133 MB | bit-identical to above for this input |
| `pzstd -19` | 114M | 270.05 | 1.20s | 1064% | 25.47 | 1.83 MB/s | 19.83 MB/s | 1641 MB | 0.244 | 2032 MB/s | 564 MB | |
| `pzstd -22` | 108M | 224.17 | 0.66s | 100% | 3:44.80 | 2.21 MB/s | 2.21 MB/s | 1392 MB | 0.721 | 687 MB/s | 245 MB | single-threaded comp/decomp! |
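For readers who want to check the throughput columns, here is a rough sketch of how they can be recomputed. It rests on two assumptions of mine that the table does not spell out: that per-core throughput is the uncompressed size divided by user CPU time, and that total throughput is the uncompressed size divided by wall-clock time, with 473Mi taken as MiB and converted to MB. The function name `throughput_mb_s` is hypothetical.

```python
MIB = 1024 * 1024
UNCOMPRESSED_MIB = 473  # size of the uncompressed chromium tar


def throughput_mb_s(size_mib: float, seconds: float) -> float:
    """Throughput in MB/s (1 MB = 10^6 bytes) for a size given in MiB."""
    return size_mib * MIB / 1e6 / seconds


# (user CPU seconds, wall-clock seconds) for compression, taken from the table
rows = {
    "xz -9": (216.34, 217.07),
    "pixz -9": (216.98, 80.57),
    "pzstd -19": (270.05, 25.47),
}

for name, (user_s, wall_s) in rows.items():
    per_core = throughput_mb_s(UNCOMPRESSED_MIB, user_s)
    total = throughput_mb_s(UNCOMPRESSED_MIB, wall_s)
    print(f"{name}: {per_core:.2f} MB/s per core, {total:.2f} MB/s total")
```

Under these assumptions the results land within about 2% of the figures in the table, so small differences are expected.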
Oddly, `pixz` produces a much worse compression ratio than any of the other approaches.
`xz`/`pixz` need 5x more RAM for decompression.
`pzstd` needs disproportionately much RAM for decompression. I suspect this is because with the invocation `pzstd -d ... > /dev/null`, outputs from the various threads need to be buffered in memory to write them in order into the output pipe.
However, `pzstd` does this even when writing outputs to a regular file with `-o`.
I also checked how much RAM plain `zstd` needs to decompress the `pzstd` outputs; there is no difference compared to decompressing the `zstd` outputs.
For the chromium derivation in my table above, single-threaded `zstd` has 10x higher decompression throughput than single-threaded `xz`, and for `pzstd` vs `pixz` it’s also 10x.
Summary of my findings so far
On this `chromium` tar:
- single-threaded:
  - `zstd -19` wins against `xz -9` on decompression speed (10x) and decompression memory usage (5x)
  - `zstd -22` still wins against `xz -9` on decompression speed (same 10x) but loses on decompression memory usage (0.5x)
  - `xz -9` wins on compression ratio: by 10% against `zstd -19` and 5% against `zstd -22`
- multi-threaded:
  - `pixz -9` loses against `pzstd -19` on all metrics
  - decompression memory usage can apparently be reduced 10x by decompressing 6 `zstd -19` archives independently, rather than using `pzstd -d` to decompress 1 `zstd -19` archive. That seems wrong, at least when writing to regular files. I filed a zstd issue.
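The factors in this summary are rounded by eye, so here is a small sketch that recomputes them from the table for anyone who wants to double-check. The dictionary layout and the `factor` helper are my own illustration; the numbers are copied from the table (sizes in MB, throughput in MB/s, memory in MB).

```python
# Compressed size, decompression throughput and decompression maxres per tool.
TABLE = {
    "xz -9":     {"size": 102, "speed": 79,   "mem": 67},
    "pixz -9":   {"size": 137, "speed": 194,  "mem": 657},
    "zstd -19":  {"size": 113, "speed": 794,  "mem": 10},
    "zstd -22":  {"size": 108, "speed": 692,  "mem": 133},
    "pzstd -19": {"size": 114, "speed": 2032, "mem": 564},
}


def factor(metric: str, numerator: str, denominator: str) -> float:
    """Ratio of one tool's value to another's for the given metric."""
    return TABLE[numerator][metric] / TABLE[denominator][metric]


print(f"zstd -19 vs xz -9, decompression speed: {factor('speed', 'zstd -19', 'xz -9'):.1f}x")
print(f"zstd -22 vs xz -9, decompression speed: {factor('speed', 'zstd -22', 'xz -9'):.1f}x")
print(f"xz -9 vs zstd -19, decompression memory: {factor('mem', 'xz -9', 'zstd -19'):.1f}x")
print(f"xz -9 vs zstd -22, decompression memory: {factor('mem', 'xz -9', 'zstd -22'):.1f}x")
print(f"zstd -19 output larger than xz -9 by: {factor('size', 'zstd -19', 'xz -9') - 1:.1%}")
print(f"zstd -22 output larger than xz -9 by: {factor('size', 'zstd -22', 'xz -9') - 1:.1%}")
print(f"pzstd -19 vs pixz -9, decompression speed: {factor('speed', 'pzstd -19', 'pixz -9'):.1f}x")

# One pzstd -d process versus 6 independent zstd -d processes (10 MB each).
six_zstd_mem = 6 * TABLE["zstd -19"]["mem"]
print(f"pzstd -d vs 6x zstd -d, memory: {TABLE['pzstd -19']['mem'] / six_zstd_mem:.1f}x")
```

The recomputed values are in the same ballpark as the rounded ones in the list, although a few (for example the `zstd -22` speed factor and the `zstd -19` memory factor) differ somewhat from the quoted 10x and 5x.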
Please point out if you spot any mistakes!