Switch cache.nixos.org to ZSTD to fix slow NixOS updates / nix downloads?

I don’t expect that additional S3 expenses would be significant if the additional archives were removed after a short-ish period of not being needed (like one month), but I’m not sure if someone can easily produce infra code that does deletion like this.

Checking that a path is still alive in this model seems difficult, as being referenced from a different path has been a no-op AFAIK. In particular, fixed-output derivations might remain alive for much longer than the chosen period, and on stable branches we may not do a full rebuild every month. Well, maybe one month after upload could be considered a good-enough approximation (and hopefully not hard to implement?), given that we’ll have the xz fallback anyway :man_shrugging:
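For illustration, the “delete roughly a month after upload” policy maps pretty directly onto an S3 lifecycle rule; a minimal sketch with the AWS CLI, assuming the zstd archives live under a dedicated nar-zstd/ prefix and the bucket is named nix-cache (both assumptions on my part):

# expire zstd NARs 30 days after upload; prefix and bucket name are assumptions
❯ aws s3api put-bucket-lifecycle-configuration \
    --bucket nix-cache \
    --lifecycle-configuration '{
      "Rules": [{
        "ID": "expire-zstd-nars",
        "Filter": {"Prefix": "nar-zstd/"},
        "Status": "Enabled",
        "Expiration": {"Days": 30}
      }]
    }'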

1 Like

It could be helpful to collect some statistics on the compression used in substitutions. It’d be great to get insight into how many substitutions still happen via Nix client versions that don’t request zstd, and how many substitutions would have preferred zstd but couldn’t get it (and why).
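A rough way to get at the first number, assuming one has access logs with the User-Agent field (IIRC Nix includes a Nix/&lt;version&gt; token in its requests):

# tally requests per Nix client version from a (hypothetical) access log
❯ grep -o 'Nix/[0-9][0-9.]*' access.log | sort | uniq -c | sort -rn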

FODs are an interesting point. On the one hand, we could simply offer them as xz-only, and that’d probably be fine. It could even prove beneficial: for our use-cases they’re source-code archives most of the time, where xz will likely achieve greater compression and decompression is more often bottlenecked by drive speed.
OTOH, we could also offer them as zstd-only and let users without zstd support fetch the FODs themselves. In the case of FODs, our cache isn’t really much of a cache but rather a mirror.

Side note about the OP’s point on large downloads: I expect there’s still some lower-hanging fruit in closure reductions, i.e. dependencies or files that are rarely or even never used. It just seems that most contributors don’t mind these costs too much.

3 Likes

Nix does not support multiple compression methods per .narinfo file. So we cannot offer store paths using both xz and zstd compression.
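For context, each .narinfo carries exactly one URL/Compression pair, roughly like this (all values are placeholders):

❯ curl -s https://cache.nixos.org/&lt;storePathHash&gt;.narinfo
StorePath: /nix/store/&lt;storePathHash&gt;-hello-2.12.1
URL: nar/&lt;fileHash&gt;.nar.xz
Compression: xz
FileHash: sha256:&lt;fileHash&gt;
FileSize: &lt;bytes&gt;
NarHash: sha256:&lt;narHash&gt;
NarSize: &lt;bytes&gt;
References: &lt;...&gt;
Sig: cache.nixos.org-1:&lt;...&gt;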

BTW, if we really care about download speeds, then the real focus should be on closure size optimisation. My desktop NixOS 18.09 closure was 4.6 GB; the mostly equivalent 22.11 configuration is 13.7 GB, including 5 versions of ffmpeg, 2 versions of qtwebengine, a gdb that has ballooned to 672 MB, something named mbrola that takes up 676 MB, and 121 -dev outputs.
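For anyone who wants to hunt such offenders down, nix path-info can rank a closure by size; a quick sketch (the system path is just an example):

# the 20 biggest closures inside the current system closure, sizes in bytes
❯ nix path-info -rS /run/current-system | sort -nk2 | tail -20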

21 Likes

Can you add zchunk to the comparison?
Just out of curiosity. One would still have to solve the problem of finding a suitable local reference to benefit from the chunking.

@wamserma That doesn’t look too useful for our case. We’d probably need something tailored to Nix’ use-cases if we go for chunking. My CrossOver binary tarball from earlier came out over 100 MiB larger than with any of the other options, and it took a minute to compress with no parallelism.

@Atemu I had no idea how it would perform, hence I asked for a test run. Nix- (or rather NAR-)specific chunking has been discussed a few times, e.g. in the Attic thread.

btw: mbrola is a TTS package and is probably pulled in due to this: okular pulls in mbrola worth > 600 mb · Issue #207204 · NixOS/nixpkgs · GitHub

1 Like

zstd is now backported: [2.3-maintenance] libutil: add ZstdDecompressionSink by edef1c · Pull Request #9221 · NixOS/nix · GitHub thanks to @edef !

8 Likes

I opened Tag 2.3.17 from `2.3-maintenance` branch · Issue #9244 · NixOS/nix · GitHub so that this can ideally trickle into a new version number, allowing (smart) HTTP caches to detect whether zstd support is available.
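(The idea being that a front-end cache could key on the client’s User-Agent, which IIRC carries the Nix version; something along these lines, with the exact User-Agent format being my assumption:)

# a client announcing a zstd-capable version; the UA format is from memory
❯ curl -sI -A 'curl/8.4.0 Nix/2.3.17' https://cache.nixos.org/nix-cache-info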

2 Likes

Nix does not support multiple compression methods per .narinfo file. So we cannot offer store paths using both xz and zstd compression.

So it’s not possible to introduce this narinfo extension as a non-breaking change, is it?

I have another issue which may or may not be related:

nix flake update --commit-lock-file

Given that nixpkgs is more or less the only flake input, it takes at least one minute (MacBook Air M1, macOS) after the download progress has stopped, which feels a bit like forever.

I’m not sure if nix-channel --update is any faster, but that’s irrelevant for me as I’ve gone flakes-only due to #10247.
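(For reference, a crude way to confirm the wall-clock cost, nothing fancy:)

❯ time nix flake update --commit-lock-file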

Hello,

Looking at Support zstd compression for binary caches · Issue #2255 · NixOS/nix · GitHub, it seems it was closed in favour of Use libarchive for all decompression (except brotli) by yorickvP · Pull Request #3333 · NixOS/nix · GitHub, which is fine, as I guess that introduced zstd support via libarchive.

❯ date
Wed Jul  3 10:01:14 CEST 2024

❯ nix-shell -v -p llvmPackages.stdenv
...
downloading 'https://cache.nixos.org/nar/0zk0synf3wgzb1grjfvwyj37kd3qp1wnqhvhq9xjch8dyvc0b1rk.nar.xz'...

suggests cache.nixos.org does not yet support ZSTD compression.
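A more direct check than watching the download URLs, in case it’s useful to anyone: fetch any .narinfo and look at its Compression field (the hash placeholder needs a real store path hash):

❯ curl -s https://cache.nixos.org/&lt;storePathHash&gt;.narinfo | grep Compression
Compression: xz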

This is understandable, as I believe it may very well not be a priority for you, @domenkozar.

However, priorities aside, it’s not clear to me whether consensus on introducing ZSTD for the caches has been reached at all.

Can someone post an update about this, please?

A bit off topic: the reason I’m asking is that my M1 Air (2020) still feels snappy in many respects, and despite it being 2024 I’m not going to ditch it out of the blue; however, Nix currently gives it a really hard time.

I mean, it feels painfully slow.

That’s said with respect for the maintainers: the enormous ticket pressure is evident, so for me this isn’t about the reasons behind it, but hoping that ZSTD can bring significant relief seems reasonable.

In the end, if there are reasons not to go with it yet, I could hack up a small cache instance locally with ZSTD enabled, and that’s a viable way to go, but it might be wasted effort if the feature is coming soon, officially.
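For the record, such a local zstd cache seems doable with stock Nix; a sketch, assuming Nix ≥ 2.4, where zstd is accepted as a binary-cache compression method:

# populate a local file:// binary cache with zstd-compressed NARs
❯ nix copy --to 'file:///var/cache/nix-zstd?compression=zstd' nixpkgs#hello

# consume it; the cache is unsigned, hence --no-require-sigs here
❯ nix shell nixpkgs#hello --substituters 'file:///var/cache/nix-zstd' --no-require-sigs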

Thank you,
Peter

3 Likes

Right, so far all uploads to cache.nixos.org are .xz.

But is it planned already, or are there still some gotchas?

No real plans so far, I believe. (And addressing the amount of S3 storage most likely comes first.)

I haven’t seen XZ decompression be a bottleneck on any system.
Most of the cache.nixos.org slowness I’ve seen is the 1 MB/s bottleneck between Fastly and S3. Compressing the NixOS cache with zstd instead of xz wouldn’t really help with that.
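If anyone wants to check the decompression side in isolation, a quick local comparison (file names are placeholders; assumes the xz and zstd CLIs are installed):

❯ time xz -dc some.nar.xz > /dev/null
❯ time zstd -dc some.nar.zst > /dev/null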

2 Likes

That’s interesting! Have you debugged that yourself (if so, how?), or is it a documented limitation of fastly/S3?

I haven’t seen XZ decompression be a bottleneck on any system.

My opening post shows exactly that 100x bottleneck:

despite only pushing ~12 MB/s, the nix process consumes around 100% CPU (varying between 70% and 130% in htop).

If the network pipe is now additionally slow, that’s also bad and should be fixed, but it doesn’t make the already-demonstrated issue disappear.

In my test today on 10 Gbit/s,

wget -O /dev/null https://cache.nixos.org/nar/1w0nk8lhf3vbna1cl07qs835f13xj7w60mx7ny3xx3rxk6waxk1r.nar.xz

runs at 5 MB/s. That’s faster than 1 MB/s but still atrocious.
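To tell Fastly and the origin apart, one could fetch the same NAR from the S3 bucket directly; that the bucket is reachable as nix-cache.s3.amazonaws.com is my assumption about the setup:

❯ wget -O /dev/null https://nix-cache.s3.amazonaws.com/nar/1w0nk8lhf3vbna1cl07qs835f13xj7w60mx7ny3xx3rxk6waxk1r.nar.xz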

1 Like

I am getting more and more convinced that NixOS should have a community-run cache server at Hetzner. It would

  • Solve the immense S3 storage and transfer cost
  • Provide reliable 1000 MB/s instead of 5 MB/s
  • Hopefully offer zstd packages

5 Likes