I don’t expect the additional S3 expenses would be significant if the additional archives were removed after a short-ish period of not being needed (say, one month), but I’m not sure whether someone can easily produce infra code that does this kind of deletion.
Checking that a path is still alive seems difficult in this model, as being referenced from a different path has AFAIK been a no-op. In particular, a fixed-output derivation might remain alive for much longer than the chosen period, and on stable branches we may not do a full rebuild every month. Then again, expiring one month after upload could be considered a good enough approximation (and hopefully not hard to implement), given that we’ll have the xz fallback anyway.
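S3 lifecycle rules can only expire objects by age since upload, not by last access, which is why the “one month after upload” approximation matters here. A minimal sketch, assuming the zstd copies live under a dedicated prefix (the bucket name and prefix below are made up, not the actual cache.nixos.org layout):

# Hypothetical: delete zstd NARs 30 days after upload via an S3 lifecycle rule.
aws s3api put-bucket-lifecycle-configuration \
  --bucket nix-cache \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "expire-zstd-nars",
      "Filter": { "Prefix": "nar-zstd/" },
      "Status": "Enabled",
      "Expiration": { "Days": 30 }
    }]
  }'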
It could be helpful to collect some statistics on compression used in substitutions. It’d be great to get insight into how many substitutions still happen via Nix client versions which don’t request zstd and how many substitutions would have liked zstd but it wasn’t available (and why).
FODs are an interesting point. On the one hand, we could simply offer them as xz-only and that’d probably be fine. That could even prove beneficial: for our use-cases they’re mostly source archives, where xz will likely achieve greater compression and decompression is more often bottlenecked by drive speed anyway.
OTOH we could also offer them as zstd-only and let users without zstd support fetch the FODs themselves. In the case of FODs, our cache isn’t really much of a cache but rather a mirror.
Side note on the OP’s point about large downloads: I expect there’s still some lower-hanging fruit in closure reductions, i.e. dependencies or files that are rarely or even never used. It just seems that most contributors don’t mind these costs too much.
Nix does not support multiple compression methods per .narinfo file. So we cannot offer store paths using both xz and zstd compression.
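For reference, a .narinfo carries exactly one URL/Compression pair, roughly like this (the values are illustrative, not a real cache entry):

StorePath: /nix/store/<hash>-hello-2.12.1
URL: nar/<filehash>.nar.xz
Compression: xz
FileHash: sha256:<hash of the compressed NAR>
FileSize: 50264
NarHash: sha256:<hash of the uncompressed NAR>
NarSize: 226560
References: <hash>-glibc-2.38 <hash>-hello-2.12.1
Sig: cache.nixos.org-1:<signature>

So offering a second compression method would require either a second narinfo per store path or new fields.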
BTW, if we really care about download speeds, then the real focus should be on closure size optimisation. My desktop NixOS 18.09 closure was 4.6 GB; the mostly equivalent 22.11 configuration is 13.7 GB, including 5 versions of ffmpeg, 2 versions of qtwebengine, a gdb that has ballooned to 672 MB, something named mbrola that takes up 676 MB, and 121 -dev outputs.
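For anyone who wants to hunt for such offenders in their own closure, a rough starting point (assuming NixOS and a Nix with the nix-command feature enabled; the target path is just an example):

# List the 20 store paths with the largest closure sizes (in bytes) in the current system closure.
nix path-info -rS /run/current-system | sort -nk2 | tail -n 20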
Can you add zchunk to the comparison?
Just out of curiosity. One would still have to solve the problem of finding a suitable local reference to benefit from the chunking.
@wamserma That doesn’t look too useful for our case. We’d probably need something tailored towards Nix’s use-cases for chunking. My CrossOver binary tarball from before came out over 100 MiB larger than with any of the other options, and it took a minute to compress with no parallelism.
@Atemu I had no idea how it would perform, hence I asked for a test run. Nix- (or rather nar-)specific chunking has been discussed a few times, e.g. in the Attic thread.
Nix does not support multiple compression methods per .narinfo file. So we cannot offer store paths using both xz and zstd compression.
So it’s not possible to introduce this narinfo extension as a non-breaking change, is it?
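Purely as a thought experiment (no such fields exist in Nix today, and whether old clients would tolerate unknown keys is exactly the open question), a multi-compression narinfo might look like:

URL: nar/<filehash>.nar.xz
Compression: xz
URL-zstd: nar/<filehash-zstd>.nar.zst
Compression-zstd: zstd

Again, the -zstd fields are hypothetical; as quoted above, current Nix only understands a single URL/Compression pair per narinfo.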
I have another issue which may or may not be related:
nix flake update --commit-lock-file
given that nixpkgs is ± the only flake input, it takes at least 1 min (MBA M1, macOS) after the download progress has stopped, which feels like forever.
not sure if nix-channel --update is any faster, but that’s irrelevant for me as I’ve gone flakes-only due to #10247
❯ date
Wed Jul 3 10:01:14 CEST 2024
❯ nix-shell -v -p llvmPackages.stdenv
...
downloading 'https://cache.nixos.org/nar/0zk0synf3wgzb1grjfvwyj37kd3qp1wnqhvhq9xjch8dyvc0b1rk.nar.xz'...
suggests cache.nixos.org does not yet support ZSTD compression.
This is understandable, as I believe it may well not be a priority for you, @domenkozar.
However, priorities aside, it’s not clear to me whether consensus on introducing ZSTD for caches has been reached at all.
Can someone post an update about this, please?
A bit off topic: the reason I’m asking is that my M1 Air (2020) still feels snappy in most respects, and even though it’s 2024 I’m not going to ditch it out of the blue; however, Nix currently gives it a really hard time.
I mean, it feels painfully slow.
That’s said with all due respect to the maintainers: the enormous ticket pressure is evident, so for me this isn’t about the reasons behind it; it just seems reasonable to hope that ZSTD could bring significant relief.
In the end, if there are reasons not to go with it yet, I could hack together a small local cache instance with ZSTD enabled, and that’s a way to go, but it might be wasted effort if the feature is coming officially soon.
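For reference, a minimal sketch of that local-cache hack, assuming a recent Nix with zstd support (the path and package are just examples):

# Populate a local file-based binary cache with zstd-compressed NARs.
nix copy --to 'file:///tmp/zstd-cache?compression=zstd' nixpkgs#hello
# The generated .narinfo files should then report "Compression: zstd";
# point substituters at file:///tmp/zstd-cache (and relax signature checks or sign the paths) to test it.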
I haven’t seen XZ decompression be a bottleneck on any system.
Most of the cache.nixos.org slowness I’ve seen is the 1 MB/s bottleneck between Fastly and S3. Compressing the NixOS cache with zstd instead of xz wouldn’t really help with that.