NixOS Cache fetching issue

[EDIT from @zimbatm]: I split out this thread

I get the same error since a few weeks. I am running nix 2.2.2, disabled IPc6 and also tried using nix from stable. Nothing works. It seems the issue is still not assigned to anybody.

I am unable to update and install anything on my system. Frankly, this is not a great user experience. Please, somebody fix this.

1 Like

One thing you can do is go to https://www.fastly-debug.com/ and post the output here. The more data that we get, the easier it will be for us to debug the issue.

The timeout seems to happen after a download takes more than 5 minutes.

Client IP Info
IP                 xxx
AS Name            TDDE-ASN1, DE
AS Number          6805
City               munich
Continent          EU
Country            germany
State              BY

Resolver IP Info
IP                 172.253.12.1
AS Name            GOOGLE - Google LLC, US
AS Number          15169
Country Code       US

Server Connection Info
IP                 151.101.0.64
Datacenter         FRA
BW to server       0.79mbps
Congestion Window  15
Next Hop           172.21.100.1
RTT                101.041ms
Delta Retransmits  0
Total Retransmits  32

POP Latency (ms)
LHR   55
LCY   52
CDG   53
MAD   94
MXP   102
FRA   97
HHN   76
AMS   69
BMA   82
HEL   111
CPH   100
OSL   98
JFK   155
HKG   298
ANY   61

Request Info
Time              Thu Jul 25 2019 18:33:48 GMT+0200 (CEST)
Host              www.fastly-debug.com
Accept            text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent        Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0
Accept-Language   en-US,en;q=0.7,de-DE;q=0.3
Accept-Encoding   gzip
X-Forwarded-For   95.114.76.235

@fpletz @globin: do you know anyone else on that AS ?

BW to server       0.79mbps

The bandwidth seems very low. Are you on ADSL?

My wife was watching Netflix. Now it says 4.12mbps.
Yes, we have a shitty connection.

The problem is still there and was the last 4 or 5 weeks. Also tried nixos-rebuild switch --keep-going -j1 over night and at different times during the day.

Today I am getting the resolver ip 172.253.14.4, a BW to server of 2.72mbps and the same error.

What happens if you try to download something manually from the cache?

Eg:

$ curl -vvv -o ghc.tar.gz https://cache.nixos.org/nar/0qwrky7iz1lgaqs03q0zzi9rj20h80966mahaz57s00gpfkdzvnl.nar.xz

I run it two times. Once heavy load on my internet connection and once without any other traffic. Both where successful:

λ curl -vvv -o ghc.tar.gz https://cache.nixos.org/nar/0qwrky7iz1lgaqs03q0zzi9rj20h80966mahaz57s00gpfkdzvnl.
nar.xz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 151.101.2.217:443.
..
* TCP_NODELAY set
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 2a04:4e42::729:443
...
* TCP_NODELAY set
* Immediate connect fail for 2a04:4e42::729: Network is unreachable
*   Trying 2a04:4e42:200::729:443...
* TCP_NODELAY set
* Immediate connect fail for 2a04:4e42:200::729: Network is unreachable
*   Trying 2a04:4e42:400::729:443...
* TCP_NODELAY set
* Immediate connect fail for 2a04:4e42:400::729: Network is unreachable
*   Trying 2a04:4e42:600::729:443...
* TCP_NODELAY set
* Immediate connect fail for 2a04:4e42:600::729: Network is unreachable
* Connected to cache.nixos.org (151.101.2.217) port 443 (#0)
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [102 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [4986 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Fastly, Inc.; CN=v2.shared.global.fastly.net
*  start date: Jun 27 09:55:20 2019 GMT
*  expire date: Feb 26 21:35:27 2020 GMT
*  subjectAltName: host "cache.nixos.org" matched cert's "cache.nixos.org"
*  issuer: C=BE; O=GlobalSign nv-sa; CN=GlobalSign CloudSSL CA - SHA256 - G3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
} [5 bytes data]
* Using Stream ID: 1 (easy handle 0xbb1480)
} [5 bytes data]
> GET /nar/0qwrky7iz1lgaqs03q0zzi9rj20h80966mahaz57s00gpfkdzvnl.nar.xz HTTP/2
> Host: cache.nixos.org
> User-Agent: curl/7.65.0
> Accept: */*
>
{ [5 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
} [5 bytes data]
< HTTP/2 200
< x-amz-id-2: muQjOQq/yqU+rHS7LZxioxqXrqXQyR9+11npBVhYSdxHOyRSgDKpk2EvPbg2lUco348l+uy6b2k=
< x-amz-request-id: 8B1815A47E99D14F
< last-modified: Fri, 26 Jul 2019 01:16:54 GMT
< etag: "e19957a0dae9830f01bcc311506dc09f"
< content-type: application/x-nix-nar
< server: AmazonS3
< accept-ranges: bytes
< age: 31578
< accept-ranges: bytes
< date: Fri, 26 Jul 2019 18:13:16 GMT
< via: 1.1 varnish
< x-served-by: cache-fra19123-FRA
< x-cache: HIT
< x-cache-hits: 0
< x-timer: S1564164796.193652,VS0,VE1
< content-length: 125587680
<
{ [5 bytes data]
100  119M  100  119M    0     0   171k      0  0:11:55  0:11:55 --:--:--  237k
* Connection #0 to host cache.nixos.org left intact

@grahamc , @zimbatm Can you please report this issue to fastly?

@zeec123 if curl is working it must be another issue. Can you run nix ping-store ?

@zimbatm nix ping-store doesn’t output anything. Is this normal?
Please let me know if I can help with anything to resolve this issue.

> nix --version
nix (Nix) 2.2.2

No output is good for nix ping-store. It means that it was able to download the https://cache.nixos.org/nix-cache-info file.

Now for a big download, can you paste the output of nix-store --realize /nix/store/qqqljs45s9a1knwqlbnmvyb0fmiynmnv-google-chrome-75.0.3770.90 ?

Google-chrome is already installed so used the mesa drivers:

nix-store --realize /nix/store/ks7aqyydpp0x9ygx89w71p1kaz0zy2fa-mesa-19.1.1-drivers
these paths will be fetched (77.32 MiB download, 288.49 MiB unpacked):
  /nix/store/ks7aqyydpp0x9ygx89w71p1kaz0zy2fa-mesa-19.1.1-drivers
copying path '/nix/store/ks7aqyydpp0x9ygx89w71p1kaz0zy2fa-mesa-19.1.1-drivers' from 'https://cache.nixos.org'...
warning: unable to download 'https://cache.nixos.org/nar/0k6ds6grjkwvcy27w5sbhdkqsn8ihvw8y4cd6pfr0pzqarcf78gb.nar.xz': HTTP error 200 (curl error: Timeout was reached); retrying in 259 ms
error 9 while decompressing xz file
error: build of '/nix/store/ks7aqyydpp0x9ygx89w71p1kaz0zy2fa-mesa-19.1.1-drivers' failed

@zimbatm isn’t error 9 while decompressing xz file fixed by https://github.com/NixOS/nix/commit/78fa47a7f08a4cb6ee7061bf0bd86a40e1d6dc91

I split out this thread since it was out of topic.

Looking at the commit, it seems to only be fixed in nix master for now. But it’s also a symptom of a deeper issue since it only happens because of a retry. For some reason nix on your system is retrying the download and I don’t know why yet.

Some observations:

  • IPv6 is not working
  • IPv6 is not disabled
  • the curl sample is falling back from IPv6 to IPv4
  • maybe nix is not able to do the same, or is counting as a retry, which triggers the error 9 while decompressing xz file issue.
  • maybe there are some deeper issues with the system disk or RAM

Can you try and disable IPv6 to see if nix is able to realize the mesa path?

What changed when the system stopped working, was it after a system update? What happens if you reboot to the oldest system revision you can find?

I’ll add that sometimes I experience the same timeout issues as well but it’s not as consistent as in the OPs experience - it’s every 1/2 trials or so and if I Ctrl-C and try again it works. I never thought of creating a thread about it since it feel it’s because I run ipfs and transmission along side together which probably consume a lot of bandwidth.

@zimbatm
yes, the system stopped working after a sudo nixos-rebuild switch --upgrade.

disabling ipv6

echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6

nix-store --realize /nix/store/ks7aqyydpp0x9ygx89w71p1kaz0zy2fa-mesa-19.1.1-drivers
these paths will be fetched (77.32 MiB download, 288.49 MiB unpacked):
  /nix/store/ks7aqyydpp0x9ygx89w71p1kaz0zy2fa-mesa-19.1.1-drivers
copying path '/nix/store/ks7aqyydpp0x9ygx89w71p1kaz0zy2fa-mesa-19.1.1-drivers' from 'https://cache.nixos.or
g'...
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
/nix/store/ks7aqyydpp0x9ygx89w71p1kaz0zy2fa-mesa-19.1.1-drivers

This looks good, right? Although I am very sure, that I disabled ipv6 a few days ago.

sudo nixos-rebuild switch --upgrade
unpacking channels...
these derivations will be built:
  /nix/store/vgf431iw3ai7l45nh0zmgma53716v2y4-nixos-rebuild.drv
building '/nix/store/vgf431iw3ai7l45nh0zmgma53716v2y4-nixos-rebuild.drv'...
building Nix...
building the system configuration...
these derivations will be built:
  /nix/store/kf8a2v17f9dbc6s1d0a3fhx32ymp47zm-fish_patched-completion-generator.drv
[...]
  /nix/store/zy65hn7bii072b0jbdcr2aya55wz0k84-libkscreen-5.15.5
  /nix/store/zz40a8626rdy96kigkm7spgslrz0xvmd-SDL-1.2.15
  /nix/store/zzhdxlfmcics12kba6yxnqw9bpymf19x-xfconf-4.12.1
  /nix/store/zzjl2g84ms2kxqx9w21h30hfsrb7i92d-phonon-backend-gstreamer-qt5-4.9.0
copying path '/nix/store/q4dcn2ybm06x0xssjzz4k68gzar8ic54-linux-4.19.62' from 'https://cache.nixos.org'...
copying path '/nix/store/rrww1m72206fmskwfa33v0g3827bavl9-breeze-icons-5.58.0' from 'https://cache.nixos.or
g'...
[...]
.
copying path '/nix/store/0s35kcykjzdw3a3ffrj4y21k3jiqind8-nixos-icons-2017-03-16' from 'https://cache.nixos
.org'...
warning: unable to download 'https://cache.nixos.org/nar/1xf23nc4c1yvgm1xyszwjzm52ghj206wkdlafak7h4xq9idpz0
5i.nar.xz': HTTP error 200 (curl error: Timeout was reached); retrying in 273 ms
error 9 while decompressing xz file
copying path '/nix/store/sjl6fjz8kv5wbf2i3x92sm8br8gv4dsx-nixos-manpages' from 'https://cache.nixos.org'...
^Cerror: interrupted by the user

I booted an older system now and running the upgrade atm. It did not stop so far…

using an old system (2019-05-30)

nix --version
nix (Nix) 2.2.2
sudo echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
sudo nixos-rebuild switch --upgrade
[...]
updating GRUB 2 menu...
Warning: the new NixOS configuration has an ‘init’ that is
incompatible with the current configuration.  The new configuration
won't take effect until you reboot the system.
warning: error(s) occurred while switching to the new configuration

This took a while, but was a success. So the bug was introduced after 2019-05-30?

Edit: After the successful upgrade I booted the new system. There, the error is still present.

@zimbatm I switched to nixos-stable and it also works. Let me know if can perform further tests to help finding this issue.

Ok this is an issue with Nix then. This is best reported in the https://github.com/NixOS/nix repo. Or try the nix from master, it contains a bunch of fixes that haven’t been published yet.

{ pkgs, ... }:
{
  nix.package = pkgs.nixUnstable.overrideAttrs(x: {
    name = "nix-master";
    src = builtins.fetchGit { url = "https://github.com/NixOS/nix.git"; rev = "master"; };
  });
}

Thanks. I created an issue: