Downloading Fixed-Output Derivations from cache.nixos.org

When building without a substituter (to avoid having to trust it, because one is using a non-standard store path, or for other reasons), a common problem is that a required source file, tree, or patch can no longer be retrieved: the server is down, or the file has been removed or changed. Tracking down another copy on the web that matches the expected hash can be difficult or even impossible.

For such cases, I created a script to download fixed-output derivations from cache.nixos.org. Since the derivation is fixed-output, the contents can be verified against the hash in nixpkgs and there’s no need to trust the cache.

#!/usr/bin/env bash
set -euo pipefail

usage() {
  echo "Usage: $0 [-r] [-o output] <sha256-hex> <name>" >&2
  echo "  -r          recursive (directory) mode" >&2
  echo "  -o output   output path (default: ./<name>)" >&2
  echo "  sha256-hex  expected content hash in hexadecimal" >&2
  echo "  name        derivation name (e.g., foo.tar.gz)" >&2
  exit "${1:-1}"
}

recursive=false
output=""
positional=()
while [ $# -gt 0 ]; do
  case "$1" in
    -r) recursive=true; shift ;;
    -o) [ $# -ge 2 ] || usage; output="$2"; shift 2 ;;
    -h|--help) usage 0 ;;
    -*) usage ;;
    *) positional+=("$1"); shift ;;
  esac
done

[ ${#positional[@]} -eq 2 ] || usage
hash_hex="${positional[0]}"
name="${positional[1]}"
output="${output:-./$name}"

store_hash=$(python3 - "$hash_hex" "$name" "$recursive" <<'EOF'
import hashlib, sys

nix32_chars = '0123456789abcdfghijklmnpqrsvwxyz'

def to_nix32(data):
    bit_count = len(data) * 8
    hash_len = (bit_count + 4) // 5
    result = []
    for n in range(hash_len - 1, -1, -1):
        b = 0
        for bit in range(5):
            src_bit = n * 5 + bit
            if src_bit < bit_count:
                byte_idx = src_bit // 8
                bit_idx = src_bit % 8
                if data[byte_idx] & (1 << bit_idx):
                    b |= 1 << bit
        result.append(nix32_chars[b])
    return ''.join(result)

def compress_hash(h, new_size):
    result = bytearray(new_size)
    for i, b in enumerate(h):
        result[i % new_size] ^= b
    return bytes(result)

hash_hex = sys.argv[1]
name = sys.argv[2]
recursive = sys.argv[3] == 'true'

if recursive:
    fingerprint = 'source:sha256:' + hash_hex + ':/nix/store:' + name
else:
    inner = hashlib.sha256(('fixed:out:sha256:' + hash_hex + ':').encode()).hexdigest()
    fingerprint = 'output:out:sha256:' + inner + ':/nix/store:' + name

digest = hashlib.sha256(fingerprint.encode()).digest()
compressed = compress_hash(digest, 20)
print(to_nix32(compressed))
EOF
)

echo "Querying cache for $store_hash..." >&2

narinfo=$(curl -sfL "https://cache.nixos.org/$store_hash.narinfo") || {
  echo "error: not found in cache.nixos.org" >&2; exit 1
}

nar_url=$(echo "$narinfo" | grep '^URL:' | cut -d' ' -f2)
compression=$(echo "$narinfo" | grep '^Compression:' | cut -d' ' -f2)

echo "Downloading: $nar_url ($compression)" >&2

case "$compression" in
  xz)   decompress="xz -d" ;;
  bzip2) decompress="bzip2 -d" ;;
  zstd)  decompress="zstd -d" ;;
  none)  decompress="cat" ;;
  *)     echo "error: unsupported compression: $compression" >&2; exit 1 ;;
esac

curl -sfL "https://cache.nixos.org/$nar_url" | $decompress | nix-store --restore "$output"

actual=$(nix-hash --type sha256 $($recursive || echo --flat) "$output")
[ "$actual" = "$hash_hex" ] || { echo "error: hash mismatch: expected $hash_hex, got $actual" >&2; exit 1; }

echo "Written: $output" >&2
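
The script above expects the hash in hexadecimal, while nixpkgs expressions often record it in SRI form (sha256-<base64>). A minimal sketch of the conversion, assuming a sha256 SRI string; the helper name sri_to_hex is made up for illustration and is not part of the script:

```python
import base64
import hashlib


def sri_to_hex(sri: str) -> str:
    """Convert an SRI hash such as 'sha256-<base64>' to lowercase hex."""
    algo, sep, b64 = sri.partition("-")
    if sep != "-" or algo != "sha256":
        raise ValueError(f"expected a sha256 SRI hash, got: {sri!r}")
    digest = base64.b64decode(b64, validate=True)
    if len(digest) != 32:
        raise ValueError("a sha256 digest must be exactly 32 bytes")
    return digest.hex()


if __name__ == "__main__":
    # Demo with made-up content: build an SRI string, then convert it back.
    digest = hashlib.sha256(b"example content\n").digest()
    sri = "sha256-" + base64.b64encode(digest).decode()
    print(sri, "->", sri_to_hex(sri))
```

The resulting hex string can be passed as the <sha256-hex> argument.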

And here’s a second script that builds on the first: given a .drv file, it fetches the fixed-output derivation and imports it into the store. It assumes the previous script is available on $PATH as fetch-nix-fod. Notably, this works even with a non-standard store path.

#!/usr/bin/env bash
set -euo pipefail

usage() {
  echo "Usage: $0 <path-to-.drv>" >&2
  exit "${1:-1}"
}

case "${1:-}" in
  -h|--help) usage 0 ;;
esac

[ $# -eq 1 ] || usage
drv="$1"

[ -f "$drv" ] || { echo "error: $drv not found" >&2; exit 1; }

drv_content=$(< "$drv")

extract() {
  echo "$1" | grep -o "$2" | sed "s/$2/\1/"
}

hash_algo=$(extract "$drv_content" '^Derive(\[("[^"]*","[^"]*","\([^"]*\)","[^"]*")')
hash_hex=$(extract "$drv_content" '^Derive(\[("[^"]*","[^"]*","[^"]*","\([^"]*\)")')
name=$(extract "$drv_content" '("name","\([^"]*\)")')

case "$hash_algo" in
  r:sha256) recursive=true ;;
  sha256) recursive=false ;;
  *) echo "error: unsupported hash algorithm: $hash_algo" >&2; exit 1 ;;
esac

echo "Name: $name" >&2
echo "Hash: $hash_hex" >&2
echo "Mode: $($recursive && echo recursive || echo flat)" >&2

tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

fetch-nix-fod $($recursive && echo -r) -o "$tmpdir/$name" "$hash_hex" "$name"

nix-store --add-fixed $($recursive && echo --recursive) sha256 "$tmpdir/$name"
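
For readers who find the grep/sed extraction hard to follow, here is a rough Python sketch of the same parsing step. It mirrors the patterns used above (first output tuple, then the name entry in the environment) and, like them, does not handle escaped quotes. The function name parse_fod and the sample .drv text are made up for illustration; the sample is abbreviated and uses a placeholder hash and path.

```python
import re

# First output tuple of an ATerm .drv: ("out","<path>","<hashAlgo>","<hash>")
OUTPUT_RE = re.compile(r'^Derive\(\[\("[^"]*","[^"]*","([^"]*)","([^"]*)"\)')
NAME_RE = re.compile(r'\("name","([^"]*)"\)')


def parse_fod(drv_text: str) -> dict:
    """Extract name, sha256 hex hash and recursive flag from .drv text."""
    out = OUTPUT_RE.match(drv_text)
    if out is None:
        raise ValueError("could not find the first output in the .drv")
    algo, hash_hex = out.groups()
    if algo == "r:sha256":
        recursive = True
    elif algo == "sha256":
        recursive = False
    else:
        raise ValueError(f"unsupported hash algorithm: {algo!r}")
    name = NAME_RE.search(drv_text)
    if name is None:
        raise ValueError("could not find the name attribute in the .drv")
    return {"name": name.group(1), "hash_hex": hash_hex, "recursive": recursive}


if __name__ == "__main__":
    # Hypothetical, abbreviated .drv content (placeholder path and hash).
    sample = (
        'Derive([("out","/nix/store/PLACEHOLDER-foo.tar.gz","sha256","'
        + "0" * 64
        + '")],[],[],"builtin","builtin:fetchurl",[],[("name","foo.tar.gz")])'
    )
    print(parse_fod(sample))
```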

Why overcomplicate things so much? nix copy does the job pretty well:

nix copy "/nix/store/wj7phsmi7ncidl8k00p489krqss7n9sd-hello-2.12.3.tar.gz" --from https://cache.nixos.org --to /tmp/store-test --substituters "" --trusted-public-keys ""

It also works with a non-standard store path.

Are you sure? It gives me

error: binary cache 'https://cache.nixos.org' is for Nix stores with prefix '/nix/store', not '/Users/rkjnsn/.local/share/nix

Yup (to be clear I’m on Nix master, but I’m reasonably sure this works for some recent releases too):

nix copy "/nix/store/wj7phsmi7ncidl8k00p489krqss7n9sd-hello-2.12.3.tar.gz" --from https://cache.nixos.org --to "local:///tmp/store-test4?store=/aaaa/bbbb" --substituters "" --trusted-public-keys ""

nix path-info /aaaa/bbbb/zznlmkflk89h7ihyqwfrcsgj0j2wgqva-hello-2.12.3.tar.gz --store "/tmp/store-test4?store=/aaaa/bbbb"

/aaaa/bbbb/zznlmkflk89h7ihyqwfrcsgj0j2wgqva-hello-2.12.3.tar.gz

It looks like you’ve set the NIX_STORE_DIR (or NIX_STORE) environment variable, which overrides the default store directory assumed for the https://cache.nixos.org binary cache. Specifying the store dir explicitly in the URL would probably work:

https://cache.nixos.org?store=/nix/store

Okay, I got this to work:

nix copy "/nix/store/wj7phsmi7ncidl8k00p489krqss7n9sd-hello-2.12.3.tar.gz" --from 'https://cache.nixos.org?store=/nix/store' --to ./copy-test --substituters "" --trusted-public-keys "" --extra-experimental-features nix-command

However, that doesn’t actually help much: most of the script is dedicated to calculating which path to request. The hash part of a store path changes with the store directory, and the hard part is working out what it should be under /nix/store for a given FOD. At best, nix copy could replace the last few lines of the first script.
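
To illustrate why the hash part depends on the store directory, here is a small Python sketch that follows the same computation as the first script but takes the store directory as a parameter. The function names are made up for illustration, the hash and name in the demo are placeholders, and /aaaa/bbbb is just the example directory used earlier in the thread.

```python
import hashlib

NIX32_CHARS = "0123456789abcdfghijklmnpqrsvwxyz"


def to_nix32(data: bytes) -> str:
    """Nix-style base32: read the bytes as a little-endian integer and
    emit 5-bit groups, most significant group first."""
    n = int.from_bytes(data, "little")
    length = (len(data) * 8 + 4) // 5
    return "".join(NIX32_CHARS[(n >> (5 * i)) & 31] for i in reversed(range(length)))


def compress_hash(digest: bytes, size: int = 20) -> bytes:
    """Fold the digest down to `size` bytes by XOR-ing bytes together."""
    out = bytearray(size)
    for i, b in enumerate(digest):
        out[i % size] ^= b
    return bytes(out)


def fod_store_path(hash_hex, name, recursive=False, store_dir="/nix/store"):
    """Store path of a sha256 fixed-output derivation, as in the script above."""
    if recursive:
        fingerprint = f"source:sha256:{hash_hex}:{store_dir}:{name}"
    else:
        inner = hashlib.sha256(f"fixed:out:sha256:{hash_hex}:".encode()).hexdigest()
        fingerprint = f"output:out:sha256:{inner}:{store_dir}:{name}"
    digest = hashlib.sha256(fingerprint.encode()).digest()
    return f"{store_dir}/{to_nix32(compress_hash(digest))}-{name}"


if __name__ == "__main__":
    # Placeholder content hash and name, only to show the two paths differ.
    h = hashlib.sha256(b"placeholder").hexdigest()
    print(fod_store_path(h, "example.tar.gz"))
    print(fod_store_path(h, "example.tar.gz", store_dir="/aaaa/bbbb"))
```

Because the store directory is part of the fingerprint that gets hashed, the same content ends up under a different hash in each store, which is why the path to request from cache.nixos.org has to be computed for /nix/store specifically.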

Also, it doesn’t seem to work for recursive FODs:

nix copy "/nix/store/0bqhsb3qawa3n5mx1dhgwfqz484y28kr-mkdep.sh" --from 'https://cache.nixos.org?store=/nix/store' --to ./copy-test --substituters "" --trusted-public-keys "" --extra-experimental-features nix-command
error: cannot add path '/Users/rkjnsn/.local/share/nix/0bqhsb3qawa3n5mx1dhgwfqz484y28kr-mkdep.sh' because it lacks a signature by a trusted key

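It fails because the cache’s metadata for this path somehow lacks the ca field, so Nix cannot treat it as content-addressed and instead requires a signature by a trusted key (which the command rules out by passing an empty --trusted-public-keys):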

{
  "/nix/store/0bqhsb3qawa3n5mx1dhgwfqz484y28kr-mkdep.sh": {
    "ca": null,
    "compression": "xz",
    "deriver": "/nix/store/xvbr5nckh5jpalg75gjw7mizbqjhajl3-mkdep.sh.drv",
    "downloadHash": "sha256-se1PiF+6j63S+285emHZS4UvW5C5HuIwa6UVnSqEP2c=",
    "downloadSize": 1636,
    "narHash": "sha256-gZSe8F/4y6IUNU7bWSn0XInoLorZF+3XL1o6qh2+nFg=",
    "narSize": 3016,
    "references": [],
    "registrationTime": null,
    "signatures": [
      "cache.nixos.org-1:tfe91n5PMVY2ZuhVYEYcRcbZkWoT4qhlwo78BoD09CpTBSi0qkGhh5gyBA2lRNnKnIX+ctBghiWFiIwPyJbtDw=="
    ],
    "storeDir": "/nix/store",
    "ultimate": false,
    "url": "nar/0rrzhhm9s5d5dcqf47mrj1djz1abv5hplfbgzg9av3xsby44zvdi.nar.xz",
    "version": 1
  }
}

Nothing to do with recursive/not recursive.
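
One way to check this before trying nix copy is to look at the .narinfo the cache serves for the path and see whether it carries a CA line. A minimal Python sketch, assuming the usual "Key: value" narinfo layout; the helper names are made up, and the sample text is an abbreviated reconstruction from the JSON above (hash and signature lines omitted), not a captured response:

```python
def parse_narinfo(text: str) -> dict:
    """Parse 'Key: value' lines into a dict of lists (keys may repeat)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


def has_ca(text: str) -> bool:
    """True if the narinfo declares a content address (a CA line)."""
    return "CA" in parse_narinfo(text)


if __name__ == "__main__":
    # Abbreviated sample reconstructed from the JSON in this thread.
    sample = """\
StorePath: /nix/store/0bqhsb3qawa3n5mx1dhgwfqz484y28kr-mkdep.sh
URL: nar/0rrzhhm9s5d5dcqf47mrj1djz1abv5hplfbgzg9av3xsby44zvdi.nar.xz
Compression: xz
FileSize: 1636
NarSize: 3016
References:
Deriver: xvbr5nckh5jpalg75gjw7mizbqjhajl3-mkdep.sh.drv
"""
    print("CA field present:", has_ca(sample))
```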

Yeah, exposing an easy way to recalculate the CA path from the CLI would be quite useful, I suppose, so that it’s not necessary to compute everything by hand…

Indeed, it appears I can use /nix/store/521d7xqs0svwyxbx9w2fhj7mqqdy03jf-source with nix copy, and that path is also recursive. It’s also a bit odd that mkdep.sh, being a single file, is recursive at all. That said, mkdep.sh is an actual file that failed to download and that I was able to pull from the cache with my script, so it seems worth continuing to download the files directly with curl to handle such cases.

(In cases like this, it’d be greatly appreciated if you filed these as bugs, assuming that you’re running into this on master or stable.)
