I’ve been digging into this question a lot the last few days.
How could this be documented properly and in a way that would make possible to grasp in finite time?
I don’t know but I’ll give it a shot below.
fetchurl is one of the really recursive things, so picking it apart has been helpful. There’s a fetchurl/default.nix but there’s actually other code for it too. And fair warning, my statements below are probably slightly inaccurate, so take “it first happens” and the like as “I’m pretty sure it first happens”
What I’ve found most helpful for the thought process is
lib is pure-nix code, isolated. So mentally put that in a box.
- Theres a machine-specific boot-stage sequence that happens first. Basically all the impure stuff like
stdenv gets created through that process, and its pretty complicated. So mentally put that in a box and mostly ignore it.
- After machine-specific bootstrapping we get
stdenv, stdenvNoCC, and buildPackages which I think should be called foundationalPackagesYouCanTakeForGranted
- Then
all-packages.nix is evaluated
Inside of the machine-bootstrapping the fetchurl/default.nix is loaded for the very first time, with absolutely barebone arguments. It creates the stdenv.fetchurlBoot that everyone can use. (I think its possible buildPackages.fetchurl is equivlent to stdenv.fetchurlBoot but I’m not sure)
But inside of all-packages.nix we see a fetchurl = ....
It’s what I would call a big-ole-spagetti-meatball of an expression (below), but
# `fetchurl' downloads a file from the network.
fetchurl = if stdenv.buildPlatform != stdenv.hostPlatform
then buildPackages.fetchurl # No need to do special overrides twice,
else makeOverridable (import ../build-support/fetchurl) {
inherit lib stdenvNoCC buildPackages;
inherit cacert;
curl = buildPackages.curlMinimal.override (old: rec {
# break dependency cycles
fetchurl = stdenv.fetchurlBoot;
zlib = buildPackages.zlib.override { fetchurl = stdenv.fetchurlBoot; };
pkg-config = buildPackages.pkg-config.override (old: {
pkg-config = old.pkg-config.override {
fetchurl = stdenv.fetchurlBoot;
};
});
perl = buildPackages.perl.override { fetchurl = stdenv.fetchurlBoot; };
openssl = buildPackages.openssl.override {
fetchurl = stdenv.fetchurlBoot;
buildPackages = {
coreutils = buildPackages.coreutils.override {
fetchurl = stdenv.fetchurlBoot;
inherit perl;
xz = buildPackages.xz.override { fetchurl = stdenv.fetchurlBoot; };
gmp = null;
aclSupport = false;
attrSupport = false;
};
inherit perl;
};
inherit perl;
};
libssh2 = buildPackages.libssh2.override {
fetchurl = stdenv.fetchurlBoot;
inherit zlib openssl;
};
# On darwin, libkrb5 needs bootstrap_cmds which would require
# converting many packages to fetchurl_boot to avoid evaluation cycles.
# So turn gssSupport off there, and on Windows.
# On other platforms, keep the previous value.
gssSupport =
if stdenv.isDarwin || stdenv.hostPlatform.isWindows
then false
else old.gssSupport or true; # `? true` is the default
libkrb5 = buildPackages.libkrb5.override {
fetchurl = stdenv.fetchurlBoot;
inherit pkg-config perl openssl;
keyutils = buildPackages.keyutils.override { fetchurl = stdenv.fetchurlBoot; };
};
nghttp2 = buildPackages.nghttp2.override {
fetchurl = stdenv.fetchurlBoot;
inherit pkg-config;
enableApp = false; # curl just needs libnghttp2
enableTests = false; # avoids bringing `cunit` and `tzdata` into scope
};
});
};
We can flatten it out into something sensible, like this
let
#
# note: stdenv, buildPackages, and lib are the only things not-defined below
#
# not the full perl, but enough args for a minimal verison
bootstrap_perl = buildPackages.perl.override {
fetchurl = stdenv.fetchurlBoot;
};
# not the full xz, but enough args for a minimal verison (... etc)
bootstrap_xz = buildPackages.xz.override {
fetchurl = stdenv.fetchurlBoot;
};
bootstrap_coreutils = buildPackages.coreutils.override {
fetchurl = stdenv.fetchurlBoot;
perl = bootstrap_perl;
xz = bootstrap_xz;
gmp = null;
aclSupport = false;
attrSupport = false;
};
bootstrap_openssl = buildPackages.openssl.override {
fetchurl = stdenv.fetchurlBoot;
perl = bootstrap_perl;
buildPackages = {
perl = bootstrap_perl;
coreutils = bootstrap_coreutils;
};
};
bootstrap_keyutils = buildPackages.keyutils.override {
fetchurl = stdenv.fetchurlBoot;
};
bootstrap_zlib = buildPackages.zlib.override {
fetchurl = stdenv.fetchurlBoot;
};
bootstrap_pkg-config = buildPackages.pkg-config.override (old: {
pkg-config = old.pkg-config.override {
fetchurl = stdenv.fetchurlBoot;
};
});
bootstrap_libssh2 = buildPackages.libssh2.override {
fetchurl = stdenv.fetchurlBoot;
zlib = bootstrap_zlib;
openssl = bootstrap_openssl;
};
bootstrap_libkrb5 = buildPackages.libkrb5.override {
fetchurl = stdenv.fetchurlBoot;
pkg-config = bootstrap_pkg-config;
perl = bootstrap_perl;
openssl = bootstrap_openssl;
keyutils = bootstrap_keyutils;
};
bootstrap_nghttp2 = buildPackages.nghttp2.override {
fetchurl = stdenv.fetchurlBoot;
pkg-config = bootstrap_pkg-config;
enableApp = false; # curl just needs libnghttp2
enableTests = false; # avoids bringing `cunit` and `tzdata` into scope
};
bootstrap_curl = buildPackages.curlMinimal.override (old: rec {
# break dependency cycles # <- original comment, not mine --Jeff
fetchurl = stdenv.fetchurlBoot;
pkg-config = bootstrap_pkg-config;
zlib = bootstrap_zlib;
perl = bootstrap_perl;
openssl = bootstrap_openssl;
libssh2 = bootstrap_libssh2;
libkrb5 = bootstrap_libkrb5;
nghttp2 = bootstrap_nghttp2;
gssSupport =
if stdenv.isDarwin || stdenv.hostPlatform.isWindows
then false
else old.gssSupport or true; # `? true` is the default
});
#
#
# finally, using only buildPackages, stdenv, lib,
# and the things above, we can make fetchurl
#
#
fetchurl =
if stdenv.buildPlatform != stdenv.hostPlatform then
buildPackages.fetchurl # No need to do special overrides twice,
else
(import ../build-support/fetchurl) {
lib = lib;
stdenvNoCC = stdenvNoCC;
buildPackages = buildPackages;
cacert = buildPackages.cacert;
curl = bootstrap_curl;
}
;
in
fetchurl
That fetchurl^ at the very bottom, I would call fetchurl2.
stdenv.fetchurlBoot would be fetchurl1
However, and this is the fixed-point function bit, I did modifiy one thing; the cacert argument .
In the original expression its just cacert = cacert;, and I honestly don’t know if
- It is the full cacert package (which needs python3, and python3Packages, and all of their dependencies), or
- If cacert is actually equal to
buildPackages.cacert at that moment
I think its actually possible for both of those to be true;
- we build fetchurl2 using
buildPackages.cacert (because thats the cacert we have at the time)
- now that we have fetchurl2, we can build python3, and other stuff that needed fetchurl
- eventually we build the full/normal
cacert using fetchurl2
- now that we have cacert2, we start building fetchurl3 with cacert=cacert2 as the input argument
(and so on, and so on, until fetchurl4 or whatever, going until they get arguments that never change; e.g. fixedpoint)
Even for just the cowsay package, I know perl has to go through +3 iterations like that.
There’s still a lot more to be discovered, and hopefully others will chime in. I’m a 3rd year PhD Computer Science student at a top university who loves recursion … and this stuff still crushes my brain. We really need to reduce the complexity/recursion used in Nixpkgs.