I’ve been digging into this question a lot the last few days.
How could this be documented properly, and in a way that makes it possible to grasp in finite time?
I don’t know but I’ll give it a shot below.
`fetchurl` is one of the really recursive things, so picking it apart has been helpful. There’s a `fetchurl/default.nix`, but there’s actually other code for it too. And fair warning: my statements below are probably slightly inaccurate, so take “it first happens” and the like as “I’m pretty sure it first happens”.
What I’ve found most helpful for the thought process is:
- `lib` is pure-Nix code, isolated. So mentally put that in a box.
- There’s a machine-specific boot-stage sequence that happens first. Basically all the impure stuff like `stdenv` gets created through that process, and it’s pretty complicated. So mentally put that in a box and mostly ignore it.
- After machine-specific bootstrapping we get `stdenv`, `stdenvNoCC`, and `buildPackages`, which I think should be called `foundationalPackagesYouCanTakeForGranted`.
- Then `all-packages.nix` is evaluated.
Inside of the machine-bootstrapping, `fetchurl/default.nix` is loaded for the very first time, with absolutely barebones arguments. It creates the `stdenv.fetchurlBoot` that everyone can use. (I think it’s possible `buildPackages.fetchurl` is equivalent to `stdenv.fetchurlBoot`, but I’m not sure.)
But inside of `all-packages.nix` we see a `fetchurl = ...`. It’s what I would call a big-ole-spaghetti-meatball of an expression:
# `fetchurl' downloads a file from the network.
fetchurl = if stdenv.buildPlatform != stdenv.hostPlatform
  then buildPackages.fetchurl # No need to do special overrides twice,
  else makeOverridable (import ../build-support/fetchurl) {
    inherit lib stdenvNoCC buildPackages;
    inherit cacert;
    curl = buildPackages.curlMinimal.override (old: rec {
      # break dependency cycles
      fetchurl = stdenv.fetchurlBoot;
      zlib = buildPackages.zlib.override { fetchurl = stdenv.fetchurlBoot; };
      pkg-config = buildPackages.pkg-config.override (old: {
        pkg-config = old.pkg-config.override {
          fetchurl = stdenv.fetchurlBoot;
        };
      });
      perl = buildPackages.perl.override { fetchurl = stdenv.fetchurlBoot; };
      openssl = buildPackages.openssl.override {
        fetchurl = stdenv.fetchurlBoot;
        buildPackages = {
          coreutils = buildPackages.coreutils.override {
            fetchurl = stdenv.fetchurlBoot;
            inherit perl;
            xz = buildPackages.xz.override { fetchurl = stdenv.fetchurlBoot; };
            gmp = null;
            aclSupport = false;
            attrSupport = false;
          };
          inherit perl;
        };
        inherit perl;
      };
      libssh2 = buildPackages.libssh2.override {
        fetchurl = stdenv.fetchurlBoot;
        inherit zlib openssl;
      };
      # On darwin, libkrb5 needs bootstrap_cmds which would require
      # converting many packages to fetchurl_boot to avoid evaluation cycles.
      # So turn gssSupport off there, and on Windows.
      # On other platforms, keep the previous value.
      gssSupport =
        if stdenv.isDarwin || stdenv.hostPlatform.isWindows
          then false
          else old.gssSupport or true; # `? true` is the default
      libkrb5 = buildPackages.libkrb5.override {
        fetchurl = stdenv.fetchurlBoot;
        inherit pkg-config perl openssl;
        keyutils = buildPackages.keyutils.override { fetchurl = stdenv.fetchurlBoot; };
      };
      nghttp2 = buildPackages.nghttp2.override {
        fetchurl = stdenv.fetchurlBoot;
        inherit pkg-config;
        enableApp = false; # curl just needs libnghttp2
        enableTests = false; # avoids bringing `cunit` and `tzdata` into scope
      };
    });
  };
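Nearly every line in the expression above is the same move: take a package and call `.override` on it to swap in `stdenv.fetchurlBoot`. Here is a toy sketch of that mechanism. Everything in it is an assumption for illustration: `makeOverridable` is a much-reduced stand-in written by hand (the real one in Nixpkgs does more, such as accepting a function as the override argument), and `mkZlib` and the strings are hypothetical.

```nix
let
  # Reduced, hand-written stand-in for makeOverridable (NOT the Nixpkgs version):
  # call f with args, and attach an `override` that re-calls f with merged args.
  makeOverridable = f: args:
    (f args) // {
      override = newArgs: makeOverridable f (args // newArgs);
    };

  # Hypothetical toy "package function": it only records which fetcher it got.
  mkZlib = { fetchurl }: {
    name = "zlib";
    builtWith = fetchurl;
  };

  # The "normal" package, wired to the full fetcher.
  zlib = makeOverridable mkZlib { fetchurl = "fetchurl2"; };

  # The bootstrap variant: same function, but with the fetcher swapped out.
  bootstrapZlib = zlib.override { fetchurl = "fetchurlBoot"; };
in
{
  normal = zlib.builtWith;
  bootstrap = bootstrapZlib.builtWith;
}
```

Saved to a file and evaluated with `nix-instantiate --eval --strict`, this yields `normal = "fetchurl2"` and `bootstrap = "fetchurlBoot"`. The point is that `.override` does not patch an existing package; it re-runs the package function with a different argument set.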
We can flatten the original expression out into something sensible, like this:
let
  #
  # note: stdenv, stdenvNoCC, buildPackages, and lib are the only things not defined below
  #

  # not the full perl, but enough args for a minimal version
  bootstrap_perl = buildPackages.perl.override {
    fetchurl = stdenv.fetchurlBoot;
  };
  # not the full xz, but enough args for a minimal version (... etc)
  bootstrap_xz = buildPackages.xz.override {
    fetchurl = stdenv.fetchurlBoot;
  };
  bootstrap_coreutils = buildPackages.coreutils.override {
    fetchurl = stdenv.fetchurlBoot;
    perl = bootstrap_perl;
    xz = bootstrap_xz;
    gmp = null;
    aclSupport = false;
    attrSupport = false;
  };
  bootstrap_openssl = buildPackages.openssl.override {
    fetchurl = stdenv.fetchurlBoot;
    perl = bootstrap_perl;
    buildPackages = {
      perl = bootstrap_perl;
      coreutils = bootstrap_coreutils;
    };
  };
  bootstrap_keyutils = buildPackages.keyutils.override {
    fetchurl = stdenv.fetchurlBoot;
  };
  bootstrap_zlib = buildPackages.zlib.override {
    fetchurl = stdenv.fetchurlBoot;
  };
  bootstrap_pkg-config = buildPackages.pkg-config.override (old: {
    pkg-config = old.pkg-config.override {
      fetchurl = stdenv.fetchurlBoot;
    };
  });
  bootstrap_libssh2 = buildPackages.libssh2.override {
    fetchurl = stdenv.fetchurlBoot;
    zlib = bootstrap_zlib;
    openssl = bootstrap_openssl;
  };
  bootstrap_libkrb5 = buildPackages.libkrb5.override {
    fetchurl = stdenv.fetchurlBoot;
    pkg-config = bootstrap_pkg-config;
    perl = bootstrap_perl;
    openssl = bootstrap_openssl;
    keyutils = bootstrap_keyutils;
  };
  bootstrap_nghttp2 = buildPackages.nghttp2.override {
    fetchurl = stdenv.fetchurlBoot;
    pkg-config = bootstrap_pkg-config;
    enableApp = false; # curl just needs libnghttp2
    enableTests = false; # avoids bringing `cunit` and `tzdata` into scope
  };
  bootstrap_curl = buildPackages.curlMinimal.override (old: rec {
    # break dependency cycles # <- original comment, not mine --Jeff
    fetchurl = stdenv.fetchurlBoot;
    pkg-config = bootstrap_pkg-config;
    zlib = bootstrap_zlib;
    perl = bootstrap_perl;
    openssl = bootstrap_openssl;
    libssh2 = bootstrap_libssh2;
    libkrb5 = bootstrap_libkrb5;
    nghttp2 = bootstrap_nghttp2;
    gssSupport =
      if stdenv.isDarwin || stdenv.hostPlatform.isWindows
        then false
        else old.gssSupport or true; # `? true` is the default
  });

  #
  # finally, using only buildPackages, stdenv, stdenvNoCC, lib,
  # and the things above, we can make fetchurl
  #
  fetchurl =
    if stdenv.buildPlatform != stdenv.hostPlatform then
      buildPackages.fetchurl # No need to do special overrides twice,
    else
      (import ../build-support/fetchurl) {
        lib = lib;
        stdenvNoCC = stdenvNoCC;
        buildPackages = buildPackages;
        cacert = buildPackages.cacert;
        curl = bootstrap_curl;
      };
in
fetchurl
That final `fetchurl`, at the very bottom of the flattened expression, I would call `fetchurl2`. `stdenv.fetchurlBoot` would be `fetchurl1`.
However, and this is the fixed-point function bit, I did modify one thing: the `cacert` argument. In the original expression it’s just `inherit cacert;` (that is, `cacert = cacert;`), and I honestly don’t know if
- it is the full `cacert` package (which needs `python3`, `python3Packages`, and all of their dependencies), or
- `cacert` is actually equal to `buildPackages.cacert` at that moment.
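One way to probe that question from the outside is to compare the two store paths. This is only a sketch under assumptions: it assumes `<nixpkgs>` resolves on your machine, and it inspects the finished top-level package set, which may not be identical to what `cacert` means “at that moment” inside `all-packages.nix`.

```nix
let
  # Assumption: <nixpkgs> is resolvable (e.g. via NIX_PATH or a channel).
  pkgs = import <nixpkgs> { };
in
{
  # True if both attributes evaluate to the same store path.
  sameCacert = pkgs.cacert.outPath == pkgs.buildPackages.cacert.outPath;

  # Shown for manual inspection.
  cacertPath = pkgs.cacert.outPath;
  buildPackagesCacertPath = pkgs.buildPackages.cacert.outPath;
}
```

It can be evaluated with `nix-instantiate --eval --strict` on the file. I’m deliberately not stating what it prints, since that can depend on the platform and on whether you are cross-compiling.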
I think it’s actually possible for both of those to be true:
- We build `fetchurl2` using `buildPackages.cacert` (because that’s the cacert we have at the time).
- Now that we have `fetchurl2`, we can build `python3` and other stuff that needed `fetchurl`.
- Eventually we build the full/normal `cacert` using `fetchurl2`.
- Now that we have `cacert2`, we start building `fetchurl3` with `cacert = cacert2` as the input argument.
(And so on, and so on, until `fetchurl4` or whatever, going until they get arguments that never change, i.e. a fixed point.)
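To make the fixed-point idea a bit more concrete, here is a toy sketch. The assumptions are heavy: `fix` is written out by hand (I believe it has the same shape as `lib.fix`), and `toyPkgs` and every string in it are made up, so none of this is actual Nixpkgs code. It only shows how laziness lets attributes of one set refer to each other through `self`, and why something like `fetchurlBoot` is needed to give the chain a starting point.

```nix
let
  # Hand-written fixed-point combinator: the result of f is passed back to f.
  # This only terminates because Nix evaluates attribute values lazily.
  fix = f: let x = f x; in x;

  # Hypothetical toy "package set": each value is a string recording
  # what it was (pretend-)built from.
  toyPkgs = fix (self: {
    fetchurlBoot = "fetchurl1";
    # curl deliberately uses the boot fetcher, not self.fetchurl.
    curl = "curl(built with ${self.fetchurlBoot})";
    fetchurl = "fetchurl2(using ${self.curl})";
    cacert = "cacert(downloaded with ${self.fetchurl})";
  });
in
toyPkgs.cacert
```

This evaluates to the string `"cacert(downloaded with fetchurl2(using curl(built with fetchurl1)))"`. If `curl` referenced `self.fetchurl` instead of `self.fetchurlBoot`, evaluation would fail with an infinite recursion error, which is the kind of cycle the `# break dependency cycles` overrides are there to avoid. Note that in this toy nothing is rebuilt repeatedly; the set is defined once and each attribute is computed on demand.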
Even for just the `cowsay` package, I know `perl` has to go through 3+ iterations like that.
There’s still a lot more to be discovered, and hopefully others will chime in. I’m a 3rd year PhD Computer Science student at a top university who loves recursion … and this stuff still crushes my brain. We really need to reduce the complexity/recursion used in Nixpkgs.