Rec & env allocation

I was about to post what follows in a PR thread, but since this isn’t the first time I’ve made this point, I think Discourse is a better fit for it.


TL;DR: using rec has an evaluation cost in terms of env allocations. Not a massive one, but Nixpkgs-wide, it’s probably not totally negligible either.


To the point. Let’s take everybody’s favorite hello derivation:

{ lib
, stdenv
, fetchurl
, testVersion
, hello
}:

stdenv.mkDerivation rec {
  pname = "hello";
  version = "2.10";

  src = fetchurl {
    url = "mirror://gnu/hello/${pname}-${version}.tar.gz";
    sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
  };

  doCheck = true;

  passthru.tests.version =
    testVersion { package = hello; };

  meta = with lib; {
    description = "A program that produces a familiar, friendly greeting";
    longDescription = ''
      GNU Hello is a program that prints "Hello, world!" when you run it.
      It is fully customizable.
    '';
    homepage = "https://www.gnu.org/software/hello/manual/";
    changelog = "https://git.savannah.gnu.org/cgit/hello.git/plain/NEWS?h=v${version}";
    license = licenses.gpl3Plus;
    maintainers = [ maintainers.eelco ];
    platforms = platforms.all;
  };
}

Nothing fancy is happening here. We use a recursive attrset to propagate pname and version throughout the derivation. Let’s see the evaluation env allocation stats:

» NIX_SHOW_STATS=1 nix-instantiate -A hello
(...)
"envs": {
    "number": 64914,
    "elements": 89334,
    "bytes": 1753296
  },
(...)

Okay, now let’s use a smaller let env plus inherit to propagate pname and version, instead of a rec attribute set:

{ lib
, stdenv
, fetchurl
, testVersion
, hello
}:
let
  pname = "hello";
  version = "2.10";
in stdenv.mkDerivation {
  inherit pname version;
  src = fetchurl {
    url = "mirror://gnu/hello/${pname}-${version}.tar.gz";
    sha256 = "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i";
  };

  doCheck = true;

  passthru.tests.version =
    testVersion { package = hello; };

  meta = with lib; {
    description = "A program that produces a familiar, friendly greeting";
    longDescription = ''
      GNU Hello is a program that prints "Hello, world!" when you run it.
      It is fully customizable.
    '';
    homepage = "https://www.gnu.org/software/hello/manual/";
    changelog = "https://git.savannah.gnu.org/cgit/hello.git/plain/NEWS?h=v${version}";
    license = licenses.gpl3Plus;
    maintainers = [ maintainers.eelco ];
    platforms = platforms.all;
  };
}

Let’s evaluate that again:

(...)
"envs": {
    "number": 64914,
    "elements": 89330,
    "bytes": 1753264
  },
(...)

As you can see, the number of envs is unchanged, but we spared 4 element allocations and 32 bytes thanks to the smaller env we just created.
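For reference, here is a small sketch that computes those differences from the two stats dumps. The numbers are copied from the runs above; the names withRec and withLet are made up for the illustration. The result is consistent with the rec env getting one slot per top-level attribute (six here: pname, version, src, doCheck, passthru, meta) while the let env only holds the two bindings, at 8 bytes per slot, though that reading is my assumption rather than something the stats state directly.

```nix
let
  # "envs" stats copied from the two NIX_SHOW_STATS runs above.
  withRec = { number = 64914; elements = 89334; bytes = 1753296; };
  withLet = { number = 64914; elements = 89330; bytes = 1753264; };
in
# Per-field difference (rec minus let).
builtins.mapAttrs (name: value: value - withLet.${name}) withRec
# => { bytes = 32; elements = 4; number = 0; }
```

Evaluate it strictly (for example with nix-instantiate --eval --strict) to see the values instead of unevaluated thunks.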

In this case, the attrset we applied rec to is pretty small. The bigger the attrset, the worse the bloat gets.

Is there a massive allocation difference? No. Would it be worth the trouble of removing all the occurrences of that pattern in Nixpkgs? Probably not.

However, I don’t think the let alternative is much less readable, so I think it makes sense to use it when we can. It surely won’t hurt.


However, it is usually two lines longer, and the let itself is recursive, which can be pretty confusing.
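One way to read the "let is recursive" concern, as a minimal sketch (patchedVersion is a made-up name for the illustration): a binding inside a let sees the other bindings of that same let, including itself, so reusing an outer name on both sides of the = does not do what it looks like.

```nix
let
  version = "2.10";
in
let
  # Pitfall: `let` is recursive, so writing
  #   version = "${version}-patched";
  # here would make the right-hand `version` refer to this very binding,
  # not the outer one, and forcing it fails with an infinite recursion error.
  # Binding a different name avoids the self-reference:
  patchedVersion = "${version}-patched";
in
patchedVersion
# => "2.10-patched"
```

Note that rec attrsets have the same self-reference behaviour, so this is not a new hazard introduced by switching from rec to let.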

Edit:
Please tell me that I did the math wrong or forgot something, because what we could save at eval time sounds pretty negligible:
We save 32 bytes per package and have 80,000 packages. That sums up to 2,560,000 bytes, or 2.56 MB, which is not worth the effort IMO.
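The same back-of-the-envelope estimate, written so the per-package saving is a parameter. This is only a sketch: estimate and slotsSaved are made-up names, the 8 bytes per slot is inferred from the hello numbers (32 bytes / 4 elements), and the second case is purely hypothetical to show how the total scales.

```nix
let
  # bytesPerSlot = 8 follows from the hello numbers: 32 bytes / 4 elements.
  estimate = { packages, slotsSaved, bytesPerSlot ? 8 }:
    packages * slotsSaved * bytesPerSlot;
in
{
  # The hello case extrapolated to 80,000 packages.
  helloLike = estimate { packages = 80000; slotsSaved = 4; }; # 2560000
  # Hypothetical: rec sets with 20 more attributes than the let would bind.
  larger = estimate { packages = 80000; slotsSaved = 20; }; # 12800000
}
```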


The 32-byte overhead in the example above is specific to the hello package.

It’ll vary depending on the size of the attrset you’re making recursive. I don’t have any data for the Nixpkgs-wide impact, and I don’t think extrapolating from the hello example is a good way to get an approximation.

Is there an easy way to get a better approximation of how much this would save? Maybe it is only relevant for very big packages or package sets?
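One possible approach, sketched under assumptions: evaluate a larger attribute (or a whole package set) with NIX_SHOW_STATS=1 before and after converting some packages from rec to let, save each JSON dump to a file, and compare the "envs" sections. The file name compare-envs.nix and the argument names before and after are hypothetical; the only thing taken from this thread is the shape of the "envs" object.

```nix
# compare-envs.nix (hypothetical file name)
# `before` and `after` are paths to two saved NIX_SHOW_STATS JSON dumps.
{ before, after }:
let
  # Read a stats dump and keep only its "envs" section.
  envsOf = path: (builtins.fromJSON (builtins.readFile path)).envs;
  old = envsOf before;
  new = envsOf after;
in
# Per-field difference (before minus after): number, elements, bytes.
builtins.mapAttrs (name: value: value - new.${name}) old
```

Assuming the dumps were saved as rec.json and let.json, this could be run as nix-instantiate --eval --strict compare-envs.nix --arg before ./rec.json --arg after ./let.json. It only reports the difference between two evaluations; how representative that is depends on which packages were converted.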
