Memoize result of builtins.exec


#1

Nix has a rarely used builtins.exec which allows to execute an external program and capture its output on the eval phase
It has many benefits which otherwise would require import-from-derivation or won’t be possible at all.
A nice summary of the use cases by @shlevy is here https://github.com/NixOS/nix/commit/0bb8db257d98a32abde759f4d07d28b5178bd3bf

The problem is: if half of my use cases, the number of required builtins.exec invocations is quadratic to the number of required results.
One example is reading keys for Tinc or generate a random key and add it to git if it is a new machine and does not have a key yet.
if there are 100 machines, each has a key and each key is included in 99 closures of other machines.
So there are 9900 invocations of builtins.exec when compiling nixops forest of system closures, although there are only 100 unique keys (thanks to Nix’s referential transparency, if one could say that about something so inpure as builtins.exec) :frowning:

I wonder, if it possible to memoize results of builtins.exec, or generally, memoize anything for the time of the Nix evaluation phase? It could be useful for other functions too, builtins.readFile, etc

I have some (probably false) memories that there was a commit in the Nix history adding something like builtins.memo or a singleton hash table which has been reverted afterwards, but I cannot find it…


#2

Hm… Does Nix not implement maximal laziness? I thought this would ensure that two calls to builtins.exec with the same arguments would only be evaluated once. But this code takes 4 seconds to eval.

let args = ["bash" "-c" ''sleep 2 && echo \"hi\"''];
in { foo = builtins.exec args + builtins.exec args; }
$ time nix eval -f sleep.nix foo --allow-unsafe-native-code-during-evaluation
"hihi"

real	0m4.020s
user	0m0.013s
sys	0m0.008s

#3

Decided to try out converting a haskell prime number generator to Nix to test this

{ num }:

with import <nixpkgs/lib>;

rec {
  filter = f: { head, tail }:
    let newTail = filter f tail;
    in if f head then { inherit head; tail = newTail; } else newTail;
  unfoldr = f: b:
    let a = f b;
    in { head = a.elem; tail = unfoldr f a.next; };

  primes = {
    head = 2;
    tail = filter
      (isPrime primes)
      (unfoldr (n: { elem = n; next = n + 2; }) 3);
  };
  isPrime = { head, tail }: n:
    head * head > n || mod n head != 0 && isPrime tail n;

  get = { head, tail }: n: if n == 0 then head else get tail (n - 1);

  foo = get primes num;
  bar = get primes num + get primes num;
}

$ time nix eval -f sleep.nix foo --arg num 20000
224743

real	0m1.901s
user	0m1.798s
sys	0m0.104s
$ time nix eval -f sleep.nix bar --arg num 20000
449486

real	0m1.907s
user	0m1.804s
sys	0m0.103s

Pretty much identical, so it looks like it is doing maximal laziness. I guess this is just disabled somehow for builtins.exec?


#4

I think you’re looking for builtins.memoise, which has been implemented in the memoise branch of Nix, see https://github.com/NixOS/nix/commit/0395b9b94af56bb814810a32d680c606614b29e0 :slight_smile:


#5

In absence of builtins.memoise, if you’re able to identify the whole set of keys in advance then you should be able to manually memoize by using an attribute set. Instead of:

  get-key = machine: builtins.exec get-key-prog [ machine ];

You could do

  get-key = let
    memo = builtins.listToAttrs (map (machine: {
      name = machine;
      value = builtins.exec get-key-prog [ machine ];
    }) machines;
  in machine: memo.${machine};

#6

Nix doesn’t have any form of maximal laziness. You can only get it by binding the repeated expression to a variable in a high enough scope.


#7

Just realized… my primes example was fast because they were sharing the primes list -_- This version does take twice as long, showing there is no maximal lazines.

{ num }:

with import <nixpkgs/lib>;

rec {
  filter = f: { head, tail }:
    let newTail = filter f tail;
    in if f head then { inherit head; tail = newTail; } else newTail;
  unfoldr = f: b: let a = f b; in { head = a.elem; tail = unfoldr f a.next; };

  isPrime = { head, tail }: n: head * head > n || mod n head != 0 && isPrime tail n;

  get = _n: let
    go = { head, tail }: n: if n == 0 then head else go tail (n - 1);
    primes = { head = 2; tail = filter (isPrime primes) (unfoldr (n: { elem = n; next = n + 2; }) 3); };
  in go primes _n;

  foo = get num;
  bar = get num + get num;
}

@shlevy Any idea why that paper didn’t end up implemented in Nix?


#8

This was one of the ideas as part of the attempt to reduce Nix’s memory consumption a year ago:

to move upwards the thunks which captured no free variables in the current scope, so they will be evaluated only once and won’t prevent the current scope from being GC’ed.

As it not only reduces the memory footprint but also magically converts O(n^2) code to O(n) it might worth a second try :slight_smile: