Memoize result of builtins.exec

volth · February 2, 2019, 9:44pm

Nix has a rarely used builtins.exec which allows to execute an external program and capture its output on the eval phase
It has many benefits which otherwise would require import-from-derivation or won’t be possible at all.
A nice summary of the use cases by @shlevy is here https://github.com/NixOS/nix/commit/0bb8db257d98a32abde759f4d07d28b5178bd3bf

The problem is: if half of my use cases, the number of required builtins.exec invocations is quadratic to the number of required results.
One example is reading keys for Tinc or generate a random key and add it to git if it is a new machine and does not have a key yet.
if there are 100 machines, each has a key and each key is included in 99 closures of other machines.
So there are 9900 invocations of builtins.exec when compiling nixops forest of system closures, although there are only 100 unique keys (thanks to Nix’s referential transparency, if one could say that about something so inpure as builtins.exec)

I wonder, if it possible to memoize results of builtins.exec, or generally, memoize anything for the time of the Nix evaluation phase? It could be useful for other functions too, builtins.readFile, etc

I have some (probably false) memories that there was a commit in the Nix history adding something like builtins.memo or a singleton hash table which has been reverted afterwards, but I cannot find it…

ElvishJerricco · February 2, 2019, 11:13pm

Hm… Does Nix not implement maximal laziness? I thought this would ensure that two calls to builtins.exec with the same arguments would only be evaluated once. But this code takes 4 seconds to eval.

let args = ["bash" "-c" ''sleep 2 && echo \"hi\"''];
in { foo = builtins.exec args + builtins.exec args; }

$ time nix eval -f sleep.nix foo --allow-unsafe-native-code-during-evaluation
"hihi"

real	0m4.020s
user	0m0.013s
sys	0m0.008s

ElvishJerricco · February 2, 2019, 11:17pm

Decided to try out converting a haskell prime number generator to Nix to test this

{ num }:

with import <nixpkgs/lib>;

rec {
  filter = f: { head, tail }:
    let newTail = filter f tail;
    in if f head then { inherit head; tail = newTail; } else newTail;
  unfoldr = f: b:
    let a = f b;
    in { head = a.elem; tail = unfoldr f a.next; };

  primes = {
    head = 2;
    tail = filter
      (isPrime primes)
      (unfoldr (n: { elem = n; next = n + 2; }) 3);
  };
  isPrime = { head, tail }: n:
    head * head > n || mod n head != 0 && isPrime tail n;

  get = { head, tail }: n: if n == 0 then head else get tail (n - 1);

  foo = get primes num;
  bar = get primes num + get primes num;
}

$ time nix eval -f sleep.nix foo --arg num 20000
224743

real	0m1.901s
user	0m1.798s
sys	0m0.104s

$ time nix eval -f sleep.nix bar --arg num 20000
449486

real	0m1.907s
user	0m1.804s
sys	0m0.103s

Pretty much identical, so it looks like it is doing maximal laziness. I guess this is just disabled somehow for builtins.exec?

Infinisil · February 2, 2019, 11:29pm

I think you’re looking for builtins.memoise, which has been implemented in the memoise branch of Nix, see Add memoise primop · NixOS/nix@0395b9b · GitHub

shlevy · February 3, 2019, 1:00am

In absence of builtins.memoise, if you’re able to identify the whole set of keys in advance then you should be able to manually memoize by using an attribute set. Instead of:

  get-key = machine: builtins.exec get-key-prog [ machine ];

You could do

  get-key = let
    memo = builtins.listToAttrs (map (machine: {
      name = machine;
      value = builtins.exec get-key-prog [ machine ];
    }) machines;
  in machine: memo.${machine};

shlevy · February 3, 2019, 1:01am

Nix doesn’t have any form of maximal laziness. You can only get it by binding the repeated expression to a variable in a high enough scope.

ElvishJerricco · February 3, 2019, 4:42am

Just realized… my primes example was fast because they were sharing the primes list -_- This version does take twice as long, showing there is no maximal lazines.

{ num }:

with import <nixpkgs/lib>;

rec {
  filter = f: { head, tail }:
    let newTail = filter f tail;
    in if f head then { inherit head; tail = newTail; } else newTail;
  unfoldr = f: b: let a = f b; in { head = a.elem; tail = unfoldr f a.next; };

  isPrime = { head, tail }: n: head * head > n || mod n head != 0 && isPrime tail n;

  get = _n: let
    go = { head, tail }: n: if n == 0 then head else go tail (n - 1);
    primes = { head = 2; tail = filter (isPrime primes) (unfoldr (n: { elem = n; next = n + 2; }) 3); };
  in go primes _n;

  foo = get num;
  bar = get num + get num;
}

@shlevy Any idea why that paper didn’t end up implemented in Nix?

volth · February 5, 2019, 6:56pm

This was one of the ideas as part of the attempt to reduce Nix’s memory consumption a year ago:

to move upwards the thunks which captured no free variables in the current scope, so they will be evaluated only once and won’t prevent the current scope from being GC’ed.

As it not only reduces the memory footprint but also magically converts O(n^2) code to O(n) it might worth a second try

tomberek · August 13, 2021, 10:04pm

What happened to the memoise branch? Why was this not incorporated? I couldn’t find discussion on it.

PlumpMath · April 22, 2022, 10:45pm

https://github.com/svanderburg/nijs/issues/5

How about Dhall? (Even Haskell has no experience, pauses while studying because of other study recently)
I want to run all kinds of languages or execution and then get the evaluation outputs (except files) back to nix.
Security problem?! Is it because of the side-effects problem that can’t be constrained? It would be nice if only the calculated value of text could be piped.

PlumpMath · April 22, 2022, 11:31pm

Replit — Dynamic version for Nix derivations ;;
Ultimately, what you want is already there. I just hadn’t thought of this way.

but, yet hate file-mode direction…
like hope!

shell-cmd | ${nixfunc @dummy-var}