Four years ago, @sternenseemann showed that builtins.genericClosure can serve as a trampoline in Nix, encoding each step of a tail-recursive function as a “node” and letting the C++ worklist loop drive iteration instead of Nix call frames. [1] It worked at 10,000 iterations where plain recursion overflows, but the overhead from copying state at every step led him to call the approach “pretty much useless.” Worse, thunks accumulate in the non-key fields and overflow the C++ stack when you force the result. “I’ve not toyed with this,” he wrote.
I’ve been building nix-effects, an algebraic effects library for Nix, and the interpreter needs to chain a lot of operations through a freer monad. The thunk problem was the wall.
The fix is one line:
key = builtins.deepSeq newState (step.key + 1);
genericClosure forces the key field at every step for deduplication. Embed deepSeq newState inside that expression and you piggyback on the forcing. No thunk chain forms. Paste into nix repl:
let
trampoline = init: step:
let
steps = builtins.genericClosure {
startSet = [{ key = 0; state = init; }];
operator = s:
let result = step s.state;
in if result.done then []
else [{
key = builtins.deepSeq result.state (s.key + 1);
state = result.state;
}];
};
in (builtins.elemAt steps (builtins.length steps - 1)).state;
in (trampoline { i = 0; total = 0; } (s:
if s.i >= 100000
then { done = true; state = s; }
else { done = false; state = { i = s.i + 1; total = s.total + 1; }; })).total
Returns 100000. Now remove the builtins.deepSeq result.state from the key line and run it again. genericClosure churns through all 100,000 steps without complaint. Then .total tries to force a chain 100,000 thunks deep and dies. Same trampoline, same steps. The failure shows up when you try to use the result, not during the computation. Looks like the trampoline is broken when it isn’t. That misdirection is why sternenseemann shelved it.
Trade-offs. genericClosure’s std::map adds O(log n) per step for key dedup, and it shows: at 1M iterations the trampoline takes about a second versus ~90ms for foldl' doing the same work (roughly 11x overhead, benchmarked with hyperfine, Nix 2.31). If you know the iteration count, use foldl'. The trampoline is for when you don’t, and for compound state that foldl' can’t handle without building its own thunk chains.
Full writeup in a blog post: Trampolining Nix with genericClosure (the thunk trap mechanism, a real freer monad interpreter, benchmarks).
To the best of my knowledge, nobody has documented the deepSeq-in-key trick before. I could be wrong; it’s the kind of thing that might live in someone’s dotfiles and never get written up. Has anyone hit the stack limit in production Nix code, and is there a realistic path for builtins.trampoline (#8430)?
[1] @sternenseemann, Tail Call Optimization in Nix Today, February 2022.