How do I approach optimizing an expression that takes a long time to evaluate? Is there any form of Nix profiler? Are there easy ways to experiment with techniques like memoization?
I mean this as a general issue but the expression that I am dealing with now is the hydraJobs attribute of flake github:lukego/nix-cl-report/ac1901afba8ca8501c12cfcd744b2ce0ae59a454.
This expression produces several thousand derivations that each reference large sets of build inputs. If I reduce the size of the build inputs then the expression evaluates much more quickly. I assume that a computation somewhere is being repeated.
I assume problems like this can be solved by strategically reusing/caching/memoizing values, but it's not clear to me how to identify which ones or how to make the caching happen.
EDIT: My problem turns out to be specific to Hydra evaluating expressions using hydra-eval-jobs processes (see discussion below.)
I have not actively monitored the time spent on eval, though I ran the following command, and when I came back from a short discussion with a coworker it was already querying substituters. So evaluation didn't take longer than ~15 minutes.
I have to be honest: after printing the initial list of ~27000 items to build, it takes another 20 minutes before the first thing actually gets substituted.
Though then I have a constant flow of active downloads.
I do not know, though, what Hydra actually counts as "evaluation time".
It might be that it counts the related builds as part of the evaluation here, as it can't start a new evaluation while related builds haven't finished anyway.
I think I also remember that, back when I still had a Hydra running, the displayed "eval time" was indeed dependent on the amount of stuff to build.
Interesting. I don't think that's the situation here.
My "evaluation" has been running for over one hour now. I don't see any downloads in the Hydra logs (journalctl -f) as I usually would.
I do see a procession of hydra-eval-jobs processes using ~100% CPU and ~4GB RAM. Running top, the picture is consistent, but there is a new PID on each refresh.
I wonder what's going on? Is Hydra running a long series of sub-evaluations in separate processes before kicking off substitutions and builds?
I would first search through your code to see whether you are iterating anywhere over all your nixosConfigurations and maybe recursing into them to find attributes about machines, e.g. to generate update scripts or so.
I suspect the issue is that Hydra is evaluating job derivations separately and that this inhibits memoization.
I say this because running top on the Hydra server shows a series of short-lived (~1sec) hydra-eval-jobs processes running at maximum CPU.
It seems unfortunate if (a) these processes are repeating a lot of evaluation work already performed by the previous process and (b) running one-at-a-time despite the machine having many CPU cores.
I thought that as well after reading Eelco's thesis, but then I did a test calling a function that performs a long-running computation twice with the same arguments, and from the time it took to run the code I'm 100% sure it ran the computation twice.
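For anyone who wants to reproduce that kind of test, here is a minimal sketch. The file name sharing.nix and all the names in it are made up for illustration; it is not taken from the flake discussed above. It relies on builtins.trace printing a message each time the function body is actually evaluated, so you can count evaluations instead of timing them.

```nix
# sharing.nix: a hypothetical test file; all names here are illustrative.
let
  # Deliberately slow function. The trace message is printed each time
  # the function body is actually evaluated.
  slow = n:
    builtins.trace "evaluating slow ${toString n}"
      (builtins.foldl' (acc: i: acc + i) 0 (builtins.genList (i: i) n));

  # Two separate calls with the same argument: the body runs twice,
  # because function applications are not memoized.
  calledTwice = slow 1000000 + slow 1000000;

  # One call bound to a name: the thunk is forced once and its value
  # is reused for both operands.
  shared = let v = slow 1000000; in v + v;
in
{ inherit calledTwice shared; }
```

Evaluating this with nix-instantiate --eval --strict sharing.nix should print the trace message three times in total: twice for calledTwice and once for shared. So as far as I can tell, sharing only happens when the value is bound to a name (or an attribute) and reused within a single evaluation, which would also explain why separate hydra-eval-jobs processes cannot share work with each other.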