I’ve been exploring how to train ML models reproducibly with Nix, in a style similar to "Practical Nix for Data Science: Part 2".
However, one shortcoming of this approach is that builds can run for hours or days, and you’d want to monitor and graph the training as it progresses (with tools like TensorBoard and wandb).
How would you go about solving this in an elegant way? One thought I had is to create a build hook that watches some log files in the build directory:
nix-build -E '(import <nixpkgs> {}).callPackage ./model.nix {}' --build-hook "$(pwd)/builder.sh"
But it’s not clear to me how --build-hook works, or whether it ought to be used at all. It would be annoying if it were deprecated in the future. Perhaps you have a better idea?
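To make the idea concrete, here is a rough sketch of what I imagine the watcher doing, independent of how it ends up being launched. The function names (wait_for_log, follow_log), the train.log file name and the BUILD_DIR variable are all hypothetical, and it assumes the build directory is readable by the user running the watcher, which a sandboxed Nix build may not allow.

```shell
# Hypothetical watcher sketch: wait for a training log to appear, then stream it.
# Nothing here is Nix-specific; the directory and file name are assumptions.

# wait_for_log DIR NAME [TIMEOUT_SECONDS]
# Polls once per second until DIR/NAME exists.
# Prints the path on success, returns 1 if the timeout is reached first.
wait_for_log() {
  local dir name timeout waited
  dir=$1
  name=$2
  timeout=${3:-60}
  waited=0
  while [ ! -f "$dir/$name" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  printf '%s\n' "$dir/$name"
}

# follow_log FILE
# Prints FILE from the beginning and keeps printing lines as they are appended.
# A real watcher would forward these lines to tensorboard or wandb instead of stdout.
follow_log() {
  tail -n +1 -F "$1"
}
```

With these defined, something like `log=$(wait_for_log "$BUILD_DIR" train.log 600) && follow_log "$log"` would stream the log for as long as the build keeps writing to it.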