Managing multiple machines, async updates

Hi there, I manage a fleet of a few hosts, each having its own nixosConfiguration output in the flake, with lots of code reuse. So far I deploy to each machine basically with nixos-rebuild switch --build-host foo --target-host foo --use-remote-sudo (actual script).

I also manage some machines that aren’t always up (e.g. laptops, a liveusb system). For these machines, my deployment method is a problem: I can’t nixos-rebuild to a target host that’s not available at that time. Going around and booting each laptop is a pain.

I’d love a deployment tool that:

  • if a host is available when I’m operating, lets me update that host when I run a command;
  • if the host is not available, can build the configuration for that host, keep a GC root for that configuration, and somehow advertise that there’s an update available for that host;
  • has an on-host component that polls for available updates and switches to them;
  • bonus points if the on-host component can interact graphically with the user to confirm that it’s ok to download/apply the update; extra bonus points if I can mark an update as urgent, bypassing this interaction (e.g. for security patches).
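
To make the first two points concrete, here is a rough Python sketch of the operator-side decision. It is not an existing tool, only an illustration under some assumptions: the SSH host name matches the nixosConfigurations attribute name, the script runs from the flake root, the operator machine can build for the target's architecture, and the `gcroots` directory name and all function names are made up for this example. It relies on `nix build --out-link`, whose result symlink keeps the built system alive as a GC root.

```python
import subprocess
import sys
from pathlib import Path

FLAKE = "."                    # assumption: run from the flake root
GCROOT_DIR = Path("gcroots")   # hypothetical directory holding result symlinks


def is_reachable(host: str, timeout: int = 5) -> bool:
    """Return True if a non-interactive SSH connection to the host succeeds."""
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", "-o", f"ConnectTimeout={timeout}",
         host, "true"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def plan_deploy(host: str, reachable: bool) -> list[str]:
    """Pick the command to run for one host (pure function, no side effects)."""
    if reachable:
        # Host is up: same command as in the post, plus an explicit flake ref.
        return ["nixos-rebuild", "switch", "--flake", f"{FLAKE}#{host}",
                "--build-host", host, "--target-host", host,
                "--use-remote-sudo"]
    # Host is down: build its system closure locally; the out-link symlink
    # is registered as a GC root, so the build survives garbage collection.
    attr = f"{FLAKE}#nixosConfigurations.{host}.config.system.build.toplevel"
    return ["nix", "build", attr, "--out-link", str(GCROOT_DIR / host)]


def deploy(host: str) -> None:
    """Deploy to the host if it is up, otherwise pre-build and pin its system."""
    GCROOT_DIR.mkdir(parents=True, exist_ok=True)
    subprocess.run(plan_deploy(host, is_reachable(host)), check=True)


if __name__ == "__main__":
    for name in sys.argv[1:]:
        deploy(name)
```

How the pre-built configuration is then advertised to the offline host, and how the host picks it up, is exactly the part this sketch leaves open.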

Do you have suggestions on how to achieve this?

Your requirements look like a perfect fit for bento; see “Introducing bento, a NixOS deployment framework”.