I finally wrapped up the approach I’m using into something coherent on GitHub.
To sum up: to manage my infra in AWS, I keep a single git repository that holds both the Terraform configuration and a NixOS flake defining the configurations for all the hosts I might need.
By running `terraform apply` I both create new infra (EC2 machines) and provision it, with the configuration for each host passed via an S3 bucket. New hosts bootstrap automatically via a `user-data` init script. After that, a system daemon on each host monitors its desired configuration and rebuilds the host whenever that configuration is updated.
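To make the daemon part more concrete, here is a minimal sketch of the polling logic in Python. It is not the code from the repository, just an illustration under assumptions: `Reconciler`, `fetch_desired`, `rebuild` and `run_forever` are hypothetical names, and I assume the desired configuration can be identified by a single string (for example a revision or a store path read from the bucket).

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reconciler:
    """Polls a desired-configuration marker and rebuilds when it changes."""

    # Hypothetical hook: returns an identifier of the desired configuration,
    # e.g. the contents of an object read from the S3 bucket.
    fetch_desired: Callable[[], str]
    # Hypothetical hook: switches the host to the given configuration
    # and returns True on success.
    rebuild: Callable[[str], bool]
    # Last configuration that was applied successfully, if any.
    applied: Optional[str] = None

    def step(self) -> bool:
        """Run one poll; return True if a rebuild succeeded."""
        desired = self.fetch_desired()
        if desired == self.applied:
            return False  # already up to date, nothing to do
        if self.rebuild(desired):
            self.applied = desired
            return True
        return False  # rebuild failed; it is retried on the next poll


def run_forever(reconciler: Reconciler, interval_seconds: float = 60.0) -> None:
    """Poll at a fixed interval (the interval is an assumed value)."""
    while True:
        reconciler.step()
        time.sleep(interval_seconds)
```

In a real setup the two hooks would presumably shell out to something like the AWS CLI and `nixos-rebuild switch`, but that wiring is left out here on purpose. The point of the design is that the host pulls its own configuration, so nothing has to connect to it from the outside.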
This has many operational benefits:
- It integrates seamlessly with the Terraform workflow: the desired NixOS configuration and the deployed one are tracked exactly like any other Terraform state.
- It avoids sshing into machines to update them (which is sometimes undesirable due to network restrictions, or requires another set of privileges to take care of), and I don’t have to touch my YubiKey (hardware SSH key) many times just to update a handful of machines at once.