Hi all!
I’m happy to announce that we have open-sourced a Terraform module to automate deploying NixOS to fleets of EC2 instances.
This module is useful when you are deploying and managing long-lived EC2 instances. For stateless workloads I would instead recommend baking AMIs with NixOS/amis (the home for NixOS AMI automation on GitHub) and coordinating their rollout with auto-scaling groups and instance refresh. (We’re working on open-sourcing examples for this as well.)
To deploy, you can either deploy flakes directly (in which case evaluation happens on the EC2 instance, which requires internet access and quite a bit of RAM), or deploy Nix store paths that you first push to an S3 bucket cache.
Fully working examples of deploying both flake refs and Nix store paths can be found in the examples directory of MercuryTechnologies/terraform-aws-ssm-nixos-deploy-document on GitHub.
Basic usage can be as simple as booting EC2 instances with the AMI documented on the NixOS download page, and then creating an SSM association that describes which NixOS configurations to push based on tags.
These tag selectors can be arbitrarily complex. For example, you can split your servers into production and staging groups and push different flakes to each. Please refer to the AWS documentation for all the options.
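As a rough sketch of the AMI lookup step: the example further down references data.aws_ami.nixos without defining it, and a data source along these lines could fill that gap. The owner ID and the name pattern are assumptions based on what the NixOS/amis project publishes, and the release in the pattern is only illustrative, so please check both against the download page before relying on them.

```hcl
# Sketch only: look up the most recent official NixOS AMI.
# ASSUMPTION: 427812963091 is the account that publishes the official NixOS AMIs.
# ASSUMPTION: AMI names follow the "nixos/<release>..." pattern; 24.05 is illustrative.
data "aws_ami" "nixos" {
  owners      = ["427812963091"]
  most_recent = true

  filter {
    name   = "name"
    values = ["nixos/24.05*"]
  }

  filter {
    # t3a instances are x86_64; use "arm64" for Graviton instance types
    name   = "architecture"
    values = ["x86_64"]
  }
}
```

Using most_recent means a later terraform apply may pick up a newer AMI and propose replacing the instances, which is usually not what you want for long-lived servers. Pinning the name to one exact image avoids that.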
For example, here is a group of instances that will pull and deploy a flake every 30 minutes:
resource "aws_instance" "webserver" {
  count         = 4
  ami           = data.aws_ami.nixos.id
  instance_type = "t3a.small"
  tags          = { Role = "webserver" }
}

# pull-based
resource "aws_ssm_association" "webserver" {
  name = module.nixos_deploy_document.id
  targets {
    key    = "tag:Role"
    values = ["webserver"]
  }
  parameters = {
    installable = "github:myorg/myrepo#webserver"
  }
  # State Manager expects a cron(...) or rate(...) expression;
  # rate intervals for associations must be at least 30 minutes
  schedule_expression = "rate(30 minutes)"
  # roll out to at most half of the targets at a time
  max_concurrency     = "50%"
  # stop the rollout once half of the targets have failed
  max_errors          = "50%"
}
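The production and staging split mentioned earlier could look roughly like the sketch below: one association per environment, each selecting on a tag and pushing its own flake output. The Environment tag key and the webserver-staging / webserver-production flake attribute names are hypothetical. Only the installable parameter and module.nixos_deploy_document.id come from the example above.

```hcl
# Sketch only: separate associations for staging and production.
# ASSUMPTION: instances carry a hypothetical "Environment" tag.
# ASSUMPTION: the flake exposes hypothetical outputs named
# "webserver-staging" and "webserver-production".
resource "aws_ssm_association" "webserver_staging" {
  name = module.nixos_deploy_document.id

  targets {
    key    = "tag:Environment"
    values = ["staging"]
  }

  parameters = {
    installable = "github:myorg/myrepo#webserver-staging"
  }

  schedule_expression = "rate(30 minutes)"
}

resource "aws_ssm_association" "webserver_production" {
  name = module.nixos_deploy_document.id

  targets {
    key    = "tag:Environment"
    values = ["production"]
  }

  parameters = {
    installable = "github:myorg/myrepo#webserver-production"
  }

  # roll out more cautiously in production
  schedule_expression = "rate(1 day)"
  max_concurrency     = "25%"
  max_errors          = "10%"
}
```

Keeping the two associations separate lets each environment have its own schedule and error thresholds, at the cost of some duplication. A for_each over a map of environments would be a reasonable way to tidy that up.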
Let me know what you think!