Is anyone using NixOS as their production deployment OS?

If so, what are the challenges you encountered?


I took the easy way and deployed to AWS, where I can build a ready-to-launch AMI with NixOS.

Beyond that, it has mostly been an exercise in understanding the NixOS approach to managing a VPS.

For clusters of VPSes, NixOS has been a good foundation, and I found myself using an orchestrator like HashiCorp Nomad (which supports VPSes) to run and deploy services.

I usually still use Terraform to provision machines, or more recently Pulumi.

Nice, good to hear. Do you have multiple hosts when you deploy? I need to wrap a bunch of hosts in an ASG and deploy to that. I think NixOS’ nixos-rebuild switch --target-host command is going to make it a lot easier.

I am using Pulumi for spinning up instances, but I had never heard of HashiCorp Nomad. It looks interesting. When you say orchestration, do you mean orchestrating a single-stage deployment? I’m assuming you would use something like Jenkins to orchestrate a multi-stage deployment, right? We have dev, pre-prod, and production environments.

I deploy 3 machines currently. The greatest upfront challenge is definitely getting a good workflow and building up the surrounding infrastructure (tooling and workflows mainly), but once you have that it’s a really enjoyable experience.

For provisioning bare metal (most of my machines), I have a workflow involving nixos-generators, nixos-anywhere, and some custom tooling. I essentially build a tailored installer image depending on the machine (architecture, etc.), flash it to a USB drive, boot it, then use nixos-anywhere to bootstrap and provision it. There are still a few manual steps here and there, but even when provisioning many machines (I’ve briefly had ~20 provisioned through a similar system) it’s not that bad: it essentially boils down to plugging in the USB drive, booting to RAM, repeating with the next machine, and then provisioning them all simultaneously with nixos-anywhere.
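To make the "tailored installer image" step concrete, here is a rough sketch of how such an image could be declared in a flake with nixos-generators. It is only an illustration under assumptions: the output name `installer-x86`, the module path `./installer.nix`, and the choice of the `install-iso` format are hypothetical and not taken from the workflow described above.

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    nixos-generators = {
      url = "github:nix-community/nixos-generators";
      # Reuse the same nixpkgs so the image matches the rest of the flake.
      inputs.nixpkgs.follows = "nixpkgs";
    };
  };

  outputs = { self, nixpkgs, nixos-generators, ... }: {
    # Hypothetical output name; one such attribute per machine type
    # (for example, another one under packages.aarch64-linux).
    packages.x86_64-linux.installer-x86 = nixos-generators.nixosGenerate {
      system = "x86_64-linux";
      format = "install-iso";
      # Assumed module holding SSH keys, networking, and similar
      # settings the installer needs so it is reachable after boot.
      modules = [ ./installer.nix ];
    };
  };
}
```

With something like this, `nix build .#installer-x86` would produce the image to flash, and each additional architecture just becomes another output with a different `system`.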

To actually manage all the machines, I use deploy-rs. It’s a joy to work with, super easy to configure, and because of its magic rollback feature I’m not afraid of changing networking and other critical settings. One drawback is having to build all configurations even if you only want to deploy one, which can cause extreme slowdowns if, e.g., one of your machines uses a different architecture, a custom kernel, non-cached software, or even worse, all three (yay for my Raspberry Pi running home-manager on ZFS).
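For anyone who hasn't seen deploy-rs, a minimal flake along the lines of its README looks roughly like the sketch below. The node name `web-1`, its hostname, and the `./hosts/web-1/configuration.nix` path are placeholders made up for illustration. `magicRollback` is already on by default and is only spelled out here to show where it lives.

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    deploy-rs.url = "github:serokell/deploy-rs";
  };

  outputs = { self, nixpkgs, deploy-rs, ... }: {
    # Placeholder machine; the configuration path is an assumption.
    nixosConfigurations.web-1 = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [ ./hosts/web-1/configuration.nix ];
    };

    deploy.nodes.web-1 = {
      hostname = "web-1.example.com"; # placeholder address
      profiles.system = {
        user = "root";
        # Wraps the NixOS system closure in an activation script.
        path = deploy-rs.lib.x86_64-linux.activate.nixos
          self.nixosConfigurations.web-1;
        # Roll back if the deployer cannot confirm the new generation,
        # e.g. after a change that breaks networking or SSH.
        magicRollback = true;
      };
    };

    # Lets `nix flake check` validate the deploy definitions.
    checks = builtins.mapAttrs
      (system: deployLib: deployLib.deployChecks self.deploy)
      deploy-rs.lib;
  };
}
```

A single node is then deployed with `deploy .#web-1`, while a bare `deploy` targets every node in the flake.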

My biggest pain point is provisioning cloud servers since, depending on the provider, it can be pretty cumbersome. nixos-infect solves a lot of problems here, but sadly not everything.

Something I’ve been meaning to get to is using NixOS for the machines and then running Kubernetes on top for all the services. As much as I love NixOS, there are some things that are just better suited to k8s. Worth noting is kubenix.