Our family is redoing our home lab to use NixOS. Currently, we SSH into the server and run nixos-rebuild, but we are exploring other deployment options.
Our top contenders are NixOps, deploy-rs, or just using --target-host with nixos-rebuild.
All three options seem pretty straightforward, so we may just try all three and see what we like. At present, my goal is to build locally, then deploy. All target and build architectures are currently NixOS and x86_64, although it is not out of the question that we might throw some ARM targets in the mix at some point.
Have any of you tried 2 or more of these options, and have any reflections to add? What did you love about a particular tool?
nixops is abandonware. It was kind of magical for AWS deployments, where it could do everything from scratch.
If you have physical access to the machines, nixos-rebuild is the simplest and most straightforward one.
I once locked myself out of a machine with nixos-rebuild because I had disabled the SSH server. Since then I use deploy-rs, which automatically falls back to the previous generation in case of issues. It adds complexity compared to nixos-rebuild, though, so it might not be worth it for you.
Looking at the NixOps repo activity, it wouldn't occur to me that this project is abandonware, other than the oddly recent transition to Python 3. What reasons cause you to brand it abandonware?
Using nixos-rebuild sounds like a good starting place, then. Much appreciated!
we don't want to hurt anyone's feelings by saying that… but very unfortunately it is true
real shame too because to me nixops, conceptually speaking, is the perfect tool for the type of networks i manage
since you said homelab you might consider colmena… it has one nice feature where you can reference configuration from machine a on machine b, which can be really nice if you have a reverse proxy or whatever
+1 to colmena. For me the "reference other machines' configs when generating a machine's config" feature is a game changer. This is something that nixops did originally, and was carried over (almost identically) in later deployment tools like morph and colmena. There are a few minor things that I wish colmena did better (e.g. remote evaluation, nix-output-monitor support, etc.), but it's pretty close to a local optimum for me.
A few things I do with this "cross-referencing other machines' config" feature:
Machines that I configure as being remote builders automatically get added to the /etc/nix/machines of all the other machines that need to perform nix builds.
All the enabled Prometheus exporters on all my machines automatically get scraped by my Prometheus server without having to write any of the plumbing manually.
I auto-generate a wireguard p2p mesh by having each machine look at the wireguard config of other machines in my network's Nix config.
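As a rough illustration of the Prometheus case above, here is a minimal sketch of a module for the node that runs the Prometheus server in a Colmena hive. It relies on the `nodes` module argument that Colmena passes to each node, and it assumes that every node's `networking.hostName` is resolvable from the server and that the exporter in question is the standard node exporter; both are assumptions for the example, not details from the posts above.

```nix
# Module for the Prometheus server node in a Colmena hive (illustrative sketch).
# `nodes` is supplied by Colmena and holds every node's evaluated config.
{ lib, nodes, ... }:
let
  # Keep only the nodes that have the node exporter enabled.
  exporterNodes = lib.filterAttrs
    (_: node: node.config.services.prometheus.exporters.node.enable)
    nodes;
in
{
  services.prometheus = {
    enable = true;
    # One scrape job per exporting node, generated from the other nodes' configs.
    scrapeConfigs = lib.mapAttrsToList
      (name: node: {
        job_name = "node-${name}";
        static_configs = [{
          targets = [
            # Assumption: the hostname resolves from the Prometheus server.
            "${node.config.networking.hostName}:${toString node.config.services.prometheus.exporters.node.port}"
          ];
        }];
      })
      exporterNodes;
  };
}
```

Enabling the exporter on a new machine is then enough for it to show up in the scrape list on the next deploy of the server node.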
I remember deploy-rs not playing well with having to type a password for remote sudo. I know 1-2 other tools I've tried also had similar problems. Not sure if they've been fixed since then.
But if the machine configurations are in flakes, referencing different configs is not a problem; in that case just doing nixos-rebuild --target-host will do the trick, no?
I'm just personally trying to keep all the "third-party" tools to the minimum
This was surprising for me to read, but seems to be true. I've been happily using it for years to maintain my remote (some virtual) machines. I'd been meaning to look into colmena but haven't had an impetus to, given that I haven't had any issues with NixOps. If plans really are to sunset it, then that might be the impetus I need.
Cachix agent looks very nice; thanks for pointing that out.
Our current use case is a single deployment target that has decent compute resources. So we are finding that our original practice of connecting via SSH may not be so bad. Running nixos-rebuild --target-host works, but doesn't necessarily gain us anything over connecting first (and using tmux, which has advantages) then running nixos-rebuild.
Given our scenario, your system.autoUpgrade idea, @hugosenari, is brilliant. Honestly, I didn't know that option existed!
Gonna run that by the Ops team now… (the 16-year-old in the basement). We'll see…
FWIW I've been using deploy-rs as of late for most of my Pis and it's been rather painless. I would like to give colmena a try. Would be nice to just use nixos-rebuild --target-host, but I haven't, and deploy-rs offers those nice checks to avoid mistakes.
Basically, what it does is periodically ask your system to rebuild itself, pointing at a repository.
So I'm not expecting it to update your inputs. And you have to add --refresh, or Nix will use its flake cache (not a fresh pull).
This would be run on hosts I log into less often, or that are used rarely and should be upgraded on boot/resume after a period of time. Previously, I had one or two of those with the auto upgrade service enabled, on channels. I disabled that when everything moved to a system flake.
It's why I haven't used any of the various deployment tools (though bento seems interesting), because they're mostly push style, and this use-case needs something more pull based.
I, too, had missed that there was a flake option for the auto-upgrade service. For me, the idea here is that I update regularly on my active desktop, and push those revisions, including the locked flake inputs, to the repo. So those auto-upgrades update inputs via the git pull of the lock file, and only update to revisions I have already used and built, rather than updating their lock file locally. This also means the content will already be in the store of another local system.
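For reference, a minimal sketch of what such a pull-style setup might look like. The flake URL is a hypothetical placeholder and the schedule is just an example value; the `--refresh` flag is the one mentioned earlier in the thread to avoid a stale flake cache. No lock-updating flags are passed, so the host only moves to whatever revisions are already pinned in the repo's lock file.

```nix
{
  system.autoUpgrade = {
    enable = true;
    # Hypothetical repository; replace with the flake that holds your hosts.
    # Without an explicit "#name", the configuration matching the hostname is used.
    flake = "github:example-user/nixos-config";
    # Re-fetch the flake instead of trusting the local flake cache.
    flags = [ "--refresh" ];
    # Example schedule (systemd calendar expression); pick whatever suits the host.
    dates = "04:40";
  };
}
```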
I might (if I can be bothered) even keep a separate branch for the known-good updates. That's still less branch maintenance than I used to do before consolidating a common base and per-host branches into a single flake branch.
I set this up on one of those hosts shortly after posting here, and initial tests were good. But I had one suspicion, which turned out to be valid on testing again this morning:
The default config of the autoUpgrade service has a timer with the persistent flag set, so it runs when the laptop is resumed after being suspended for a while (overnight, or for several months…).
But the problem I found this morning:
Dec 14 09:05:13 rocinante systemd[1]: Starting NixOS Upgrade...
Dec 14 09:05:13 rocinante nixos-upgrade-start[32301]: warning: you don't have Internet access; disabling some network-dependent features
Dec 14 09:05:13 rocinante nixos-upgrade-start[32298]: building the system configuration...
Dec 14 09:05:13 rocinante nixos-upgrade-start[32355]: warning: you don't have Internet access; disabling some network-dependent features
…
It runs too soon after resume, before the wifi has had a chance to reconnect.
So it probably needs an additional dependency on network-online.target, and/or a delay. Edit: that dependency is already there, but doesn't seem to apply after resume. I'll look further into this.
I use this snippet for fwupd-refresh, but it could be easily used for nixos-upgrade. It just allows the service to restart several times over several minutes without failing… which typically should allow the network time to be up on a laptop.
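A minimal sketch of that kind of retry override, applied to nixos-upgrade. The specific values (one retry per minute, up to five attempts within ten minutes) are illustrative assumptions, not the settings from the post above, and it only helps when the upgrade actually exits with a failure while the network is down.

```nix
{
  systemd.services.nixos-upgrade = {
    serviceConfig = {
      # Retry when the upgrade fails, e.g. because wifi is not back yet after resume.
      Restart = "on-failure";
      # Wait a minute between attempts (assumed value).
      RestartSec = "60";
    };
    unitConfig = {
      # Allow up to 5 starts within 10 minutes before systemd gives up (assumed values).
      StartLimitIntervalSec = 600;
      StartLimitBurst = 5;
    };
  };
}
```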