Navi (v0.0.x): A highly experimental NixOS deployment Tool/TUI & infrastructure lifecycle manager

I just made the repository for navi public: https://github.com/cafkafk/navi.

It’s a specialized deployment tool for NixOS, forked from Colmena. It extends the core deployment capabilities you might be used to with a persistent daemon architecture, integrated infrastructure provisioning via Terraform/Terranix, and a terminal user interface (TUI) for managing large-scale fleets.

I built this because I wanted deep native integration between infrastructure provisioning and system configuration. While I could have just used an existing deployment tool as an input, I wouldn’t have had the necessary control over the internals. And there is a lot to be gained from this kind of tight integration.

This is actually based on some older work, but at vitvio, we found that we needed a tool that would scale with us across multiple teams and larger infrastructures than are typically managed in homelab environments. So I made that tool… enjoy!

Some key features include:

  • TUI Dashboard: Launch navi tui for real-time fleet management, live monitoring, and interactive deployments.
  • Infrastructure Provisioning: Define Terraform resources alongside NixOS configurations in the same Hive, and manage lifecycle commands directly (navi provision).
  • Daemon Architecture: A background service manages connections and task queues (i.e. deploy in the background).
  • Disk Unlocking: Remote unlocking for encrypted ZFS pools via initrd SSH.
  • Provenance: Answers questions such as “what is outdated”, “who did this” and “what commit is this host currently on”.
  • Registrant support: Because it’s annoying that I have to touch HTTP just to deal with my domains.

It’s currently pretty GCP/bare-metal/nixos-anywhere centered (as those are my use cases), but it supports the whole lifecycle of provisioning, updating, reprovisioning, and reinstalling a large fleet of machines across GCP and bare metal, from terraform/tofu/nothing → nixos-anywhere → normal “boot --reboot” ops.

A few major disclaimers:

  1. Security Notice: This tool is currently highly experimental. Do not use with production API keys or on multi-user systems, as credential handling is not yet hardened.
  2. Documentation: As of right now, the source code is the documentation. I don’t have the time to write or maintain formal docs at this stage. It’s often easier to just read the source code anyway.
  3. If you find yourself asking, “Why should I use this instead of [insert existing tool]?”, the honest answer is: you really shouldn’t. Please don’t. At least, not yet. Expect breaking changes without notice.

Feel free to poke around the source code or try it out in a safe environment!

Thank you for sharing this open source. Please post an update when it’s less experimental.

I will indeed. Also just to clarify, I put it like this to make it clear that my current bandwidth isn’t really there to provide much support.

But the only things I’m really overly cautious about security-wise are running on a system with multiple users (since they may be able to read the daemon’s socket) and using Namecheap tokens (since their API is deeply antiquated and I’ve admittedly made some concessions in the name of security that shouldn’t be used broadly).


If anyone is actually trying this out or using it on a real system, I’d also love to hear about it to get feedback on your experience. Feel free to open UX issues just so I can map friction points.

Can’t promise I’ll get around to them, but it’s nice to know what’s wrong for when I do have some time.
