Introducing bento, a NixOS deployment framework

What’s the difference between NixOS for workstation and NixOS for server? It’s just NixOS in the end :thinking:

Except that if your servers are interconnected, lazy updates are not suitable.

Ha, glad you asked! I just felt the need to embed this in a bigger context w.r.t. ecosystem semantics: NixOS for Workstations vs NixOS for Servers

Since I’m looking at Nix 90% through the “for Business” lens, I’ve felt that mismatch between the idealized and the purported usage scenarios tacitly happening in the past, so I thought this might merit a different framing.

This was just a first take, but I hope that framing goes somewhat in the right direction and leads us to a marginal ecosystem improvement, so the creator will.

This makes me realize that must be why I named bento “a deployment framework” in this thread title: it’s something you can build upon.

I wouldn’t expect any business to use it as is :scream_cat: :scream_cat: but rather as a foundation to build something matching their requirements. They could throw the code away entirely and just keep the idea, if this is enough for them :smiley:

I added a way to track the state of remote systems
https://asciinema.org/a/519060

Right now bento relies on sending configuration files through sftp and then running nixos-rebuild, with or without flakes. This involves a lot of conditionals, and I’d like to make things simpler.

In the current state, it’s also not possible to use a single flake file to manage all hosts. This is going to change.

I found a way to transfer a NixOS configuration as a single file. It will be transmitted over sftp, so I’ll be able to tell whether a client is running the same derivation as the one currently on the sftp server, which solves a lot of problems. I’ll be able to get rid of nixos-rebuild too, and the client won’t need to know whether it’s using flakes or not.

Create a derivation file for the system, using flakes. I still need to figure out how to do this without flakes:

DRV=$(nix path-info --json --derivation .#nixosConfigurations.bento-machine.config.system.build.toplevel | jq -r '.[].path')

make the result of $DRV available to the remote machine

nix-build $DRV -A system   (or nix build $DRV)
sudo result/bin/switch-to-configuration switch (or boot)

Edit: getting the derivation path without flakes:

nix-instantiate '<nixpkgs/nixos>' -A config.system.build.toplevel -I nixos-config=./configuration.nix
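
The flake and non-flake commands above could be hidden behind one small helper, which is the kind of conditional this change is meant to keep in a single place. This is only a sketch: `drv_command` is a hypothetical name and not part of bento, and it assumes that a host directory containing a `flake.nix` should be treated as a flake. It prints the command instead of running it, so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper, not part of bento: print (do not run) the command that
# yields the system derivation path for one host directory.
# $1: host directory, absolute or starting with "./" so Nix treats it as a path
# $2: attribute name under nixosConfigurations (only used for flakes)
drv_command() {
    dir=$1
    host=$2
    if [ -f "$dir/flake.nix" ]; then
        # Flake case: same attribute path as the nix path-info call above.
        printf '%s\n' "nix path-info --derivation $dir#nixosConfigurations.$host.config.system.build.toplevel"
    else
        # Non-flake case: nix-instantiate prints the .drv path on stdout.
        printf '%s\n' "nix-instantiate '<nixpkgs/nixos>' -A config.system.build.toplevel -I nixos-config=$dir/configuration.nix"
    fi
}
```

Once the printed command looks right, a caller could run it, for example with `DRV=$(eval "$(drv_command ./nas nas)")`, where `./nas` is a made-up directory.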

Now I just need to ensure the result only contains what’s required for this host.

Related: NixOS: switch-to-configuration script does not correctly add a boot entry when executed standalone · Issue #82851 · NixOS/nixpkgs · GitHub

I may have been a bit too enthusiastic, because it doesn’t seem to do what I thought :sweat_smile:

That’s potentially a lot of headache avoided, thank you very much :star_struck:

Now featuring the time since the last update and, if the host is not up to date, the time since the new configuration became available.

I’ve been hitting issues with nixos-rebuild. It’s interesting, because the command doesn’t correctly report that it’s failing.

Bento is now reporting issues like not enough disk space, but ultimately I need it to report the current version of the system to compare with what we have locally :+1:t3:

https://github.com/NixOS/nixpkgs/issues/189966

Is this asking for help?

Anyways, just in case it’s (tangentially) useful: https://github.com/input-output-hk/bitte/blob/f452ea2c392a4a301c1feb20955c7d0a4ad38130/modules/terraform.nix#L191-L194

I added the feature to compare the expected NixOS version and the last reported NixOS version

   machine   local version   remote version              state                                     time
   -------       ---------      -----------      -------------                                     ----
  kikimora        996vw3r6      996vw3r6 💚    sync pending 🚩       (build 5m 53s) (new config 2m 48s)
       nas        r7ips2c6      lvbajpc5 🛑 rebuild pending 🚩       (build 5m 49s) (new config 1m 45s)
      t470        ih7vxijm      ih7vxijm 💚      up to date 💚                           (build 2m 24s)
        x1        fcz1s2yp      fcz1s2yp 💚      up to date 💚                           (build 2m 37s)

The state “sync pending” means we updated the files used by Bento (mostly 2 shell scripts to download and run nixos-rebuild). “Rebuild pending” appears when the hash differs, which implies a rebuild will be done on the remote NixOS.
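
The three states in the table could be derived along these lines. This is a rough sketch rather than bento's actual code: `short_version` and `fleet_state` are hypothetical names, the 8-character version is assumed to be the start of the store path hash, and giving “rebuild pending” priority over “sync pending” is also an assumption.

```shell
#!/bin/sh
# Hypothetical helpers, not bento's actual code.

# Shorten a store path to the 8-character "version" shown in the table.
# Assumption: the version is the first 8 characters of the store hash.
short_version() {
    basename "$1" | cut -c1-8
}

# $1: expected (local) version
# $2: last reported (remote) version
# $3: "yes" if the client already has the current bento files
fleet_state() {
    if [ "$1" != "$2" ]; then
        # Hashes differ: the remote system will have to rebuild.
        echo "rebuild pending"
    elif [ "$3" != "yes" ]; then
        # Same system, but the bento scripts themselves changed.
        echo "sync pending"
    else
        echo "up to date"
    fi
}
```

With the values from the table, `fleet_state r7ips2c6 lvbajpc5 no` would report a pending rebuild for nas, while t470 and x1 compare equal and are reported as up to date.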

It’s not perfect, but I finally found out how to use systemd sockets to run something. There are two methods. One uses a socket and a .service with the same name, where the service takes over the listening socket. The other uses a socket and a template unit whose name ends with @, with Accept=yes in the socket. I needed the latter.

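  # Socket unit: listens on TCP port 51337.
  systemd.sockets.listen-update = {
    enable = true;
    wantedBy = ["sockets.target"];
    requires = ["network.target"];
    listenStreams = ["51337"];
    # Accept=yes: one instance of listen-update@.service is started per connection.
    socketConfig.Accept = "yes";
  };

  # Template unit instantiated for each incoming connection.
  systemd.services."listen-update@" = {
    path = with pkgs; [systemd];
    enable = true;
    # Attach the connection to the process, so command output goes back to the client.
    serviceConfig.StandardInput = "socket";
    serviceConfig.ExecStart = "${pkgs.systemd.out}/bin/systemctl start bento-upgrade.service";
    serviceConfig.ExecStartPost = "${pkgs.systemd.out}/bin/journalctl -f -n 10 --no-pager -u bento-upgrade.service";
  };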

If you open http://localhost:51337 in a web browser, it starts the update process and shows the journal log of the update service :smiley:

Bento :bento: now supports rollback! The fleet display shows the rollback status, so you can easily be aware that something went wrong.

It’s also getting closer to working as a self-contained script that doesn’t need to be stored at the top level of the repo. It’s still shell script only (not even bash), about 400 lines. I may rewrite it in something else at some point.

Also, I’d like it to be able to use a single flake with many hosts in it, instead of putting one flake in each host directory. Ideally, it should support both.

A while ago, I started GitHub - divnix/hive: The secretly open NixOS-Society which deals with a certain folder layout for organizing users and their hosts.

It would be nice if bento had only a weak opinion on the flake layout.

The flake layout is prime real estate, and in order to limit boilerplate bloat, I appreciate it when a vertical framework considers horizontal integration points (i.e. integration points that aren’t anchored to a particular vertical schema).

I’d then like to try an integration upon reviving that project (for workstations :stuck_out_tongue_winking_eye: ).

The next step is to handle flakes with multiple machines.

The current layout is to have the configuration file of each host in its own directory. In the future, for directories containing a flake.nix file, bento will look for the machines that have a configuration in it and iterate over that host list, instead of using the directory name as the machine name.

This doesn’t break setups without flakes (it’s still experimental!): we can still use separate directories, each with its own flake or non-flake configuration, and we can also have a single directory with a flake and many hosts, and handle that.
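
The lookup described above might look roughly like this. It is a sketch under assumptions, not bento's implementation: `flake_hosts` and `hosts_for_dir` are hypothetical names, the flake is assumed to expose its machines under `nixosConfigurations`, and `flake_hosts` needs nix (with flakes enabled) and jq.

```shell
#!/bin/sh
# Hypothetical sketch, not bento's actual code.

# List the host names defined in a flake.
# $1: flake directory, absolute or starting with "./"
flake_hosts() {
    nix eval --json "$1#nixosConfigurations" --apply builtins.attrNames | jq -r '.[]'
}

# Print one host name per line for a configuration directory.
hosts_for_dir() {
    dir=${1%/}
    if [ -f "$dir/flake.nix" ]; then
        # One flake, possibly many hosts: ask the flake itself.
        flake_hosts "$dir"
    else
        # Non-flake layout: the directory name is the machine name.
        basename "$dir"
    fi
}
```

A caller could then loop over `hosts_for_dir "$dir"` for every directory, so both layouts go through the same code path.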

I need to hack a bit now :rainbow:

I’m currently writing a pattern piece for Standard about CD. After some discussion with a friend and colleague, I realized there is one huge advantage of pull-based approaches:

They don’t require you to punch security holes into your target infrastructure, since the target infrastructure never needs to listen.

The ideal would therefore be pull-based approaches (like bento) that aren’t also choreography (unlike bento).

Eureka! bento-server as a “clearance” state machine that can be interacted with through a well-defined API. Sounds like something? :grinning_face_with_smiling_eyes:

I’m not interested in going into this.

But as you said, with a pub/sub or polling agent on each machine, you can schedule and order system upgrades to match your needs. Each server could always get the latest build ASAP, but wait for the green light to switch its configuration.

If you need to update a pool, you could group servers in a pool, and just give the green light to them one by one.
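
To make the green-light idea concrete, here is a minimal sketch in which clearance is modelled as marker files in a directory that the polling agents can read. Everything in it is an assumption for illustration: the function names, the marker-file convention and the directory are made up, and bento does not provide any of this.

```shell
#!/bin/sh
# Hypothetical sketch: clearance is modelled as one marker file per host.
# $1 is always the clearance directory. Not part of bento.

# Operator side: give one host ($2) the green light.
give_green_light() {
    mkdir -p "$1" && : > "$1/$2"
}

# Agent side: decide what to do on this polling round for host $2.
# Prints "switch" when the host is cleared, "wait" otherwise.
agent_decision() {
    if [ -f "$1/$2" ]; then
        echo switch
    else
        echo wait
    fi
}

# Operator side: clear a pool one host at a time, in the order given.
# Clears and prints the first host without a green light; fails if none is left.
clear_next_in_pool() {
    dir=$1
    shift
    for host in "$@"; do
        if [ ! -f "$dir/$host" ]; then
            give_green_light "$dir" "$host"
            echo "$host"
            return 0
        fi
    done
    return 1
}
```

In a real rollout the operator would presumably wait for the cleared host to report a successful switch before calling `clear_next_in_pool` again.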

The project is now hosted on GitHub GitHub - rapenne-s/bento: A KISS deployment tool to keep your NixOS fleet (servers & workstations) up to date.
It seems nixers enjoy GitHub, and I wouldn’t prevent them from contributing :+1:t3:

I think you are. :grinning_face_with_smiling_eyes:

Isn’t that topic to which servers would subscribe the very same “clearance” authority that I describe?