CI/CD rebuilds via GitHub

I have my server configuration on GitHub and want to trigger deployments/rebuilds via GitHub workflows.

The idea would be to have a very restricted SSH account that is only allowed to pull my flake repo and then trigger a rebuild. Icing on the cake would be a health check and a rollback if that fails.

Is there a better way to approach this?
Any pointers of what to look into?

I was able to accomplish this with Tailscale and Tailscale SSH a while ago.

Idea being that you write a workflow to:

  1. Attach the GitHub runner to your tailnet (ephemerally, and with very restricted permissions)
  2. Build the configuration
  3. Push it to the machine over SSH, which is automatically routed and authenticated through Tailscale.

You can find an example of this here. And I got the automatic rollback by using deploy-rs for deployments, which has that baked in.
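For illustration, a minimal sketch of what such a deploy-rs flake might look like (the hostname and file names are placeholders, not taken from the setup linked above):

{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    deploy-rs.url = "github:serokell/deploy-rs";
  };

  outputs = { self, nixpkgs, deploy-rs }: {
    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [ ./configuration.nix ];
    };

    deploy = {
      nodes.myhost = {
        # Reached over the tailnet via Tailscale MagicDNS (placeholder name)
        hostname = "myhost.example.ts.net";
        profiles.system = {
          user = "root";
          path = deploy-rs.lib.x86_64-linux.activate.nixos
            self.nixosConfigurations.myhost;
        };
      };
      # Roll back automatically when activation fails or when the deploy
      # cannot re-confirm connectivity afterwards
      autoRollback = true;
      magicRollback = true;
    };
  };
}

magicRollback is what saves you from pushing a configuration that breaks SSH: if deploy-rs cannot reconnect after activation, the old profile is restored.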

However, I recently abandoned this in favor of a pull-based CD system with Garnix and system.autoUpgrade. My homelab automatically polls GitHub and builds any new code on the machine. I like this because it simplifies my setup and avoids slow GitHub builds.

I got this from the excellent book, NixOS in Production. Totally recommend.

The only downside is that I lost the automatic rollback with this method. Still working that out.


How do you secure said setup? From what I’ve heard, keeping root SSH keys around on a personal computer is not a good idea, but wouldn’t such a setup allow a script to push a malicious commit to the repo, which would then push another script to the server with root permissions? That seems equivalent to the aforementioned risk (with some added security by obscurity).

I know I’m a bit paranoid (nobody would spend enough time to investigate my setup in such detail), but I feel like many other push-based tools have similar issues (last I checked, most of the Nix ones had problems when using a user SSH key and having to handle sudo being invoked).

@Adrielus Well, anyone allowed to push to the GitHub repo could indeed install any malware they want. So in a sense, the SSH key allowed to push to the GitHub repository replaces the root SSH key, and the password/2FA of the GitHub account is like the root password, so I don’t see why it would be less secure than a normal root+SSH config if you properly configure the GitHub account/repo.

Now, indeed, if you want extra security and are worried that your regular user should not be allowed to push to the server, you might need to create a new account just for the server… You can also enable things like mandatory reviews/CODEOWNERS to force every commit to be reviewed, and use protected branches + required status checks to forbid merges that do not fulfil some requirements (tests, signed commits…). See Protected branches and required status checks - The GitHub Blog, and also git - How to limit pushing operation to allow only commits that are signed with GPG in github - Stack Overflow

PS: people interested in CI/CD for GitHub Pages from a Nix-built website can look at my own config here: https://github.com/leo-colisson/website/blob/e21420a717c10ddc70d3740a11492fbd2eba7ef6/.github/workflows/main.yml


That’s what Cachix Deploy does, and it’s free 🙂


Also, why not simply configure

system.autoUpgrade.flake = "github:YourUser/yourRepo";
system.autoUpgrade.enable = true;

Or, for more complicated setups, run a git command via either cron or a systemd timer, as recommended in Automatic rebuild on every push to master · Issue #5 · zupo/nix · GitHub:

{
  systemd.timers.git-updater = {
    wantedBy = [ "timers.target" ];
    # Wait 60 seconds after the service finished before starting it again
    # This should prevent the service being started again too early if an update is in progress
    timerConfig.OnUnitInactiveSec = 60;
  };
  systemd.services.git-updater = {
    # I'm not entirely sure why this would be needed
    serviceConfig.Type = "oneshot";
    script = ''
      # Update script here
    '';
  };
}
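For completeness, a sketch of what the update script might contain, assuming the flake repo is cloned at /etc/nixos (that path is an assumption) and the unit runs as root (the systemd default):

{ pkgs, ... }: {
  systemd.services.git-updater.script = ''
    # Assumption: the flake repo is checked out at /etc/nixos
    cd /etc/nixos
    # Only accept fast-forward updates from the remote
    ${pkgs.git}/bin/git pull --ff-only
    /run/current-system/sw/bin/nixos-rebuild switch --flake .
  '';
}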

How do you secure said setup?

As @tobiasBora mentioned, in this case access to the GitHub repo itself is already the weak point.

I have usually had a special user with a non-interactive login that is only allowed to run a single command. Basically, using SSH as a low-risk, secure trigger.

When the machine config is coming from the repo this approach becomes a little pointless though.

I guess the question is whether infrastructure and system configurations from a remote repo are trusted enough. Maybe it could be possible to require signed commits and verify them before a deploy?
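Something like this sketch, perhaps, assuming the repo is checked out at /etc/nixos and the trusted GPG keys are imported into root’s keyring (all names here are made up):

{ pkgs, ... }: {
  systemd.services.verified-deploy = {
    serviceConfig.Type = "oneshot";
    path = [ pkgs.git pkgs.gnupg pkgs.nixos-rebuild ];
    script = ''
      # Assumption: repo checked out at /etc/nixos
      cd /etc/nixos
      git fetch origin main
      # Fails (and so aborts the deploy) unless the tip commit carries
      # a valid GPG signature from a key trusted in root's keyring
      git verify-commit origin/main
      git merge --ff-only origin/main
      nixos-rebuild switch --flake .
    '';
  };
}

Note that git verify-commit only checks the tip commit, not every commit in the history.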

Or, for more complicated setups, running a git command via either cron or a systemd timer

Polling isn’t that great, though.
Either there is a delay, or it’s not a very responsible use of resources, IMO.

But it’s hard to beat

system.autoUpgrade.flake = "github:YourUser/yourRepo";
system.autoUpgrade.enable = true;

in simplicity I guess.

How often is that called?

It consists of running a simple daemon process cachix deploy agent myagent that connects to our backend using websockets and waits for a new deployment. There’s no Nix evaluation or building done on the agent. The agent pulls all binaries from your binary cache and activates the new deployment.

Thanks for the pointer!
I need to look into that I guess. Sounds like it does it all. 🙂
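From what I can tell, the agent side would be roughly this sketch (option names per the services.cachix-agent module in nixpkgs; the token path is an assumption):

{
  services.cachix-agent = {
    enable = true;
    # Agent credentials, provisioned out of band (path is an assumption)
    credentialsFile = "/etc/cachix-agent.token";
  };
}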

Polling is indeed not great in terms of efficiency, but it has the advantage of avoiding the configuration of webhooks/additional users/security issues etc., and is super simple to configure… (E.g., if you use webhooks and want to avoid DoS attacks, I guess you want to only allow GitHub to trigger them; otherwise people could force your server to upgrade many times a day, which would make it less responsive.) Otherwise, I guess you can set up a new user with very few rights except for calling the upgrade script (maybe using a setuid script, a mini local web server, or a special shell?), and run in a GitHub Action a script to log in to that user or connect to the web server with the appropriate credentials, but I don’t know of a simple copy/paste solution for that (I guess it should exist?), except for the solutions mentioned above.

By default, the system upgrades every day at 04:40 if my understanding is correct, according to NixOS Search. This can be configured using system.autoUpgrade.dates = "hourly" or "minutely", and you can also set *:0/5 (to be confirmed) to run it every 5 minutes… Details: systemd.time
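Putting that together, a sketch of a five-minute polling configuration (repo name reused from the example above):

{
  system.autoUpgrade = {
    enable = true;
    flake = "github:YourUser/yourRepo";
    # systemd calendar expression: run every 5 minutes
    dates = "*:0/5";
  };
}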


Triggering via SSH is actually quite simple. In /etc/ssh/sshd_config you just need to add:

Match User deploy
  PasswordAuthentication no
  AllowAgentForwarding no
  AllowTcpForwarding no
  X11Forwarding no
  ForceCommand /etc/ssh/allow.deploy

and then have the ForceCommand script check the requested command:

#!/usr/bin/env bash

set -eu
set -o pipefail

# Reject connections that don't request any command
[[ -z "${SSH_ORIGINAL_COMMAND:-}" ]] && exit 1

case "${SSH_ORIGINAL_COMMAND}" in
  "/usr/local/bin/deploy"*)
    # Run the fixed deploy script; any client-supplied arguments are dropped
    exec /usr/local/bin/deploy
    ;;
  *)
    echo "invalid command"
    exit 1
    ;;
esac

It’s best to avoid passing any arguments, which sidesteps a whole train of security headaches.

It’s pretty much just an SSH key that cannot do much damage. No DoS surface, no public webhook, no mini web server with setuid. Like this, it is a nice and secure remote trigger that even returns log output.
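On NixOS, the same thing could be expressed declaratively; a sketch, assuming the wrapper script is deployed to /etc/ssh/allow.deploy as above (the key is a placeholder):

{
  users.users.deploy = {
    isNormalUser = true;
    openssh.authorizedKeys.keys = [ "ssh-ed25519 AAAA... deploy-trigger" ];
  };
  services.openssh.extraConfig = ''
    Match User deploy
      PasswordAuthentication no
      AllowAgentForwarding no
      AllowTcpForwarding no
      X11Forwarding no
      ForceCommand /etc/ssh/allow.deploy
  '';
}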

Well, I guess the deploy script must be setuid to be allowed to run nixos-rebuild, otherwise you need to give root access to the user. I guess it’s fine if you make sure the script accepts no arguments, but one still needs to configure it properly.

The deploy user has sudo access for some clearly defined commands, which avoids the setuid.
I would hope that’s enough for a nixos-rebuild?
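For example, something like this sketch (the exact command path is an assumption):

{
  security.sudo.extraRules = [{
    users = [ "deploy" ];
    commands = [{
      # Allow exactly this command, nothing else, without a password
      command = "/run/current-system/sw/bin/nixos-rebuild switch";
      options = [ "NOPASSWD" ];
    }];
  }];
}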

Oh yes good point, I guess it should be enough.

Might be nice to wrap this up in a Nix package/flake.
As a newb I am not sure I am up for that yet, but it could be a nice addition.
Maybe I’ll give that a try.
