Escaping Kubernetes?

While I have successfully used k8s in the past, the gluttony that shines all over it does not make me the biggest fan. Just the memory consumption of cert-manager (effectively used every 3 months) or the number of lines the Prometheus Helm chart generates fills me with disbelief.

What I need

  • run some oci containers on 1-2 machines
  • run multiple instances per machine
  • roll out a new version when a new image is available in the oci registry
  • rolling restart with rollback support based on health checks

What I don’t really need

  • switching dynamic workloads between machines
  • overlay networks
  • namespaces

Is there any way to escape the big k8s (or its smaller brother k3s) in a bit more of a Nix way?

I don't see virtualisation.oci-containers being up to the task.
But I thought I'd ask in case I am missing something…

…before I give in and run k3s.

I miss the days of many competing orchestration thingamabobs: Flynn, Deis, Octohost; even Rancher had its own orchestration layer that they eventually ditched for k8s. Anyone else remember RancherOS, where Docker ran as PID 1?

HashiCorp still has Nomad, but I think it's pretty poorly supported and I expect them to shutter the project just like Rancher did their orchestrator.

And there is still docker swarm.

But you might think about just rolling your own by starting docker compose from a shell script, a crontab entry, or even a systemd unit, and re-firing the script if you can't curl a port or something similar.
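
Something along these lines is what I mean - a rough, untested sketch where the compose directory, port, and health endpoint are placeholders:

#!/usr/bin/env bash
# Rough sketch: restart the compose stack if its health endpoint stops answering.
# COMPOSE_DIR and HEALTH_URL are placeholders; adjust them for your setup.
set -eu

COMPOSE_DIR=/srv/myapp
HEALTH_URL=http://127.0.0.1:8080/health

if ! curl -fsS --max-time 5 "$HEALTH_URL" > /dev/null; then
  echo "health check failed, restarting compose stack"
  cd "$COMPOSE_DIR"
  docker compose down
  docker compose up -d
fi

Fire it from cron every minute or from a systemd timer; crude, but it covers the "refire if curl fails" part.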

I would like to hear about some alternatives, especially nix specific though, so I am subscribing to this thread.

1 Like

I'm even running a Nomad cluster myself at the moment - but its future reminds me of docker swarm. It's kept alive for now.

You can make things work with docker compose - and it's really close to good enough. When it comes to rollouts and scaling of services it's a little ugly though.
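
For example, the usual compose "deploy" ends up being something like this, which recreates containers in place rather than rolling them over:

# naive compose "rollout": pull newer images, then recreate changed containers
docker compose pull
docker compose up -d --remove-orphans
# brief outage while containers are recreated; no health-check gated rollback
# like a real orchestrator would give you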

But that's how Dokku does it. Unfortunately Dokku hasn't been ported to NixOS yet. And I am not sure how much of a match it is conceptually.

I think Dokku would cover most of my needs - but if there were something better integrated with NixOS I would celebrate.

I’m running a set of Nomad clusters for similar reasons, not wanting to deal with k8s. K3s is slightly simpler to run on NixOS, but it still requires me to deal with all of the moving parts of k*s.

I’m not really worried for the future of Nomad though. For my purposes, the product is pretty much feature complete. I don’t really need a huge investment from Hashicorp into the product. It doesn’t have any huge, outstanding bugs that I am aware of, apart from some situations where drivers might misbehave. So I’m fairly happy to just keep running it as is.

I would actually consider forking the latest MPL version and maintaining it as feature complete, had HashiCorp not shown that they were willing to sue forks over variable names.

1 Like

I’m running a set of Nomad clusters for similar reasons, not wanting to deal with k8s.

Do you have a config repo you can share?
Or maybe some snippets of how you have it set up?

I am just curious.

My Nomad cluster is currently running on Debian.
I would love to see what it looks like on NixOS.

I would actually consider forking the latest MPL version and maintaining it as feature complete, had HashiCorp not shown that they were willing to sue forks over variable names.

They did what? :flushed:

TBH that sounds like another reason to NOT use it. But well…

They sent a cease-and-desist letter, claiming OpenTofu infringed on their copyright by incorporating some of their BSL code from upstream Terraform.
The whole claim seems pretty thin to me, so I'd suspect they just want to intimidate anyone who actually intends to fork.
Since I don’t have the Linux Foundation legal team backing me, I’d rather wait and see what the IBM acquisition will bring.

I'm planning on turning it into a flake for re-use as soon as I find some time.

My approach is still very opinionated: I build images using dockerTools.buildImage and run them from the nix store, so there's no registry in between. Also, I run my setup for high availability, so scalability is not built in.
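
In essence the image part is just something like this (very much simplified, and the flake output name is made up):

# build the image tarball with dockerTools.buildImage, then load it
# straight into the local daemon; no registry involved
nix build .#myImage     # ".#myImage" is a made-up flake output name
docker load < ./result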

I'll drop a link here once I've published something; it should not be too hard to adapt the useful bits.

1 Like

Yeah, the IBM acquisition is a mystery to me: will IBM double down on the hostile patent-troll stance, or will they re-license everything back to [non-hostile license]? The latter is doubtful and I fear the former is inevitable from Big Blue. It is interesting to note that the nazi barcode company was sponsoring a fork of Vault → OpenBao before the acquisition of HC. That will probably be swept under the rug quickly.

I must say that I like the k3s setup in NixOS quite a bit; k3s is now a CNCF project and a nice balance: the power of full k8s, but tightened up a bit. But anything running in a Kubernetes cluster with multiple masters and etcd adds a lot of IOPS, eventually killing an SSD in my experience. So I run my control plane nodes on spinning rust when I can. And a few lower-priority services I run outside of the cluster, since they don't need the full hoopla that is Kubernetes.

One service in my household is an RPi running Pi-hole. I don't really care about backing up the logs; I only look at that data every now and then to diagnose what is going on in my network. So I have a cronjob that, on @reboot, fires off a script that starts my Pi-hole entirely in RAM (tmpfs) by untarring a previous backup of its persistent volume.

#!/bin/bash
main () {
THIS_NAME=host1
THIS_IP=10.0.0.12
set -eu

if [[ -f .pihole.cid ]]; then
  set +e
  # this command may fail if the container is not running so +e here
  docker kill "$(cat .pihole.cid)"
  docker rm "$(cat .pihole.cid)"
  set -e
  rm .pihole.cid
fi
# https://github.com/pi-hole/docker-pi-hole/blob/master/README.md

#PIHOLE_BASE="${PIHOLE_BASE:-$(pwd)}"
PIHOLE_BASE="/tmpool/pihole"
rm -Rf $PIHOLE_BASE
[[ -d "$PIHOLE_BASE" ]] || mkdir -p "$PIHOLE_BASE" || { echo "Couldn't create storage directory: $PIHOLE_BASE"; exit 1; }
tar zxf /root/pihole.tgz -C /tmpool

docker pull pihole/pihole:latest
  #--dns=$THIS_IP --dns=1.1.1.1 \
docker run -d \
  --name $THIS_NAME \
  -p 53:53/tcp -p 53:53/udp \
  -p 3080:80 \
  --cidfile .pihole.cid \
  -e TZ="America/Chicago" \
  -v "${PIHOLE_BASE}/etc-pihole/:/etc/pihole/" \
  -v "${PIHOLE_BASE}/log-pihole/:/var/log/pihole/" \
  -v "${PIHOLE_BASE}/etc-dnsmasq.d/:/etc/dnsmasq.d/" \
  --dns=1.1.1.1 \
  --restart=unless-stopped \
  --hostname $THIS_NAME \
  -e VIRTUAL_HOST="$THIS_NAME" \
  -e PROXY_LOCATION="$THIS_NAME" \
  -e ServerIP="$THIS_IP" \
  pihole/pihole:latest

printf 'Starting up pihole container '
for i in $(seq 1 20); do
    if [ "$(docker inspect -f "{{.State.Health.Status}}" $THIS_NAME)" == "healthy" ] ; then
        printf ' OK'
        echo -e "\n$(docker logs $THIS_NAME 2> /dev/null | grep 'password:') for your pi-hole: https://${THIS_IP}/admin/"
        exit 0
    else
        sleep 3
        printf '.'
    fi

    if [ $i -eq 20 ] ; then
        echo -e "\nTimed out waiting for Pi-hole start, consult your container logs for more info (\`docker logs $THIS_NAME\`)"
        exit 1
    fi
done;
}

time main "$@"

On another machine, an old gaming tower, I run a few AI projects to toy around with. On this machine I am experimenting with firing off a tmuxinator session so I can easily log in and check the logs to see what might be wrong. But again, I do not want to back these logs up, so just running in a tmux session is fine by me.

name: serve
root: ~/

windows:
  - Mon: 
      root: /unreal/coopadmin/ollama
      layout: even-horizontal
      panes:
          - sudo htop
          - sudo nvtop
  - Ollama: cd /unreal/coopadmin/ollama && docker compose up
  - Invoke: cd /unreal/gpu/InvokeAI/docker && docker compose up
  - Devika: 
      root: /unreal/gpu/devika
      layout: even-horizontal
      panes:
          - cd /unreal/gpu/devika && zsh -c '. ~/.zshrc && python devika.py'
          - cd /unreal/gpu/devika/ui && zsh -c '. ~/.zshrc && nvm use && bun install && bun run start --host=10.11.5.11'
  - AGiXT: cd /unreal/gpu/AGiXT && docker compose -f docker-compose-local-nvidia-sd.yml up
  - EZlocalAI: cd /unreal/gpu/ezlocalai && docker compose -f docker-compose-cuda.yml up

I don't know. It's a lot of YAML for just some rolling deploys. And then it's not even fully declarative.
But I do agree that the install with NixOS is pretty nice.

Still! If there were a way to get rolling restarts and multiple instances with just NixOS and systemd, I would take that over k3s any time.
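
What I'm picturing is roughly this, built on systemd template units - purely a sketch, where the unit name and per-instance health ports are made up:

#!/usr/bin/env bash
# Sketch: restart instances one at a time, waiting for each to come back
# healthy before touching the next. "myapp@N.service" and the per-instance
# health ports are hypothetical.
set -eu

for i in 1 2; do
  systemctl restart "myapp@$i.service"
  for _ in $(seq 1 30); do
    if curl -fsS "http://127.0.0.1:808$i/health" > /dev/null; then
      break
    fi
    sleep 2
  done
done

No rollback logic of course; that's the part that would need real work.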

But it seems like k3s is the only realistic option for now. :frowning:

1 Like

If you feel bold enough, there is this: GitHub - astro/skyflake: NixOS Hyperconverged Infrastructure on Nomad/NixOS