Kubernetes bringup on current unstable (2022-07-17)

I’m not super well versed in the Kubernetes ecosystem, so I might be misunderstanding the problem, but I think there are a few gaps in the current NixOS wiki page that I’d like clarified:

It seems like the default cfssl config and the default Kubernetes config disagree on where ca.pem (the CA root certificate?) should go. cfssl puts it in /var/lib/cfssl/ca.pem, and something dumps an empty file at /var/lib/kubernetes/secrets/ca.pem. I worked around this by manually copying the cfssl ca.pem to /var/lib/kubernetes/secrets. That got certmgr past this error:

Jul 17 12:01:55 carrot-cake certmgr-pre-start[80356]: 2022/07/17 12:01:55 [INFO] certmgr: loading from config file /nix/store/4r56vfg2skz4hm1ymvmwmrwn3sa486k0-certmgr.yaml
Jul 17 12:01:55 carrot-cake certmgr-pre-start[80356]: 2022/07/17 12:01:55 [INFO] manager: loading certificates from /nix/store/m4z0digyigrqps9p68vxqyjkx91cbjbx-certmgr.d
Jul 17 12:01:55 carrot-cake certmgr-pre-start[80356]: 2022/07/17 12:01:55 [INFO] manager: loading spec from /nix/store/m4z0digyigrqps9p68vxqyjkx91cbjbx-certmgr.d/addonManager.json
Jul 17 12:01:55 carrot-cake certmgr-pre-start[80356]: 2022/07/17 12:01:55 [ERROR] cert: failed to fetch remote CA: failed to parse rootCA certs

After the copy, certmgr got further but then timed out in its pre-start step:

Jul 17 12:29:02 carrot-cake certmgr-pre-start[138552]: 2022/07/17 12:29:02 [INFO] certmgr: loading from config file /nix/store/4r56vfg2skz4hm1ymvmwmrwn3sa486k0-certmgr.yaml
Jul 17 12:29:02 carrot-cake certmgr-pre-start[138552]: 2022/07/17 12:29:02 [INFO] manager: loading certificates from /nix/store/m4z0digyigrqps9p68vxqyjkx91cbjbx-certmgr.d
Jul 17 12:29:02 carrot-cake certmgr-pre-start[138552]: 2022/07/17 12:29:02 [INFO] manager: loading spec from /nix/store/m4z0digyigrqps9p68vxqyjkx91cbjbx-certmgr.d/addonManager.json
Jul 17 12:30:32 carrot-cake systemd[1]: certmgr.service: start-pre operation timed out. Terminating.

Then I manually added my api.kube IP to the loopback interface:

ip addr add 10.1.1.2 dev lo

and certmgr seemed happy.
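
For anyone repeating the loopback step, here is a minimal sketch that only adds the address if it is not already there. The helper name has_addr is made up for illustration, 10.1.1.2 is just the address from this post, and the /32 prefix is my assumption. An address added this way does not survive a reboot.

```shell
# has_addr ADDR: succeed if ADDR shows up in `ip -o -4 addr show` output
# read from stdin. (Hypothetical helper, not part of any NixOS module.)
has_addr() {
    # -F matches the string literally, so the dots are not regex wildcards;
    # the trailing "/" keeps 10.1.1.2 from matching 10.1.1.20.
    grep -qF "inet $1/"
}

# As root, add the address only when it is missing:
#   ip -o -4 addr show dev lo | has_addr 10.1.1.2 || ip addr add 10.1.1.2/32 dev lo
```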

After that, kube-apiserver was still failing to start, so I had to chown /var/lib/kubernetes/secrets/ca.pem to kubernetes:nogroup and set its mode to 644. Finally, I started etcd manually, and the full system seems to be up:

# systemctl start etcd
# kubectl cluster-info
Kubernetes control plane is running at https://api.kube:6443
CoreDNS is running at https://api.kube:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
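
The ca.pem workaround from earlier in this post (copy the cfssl CA, then fix mode and ownership) could be collected into a small helper like the sketch below. The function name copy_ca is hypothetical, and the paths and the kubernetes:nogroup owner are simply the ones that worked on my machine, so treat this as a stopgap under those assumptions rather than the proper fix.

```shell
# copy_ca SRC DST: copy the cfssl CA certificate over the empty placeholder
# and make it world-readable. (Hypothetical helper for illustration.)
copy_ca() {
    src=$1
    dst=$2
    # Refuse to copy if cfssl has not generated a non-empty CA file yet.
    if [ ! -s "$src" ]; then
        echo "source CA $src is missing or empty" >&2
        return 1
    fi
    cp "$src" "$dst" && chmod 644 "$dst"
}

# As root, with the paths from this post:
#   copy_ca /var/lib/cfssl/ca.pem /var/lib/kubernetes/secrets/ca.pem
#   chown kubernetes:nogroup /var/lib/kubernetes/secrets/ca.pem
```

The non-empty check is there because the original symptom was an empty ca.pem; copying another empty file would just reproduce the "failed to parse rootCA certs" error.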

Hopefully someone searching for these errors can find this page. It would be nice for the NixOS Kubernetes module to set itself up properly, though. If someone can point me to where in the nixos-modules forest the changes need to be made, I can submit a PR to nixpkgs.

This might be the same issue I’m having with the k3s service.

Thank you so much for documenting your fixes!

Here is where the Kubernetes service is defined