I’ve upgraded a three-machine test Kubernetes cluster to NixOS 19.09. It’s now using the bare 19.09 k8s module (Kubernetes 1.15, flannel 0.11); before, it was running 1.14 from unstable with the stabilization-across-machines patch (#56789) applied, and that was before the PR reverting it (#67563) went in.
The only problem I’m facing now is that network communication between the machines is spotty after they finish the bootstrap phase. If I manually restart the flannel daemon on each machine, the network layer resumes normal functionality. Have you experienced any similar symptom?
The only things that seem to stand out in the flannel logs are these lines:
github.com/coreos/flannel/subnet/kube/kube.go:310: Failed to watch *v1.Node: Get https://belial.etour.tn.it:6443/api/v1/nodes?resourceVersion=43657342&timeoutSeconds=414&watch=true: dial tcp 192.168.122.102:6443: connect: connection refused
github.com/coreos/flannel/subnet/kube/kube.go:310: Failed to list *v1.Node: nodes is forbidden: User "flannel-client" cannot list resource "nodes" in API group "" at the cluster scope
github.com/coreos/flannel/subnet/kube/kube.go:310: Failed to watch *v1.Node: Get https://belial.etour.tn.it:6443/api/v1/nodes?resourceVersion=43658446&timeoutSeconds=384&watch=true: dial tcp 192.168.122.102:6443: connect: connection refused
github.com/coreos/flannel/subnet/kube/kube.go:310: Failed to list *v1.Node: nodes is forbidden: User "flannel-client" cannot list resource "nodes" in API group "" at the cluster scope
My suspicion is that this happens while flannel has already started but kube-apiserver is not yet completely ready (flannel is configured to use “kubernetes” storage).
In any case, when I restart the daemons manually (well after kube-apiserver has become ready) those lines do not end up in the logs.
I’m thinking about how to resolve this… starting the flannel service after the apiserver is easy on the master, but I don’t know how to do that on the other machines. If you have any comment or suggestion, please speak up!
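The only rough idea I have so far (untested, and assuming the 19.09 module’s flannel unit is simply called `flannel`) is to make each node’s flannel daemon wait in its pre-start hook until the apiserver answers:

```nix
{
  # Untested sketch: block flannel's startup on every node until the
  # apiserver's /healthz endpoint responds. The URL is my apiserver
  # endpoint from the logs above; adjust for your own cluster.
  systemd.services.flannel.preStart = ''
    until ${pkgs.curl}/bin/curl -k -s https://belial.etour.tn.it:6443/healthz >/dev/null; do
      sleep 2
    done
  '';
}
```

This would work the same on master and workers, since it polls over the network instead of relying on local unit ordering, but I haven’t verified it doesn’t interact badly with the module’s own flannel setup.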
I had a bunch of trouble getting kube to work the way I wanted and ended up using a different approach based on kubeadm.
There was a comment with some code on one of the PRs which led me to the implementation below. I’m quite happy with it. I had some pain getting flannel to work properly with the 19.09 kubernetes module (I bind all my services to tinc interfaces and couldn’t get the flannel implementation to play nicely), but with this setup I now deploy the network overlay the ‘vanilla’ way, through a simple kubectl apply -f. This also means I’m not tied to flannel, which I don’t use myself.
I suppose you could hook in a one-off systemd boot script that also configures the network overlay by executing a kubectl apply; I don’t bootstrap the cluster that often, so I never bothered.
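If I were to do it, it would probably look something like this (untested sketch; the manifest path is made up, and it assumes the kubeadm unit from the module below):

```nix
{
  # Hypothetical one-off unit that applies the overlay manifest once
  # the kubeadm bootstrap unit has finished. I don't actually run this.
  systemd.services.kube-overlay = {
    wantedBy = [ "multi-user.target" ];
    after = [ "kubeadm.service" ];
    path = [ pkgs.kubernetes ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };
    script = ''
      export KUBECONFIG=/etc/kubernetes/admin.conf
      kubectl apply -f /etc/kubernetes/overlay.yaml
    '';
  };
}
```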
This won’t fix the issues you’re having; I’m mostly posting it here in case you or others are keen to try a different/simpler implementation.
{ pkgs, lib, config, ... }:

let
  cfg = config.services.kubeadm;
in {
  options.services.kubeadm = {
    enable = lib.mkEnableOption "kubeadm";

    role = lib.mkOption {
      type = lib.types.enum [ "master" "worker" ];
    };

    apiserverAddress = lib.mkOption {
      type = lib.types.str;
      description = ''
        The address on which we can reach the masters. Could be a loadbalancer.
      '';
    };

    bootstrapToken = lib.mkOption {
      type = lib.types.str;
      description = ''
        The master will print this to stdout after being set up.
      '';
    };

    nodeip = lib.mkOption {
      type = lib.types.str;
    };

    discoveryTokenCaCertHash = lib.mkOption {
      type = lib.types.str;
    };
  };

  config = lib.mkIf cfg.enable {
    boot.kernelModules = [ "br_netfilter" ];
    boot.kernel.sysctl = {
      "net.ipv4.ip_forward" = 1;
      "net.bridge.bridge-nf-call-iptables" = 1;
    };

    environment.systemPackages = with pkgs; [
      gitMinimal
      openssh
      docker
      utillinux
      iproute
      ethtool
      thin-provisioning-tools
      iptables
      socat
    ];

    virtualisation.docker.enable = true;

    systemd.services.kubeadm = {
      wantedBy = [ "multi-user.target" ];
      after = [ "kubelet.service" ];
      postStart = lib.mkIf (cfg.role == "master") ''
        KUBECONFIG=/etc/kubernetes/admin.conf kubectl -n kube-public get cm cluster-info -o json | jq -r '.data.kubeconfig' > /etc/kubernetes/cluster-info.cfg
        chmod a+r /etc/kubernetes/cluster-info.cfg
      '';
      # These paths are needed to convince kubeadm to bootstrap
      path = with pkgs; [ kubernetes jq gitMinimal openssh docker utillinux iproute ethtool thin-provisioning-tools iptables socat ];
      # Makes sure that it is only started once, during bootstrap.
      # Condition* settings belong in [Unit], not [Service].
      unitConfig.ConditionPathExists = "!/var/lib/kubelet/config.yaml";
      serviceConfig = {
        Type = "oneshot";
        RemainAfterExit = true;
        StateDirectory = "kubelet";
        ConfigurationDirectory = "kubernetes";
        ExecStart = {
          master = "${pkgs.kubernetes}/bin/kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=${cfg.apiserverAddress} --ignore-preflight-errors='all' --token ${cfg.bootstrapToken} --token-ttl 0 --upload-certs";
          worker = "${pkgs.kubernetes}/bin/kubeadm join ${cfg.apiserverAddress} --token ${cfg.bootstrapToken} --discovery-token-unsafe-skip-ca-verification --ignore-preflight-errors all --discovery-token-ca-cert-hash ${cfg.discoveryTokenCaCertHash}";
        }.${cfg.role};
      };
    };

    systemd.services.kubelet = {
      description = "Kubernetes Kubelet Service";
      wantedBy = [ "multi-user.target" ];
      path = with pkgs; [ gitMinimal openssh docker utillinux iproute ethtool thin-provisioning-tools iptables socat cni ];
      serviceConfig = {
        StateDirectory = "kubelet";
        # This populates $KUBELET_KUBEADM_ARGS and is provided
        # by kubeadm init and join
        EnvironmentFile = "-/var/lib/kubelet/kubeadm-flags.env";
        Restart = "always";
        StartLimitInterval = 0;
        RestartSec = 10;
        ExecStart = ''
          ${pkgs.kubernetes}/bin/kubelet \
            --kubeconfig=/etc/kubernetes/kubelet.conf \
            --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf \
            --config=/var/lib/kubelet/config.yaml \
            --fail-swap-on=false \
            --cni-bin-dir="/opt/cni/bin" \
            --address="${cfg.nodeip}" \
            --node-ip="${cfg.nodeip}" \
            $KUBELET_KUBEADM_ARGS
        '';
      };
    };
  };
}
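For completeness, enabling the module then looks roughly like this (addresses, token and hash below are placeholders, not my real values):

```nix
{
  # Example worker configuration. On the master, set role = "master";
  # kubeadm init prints the bootstrap token and CA cert hash to use here.
  services.kubeadm = {
    enable = true;
    role = "worker";
    apiserverAddress = "10.0.0.1:6443";          # placeholder
    nodeip = "10.0.0.2";                          # placeholder
    bootstrapToken = "abcdef.0123456789abcdef";   # placeholder
    discoveryTokenCaCertHash = "sha256:0000...";  # placeholder
  };
}
```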
If you are asking whether the issue disappears after a while, I’m not completely sure. It doesn’t go away within half an hour or even a few hours, but today I found the cluster functioning as expected after the three VMs had been shut down before, and restarted after, the borg backup job that saves them. I’ll wait for a few more cycles of the same before saying anything more.
Hi @Azulinho, thanks for sharing this! I’m quite curious to test it out. What do you use other than flannel? Another client of mine set up their own cluster using calico, but I must say that on the surface flannel seems a bit simpler than calico, and since there are already so many things to learn about kubernetes, a simpler tool makes at least that part easier to manage.