Declarative, Rootless, K3S!

Hello All!

I have benefited greatly from the guides and discussions on this site and hope to contribute a little in return. I’m venturing into the wonderful world of Kubernetes and figured out how to set up a declarative, rootless k3s server.

NOTE: Before starting, I initially only had luck after setting systemd.enableUnifiedCgroupHierarchy = false; (I happened to add that line to my configuration.nix, but you can set it wherever), as described in this GitHub issue. That turned out not to be the real fix; see the Edit after the guide.

{ config, pkgs, ... }:
let 
  USER = "k3s"; # set it and forget it!
in
{

  ## Optional
  nixpkgs.config.allowUnfree = true;

  ## Reusable system user block
  ## Warning: if you want to drop into a shell and interact with systemd, use the following:
  ## `MYUSER=[${USER}] sudo -H -u $MYUSER XDG_RUNTIME_DIR=/run/user/$(id -u $MYUSER)
  users.users.${USER} = {
    isSystemUser = true;
    linger = true;
    home = "/var/lib/${USER}";
    description = "system user for running k3s";
    packages = with pkgs; [ 
      k3s
      killall
      slirp4netns
      podman
    ];
    group = "${USER}";
    extraGroups = [ "systemd-journal" ];
    autoSubUidGidRange = true;
  };
  
  users.groups.${USER} = {};

  ### Rootless K3S, based on this: https://github.com/k3s-io/k3s/blob/e2179aa957a02d4b357bef9aabb163f043471023/k3s-rootless.service
  
  systemd.user.services."k3s-rootless" = {
    # NOTE: Don't try to run `k3s server --rootless` on a terminal, as it doesn't enable cgroup v2 delegation.
    # If you really need to try it on a terminal, prepend `systemd-run --user -p Delegate=yes --tty` to create a systemd scope.
    
    # systemd unit file for k3s (rootless)
    #
    # Usage:
    # - [Optional] Enable cgroup v2 delegation, see https://rootlesscontaine.rs/getting-started/common/cgroup2/ .
    #   This step is optional, but highly recommended for enabling CPU and memory resource limitation.
    #
    # - Run `systemctl --user disable --now k3s-rootless && systemctl --user enable --now k3s-rootless`
    #
    # - Run `KUBECONFIG=~/.kube/k3s.yaml kubectl get pods -A`, and make sure the pods are running.
    #
    # Troubleshooting:
    # - See `systemctl --user status k3s-rootless` to check the daemon status
    # - See `journalctl --user -f -u k3s-rootless` to see the daemon log
    # - See also https://rootlesscontaine.rs/

    enable = true;

    description="k3s (Rootless)";
    #environment= { 
    #  PATH = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";
    #};

    ## Initially the service couldn't find the slirp4netns binary on the PATH
    path = [ 
      "${pkgs.k3s}"
      "${pkgs.slirp4netns}"
      "${pkgs.podman}"
      "/run/wrappers/" # for newuidmap
    ];

    serviceConfig = {
      ExecStart = "${pkgs.k3s}/bin/k3s server --rootless --snapshotter=fuse-overlayfs --kubelet-arg='cgroup-driver=systemd'";
      ExecReload="${pkgs.coreutils}/bin/kill -s HUP $MAINPID"; # killall matches process names, not PIDs, so use kill here
      TimeoutSec=0;
      RestartSec=2;
      Restart="always";
      StartLimitBurst=3;
      StartLimitInterval="60s";
      LimitNOFILE="infinity";
      LimitNPROC="infinity";
      LimitCORE="infinity";
      TasksMax="infinity";
      Delegate="yes";
      Type="simple";
      KillMode="mixed";
    };

    #[Install]
    wantedBy= [ "default.target" ];
  };
  
  systemd.user.services."${USER}-podman-enabler" = {
    enable = true;
    description = "ensure podman service and socket are enabled";
    wantedBy = [ "default.target" ]; # user managers have no multi-user.target
    serviceConfig = {
      ExecStart = "${pkgs.systemd}/bin/systemctl --user enable --now podman.service podman.socket";
      Type = "oneshot";
    };
  };

  ## Possibly not necessary, but convenient for copy/pasting this user for other purposes

  systemd.user.services."${USER}-podman-network-maker" = {
    enable = true;
    description = "ensure podman networks are available";
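    wantedBy = [ "default.target" ]; # user managers have no multi-user.target
    serviceConfig = {
      # one network name per call; the leading "-" tolerates "already exists" on later boots
      ExecStart = [ "-${pkgs.podman}/bin/podman network create proxy" "-${pkgs.podman}/bin/podman network create behold" ];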
      Type = "oneshot";
    };
  };
  
  systemd.user.services."${USER}-rootless-restart" = {
    enable = true;
    description = "Start all containers where restart=always (rootless)";
    wantedBy = [ "default.target" ];
    wants = [ "network-online.target" ];
    after = [ "network-online.target" ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
      Environment = "PODMAN_SYSTEMD_UNIT=%n LOGGING=\"--log-level=info\"";
      ExecStart = "${pkgs.podman}/bin/podman $LOGGING start --all --filter restart-policy=always";
      ExecStop = "/bin/sh -c '${pkgs.podman}/bin/podman $LOGGING stop $(${pkgs.podman}/bin/podman container ls --filter restart-policy=always -q)'";
    };
  };

}

You can check that it’s actually running rootless like this:

> ps -U root | grep k3
# note, nothing shows up

> ps -U k3s | grep k3
   1807 ?        00:00:00 k3s-server
   2582 ?        00:31:15 k3s

Finally, open https://localhost:6443/ and you should be greeted with a status page.

Happy Hosting!

Edit:

It looks like adding systemd.enableUnifiedCgroupHierarchy = false; as I initially recommended was actually wrong. The real trick was adding the following line instead, as described in this reddit post:

systemd.services."user@".serviceConfig.Delegate = "memory pids cpu cpuset";

Note, that’s verbatim what the line should be. No need to swap out ‘user’ for the name of a user, as it is a system-wide delegation :slight_smile:

Hey @coy-yote, this is good stuff! I started to go down the path of setting up a rootless k3s based on your guide, and ran into a few speedbumps. I’m noting them here to try and help the next person :slight_smile:

The first time I tried to apply the configuration changes, activation got stuck and I couldn’t make any changes to users:

[ERROR]   stdout) setting up /etc...
[ERROR]   stderr) Unable to list users with logind: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[ERROR]  failure) Child process exited with error code: 11

Not sure what happened there but thankfully a reboot fixed it.

K3s will not start if IP forwarding is not enabled. I added the line boot.kernel.sysctl = { "net.ipv4.ip_forward" = 1; };.
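
For convenience, here is a minimal sketch that gathers the two system-level settings mentioned so far in this thread into one module. Both lines are taken verbatim from the posts above; the surrounding module wrapper is only an assumption about where you would place them (for example in configuration.nix).

```nix
{ ... }:
{
  # From the Edit in the original post: delegate these cgroup controllers
  # to every user manager (the "user@" template is system-wide).
  systemd.services."user@".serviceConfig.Delegate = "memory pids cpu cpuset";

  # From this reply: k3s will not start unless IPv4 forwarding is enabled.
  boot.kernel.sysctl = { "net.ipv4.ip_forward" = 1; };
}
```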

There are a few extra steps required to associate the systemd user service(s) with the k3s user. One must switch to the k3s user in a terminal (with the $HOME and $XDG_RUNTIME_DIR environment variables pointing to /var/lib/k3s and /run/user/$(id -u k3s) respectively) and then run systemctl --user enable --now k3s-rootless to enable the service for that user. I couldn’t log in as k3s with the MYUSER=[${USER}] sudo -H ... command from a comment in your original post (sudo just kept erroring out with an unhelpful usage message).
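
For anyone hitting the same sudo usage error, the following is a rough sketch of a small wrapper that sets up the environment described above before running a command as the k3s user. The name as-k3s is hypothetical and not part of the original guide, and the sketch assumes sudo is available and that the user is literally named k3s; it has not been tested against this exact configuration.

```nix
{ pkgs, ... }:
{
  environment.systemPackages = [
    # Hypothetical helper (not from the original post): run a command as the
    # k3s user. `sudo -H` points HOME at that user's home directory, and
    # `env` sets XDG_RUNTIME_DIR so `systemctl --user` can find the user manager.
    (pkgs.writeShellScriptBin "as-k3s" ''
      exec sudo -H -u k3s env XDG_RUNTIME_DIR="/run/user/$(id -u k3s)" "$@"
    '')
  ];
}
```

With that in place, something like `as-k3s systemctl --user status k3s-rootless` should run the command in the k3s user's session, provided the user manager is running (which linger = true is meant to ensure).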

I did enable the service for my own user (for testing) and noticed that systemd creates a symlink to a fully qualified nix store path:

Created symlink '/home/tkennedy/.config/systemd/user/k3s-rootless.service' → '/nix/store/c18j94c6x1bmnm7d7h42ig5hg71ki1xs-unit-k3s-rootless.service/k3s-rootless.service'.

This would need to be changed to something that isn’t going to break every time the service changes. Toward that end I started looking into home-manager to declaratively manage this link between the systemd user service and the k3s user, but didn’t get very far with it.
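
One possible alternative to a per-user `systemctl --user enable`, sketched here under an assumption: NixOS installs `systemd.user.services` units that set `wantedBy` globally under /etc/systemd/user, so they are already wanted by every user's manager. If that holds, the unit can be restricted to the k3s account with systemd's `ConditionUser=` setting instead of managing the symlink by hand. This is untested against this setup.

```nix
{ ... }:
{
  # Assumption: the unit is already pulled in by default.target for every
  # user manager; ConditionUser= makes all managers except k3s's skip it.
  systemd.user.services."k3s-rootless".unitConfig.ConditionUser = "k3s";
}
```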

Why am I stopping now? I was reading the k3s known issues with rootless mode and saw that I can’t run multi-node clusters in rootless mode. I plan on creating a cluster so I’ll have to wait until that issue is resolved before I continue down this path.

Even though I ended up abandoning this setup, I really appreciate the guide in giving me a jumpstart down this path.

Thanks for the helpful updates! I was doing a lot of tinkering before I wrote this up. Also good to know regarding the rootless mode bug. Glad you found your way through and that you got some use out of this!