Can't start new LXC container - not enough space

Hi,

After upgrading to 19.09 I can’t get LXC containers to work (they worked before).
LXD is enabled: virtualisation.lxd.enable = true;
I’ve created a new ZFS dataset: zfs create tank/lxd
Then I ran lxd init:

# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: 
Do you want to configure a new storage pool? (yes/no) [default=yes]: 
Name of the new storage pool [default=default]: 
Name of the storage backend to use (btrfs, dir, lvm, zfs) [default=zfs]: 
Create a new ZFS pool? (yes/no) [default=yes]: no
Name of the existing ZFS pool or dataset: tank/lxd
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to create a new local network bridge? (yes/no) [default=yes]: no
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: yes
Name of the existing bridge or host interface: br0
Would you like LXD to be available over the network? (yes/no) [default=no]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] no
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: yes
config:
  images.auto_update_interval: "0"
networks: []
storage_pools:
- config:
    source: tank/lxd
  description: ""
  name: default
  driver: zfs
profiles:
- config: {}
  description: ""
  devices:
    eth0:
      name: eth0
      nictype: bridged
      parent: br0
      type: nic
    root:
      path: /
      pool: default
      type: disk
  name: default
cluster: null

Trying to launch a container:

# lxc launch ubuntu:18.04 test
Creating test
Starting test                               
Error: Failed to run: /nix/store/16z5hmi8k2dy5bcd16xjshlliy5hgnbm-lxd-3.13-bin/bin/.lxd-wrapped forkstart test /var/lib/lxd/containers /var/log/lxd/test/lxc.conf: 
Try `lxc info --show-log local:test` for more info
# lxc info --show-log local:test
[...]

lxc test 20191027205625.930 WARN     initutils - initutils.c:setproctitle:341 - Invalid argument - Failed to set cmdline
lxc test 20191027205626.170 ERROR    cgfsng - cgroups/cgfsng.c:__do_cgroup_enter:1500 - No space left on device - Failed to enter cgroup "/sys/fs/cgroup/cpuset//lxc.monitor/test/cgroup.procs"
lxc test 20191027205626.170 ERROR    start - start.c:__lxc_start:2009 - Failed to enter monitor cgroup
lxc test 20191027205626.170 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:873 - Received container state "STOPPING" instead of "RUNNING"
lxc test 20191027205626.171 WARN     cgfsng - cgroups/cgfsng.c:cgfsng_monitor_destroy:1180 - No space left on device - Failed to move monitor 31513 to "/sys/fs/cgroup/cpuset//lxc.pivot/cgroup.procs"

lxc 20191027205626.172 WARN     commands - commands.c:lxc_cmd_rsp_recv:135 - Connection reset by peer - Failed to receive response for command "get_state"

I have enough space on the device:

# lxc storage info default
info:
  description: ""
  driver: zfs
  name: default
  space used: 777.41MB
  total space: 359.07GB
used by:
  containers:
  - test
  images:
  - d6f281a2e523674bcd9822f3f61be337c51828fb0dc94c8a200ab216d12a0fff
  profiles:
  - default
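
A side note on the error itself: the "No space left on device" in the log is reported for a write to cgroup.procs under /sys/fs/cgroup/cpuset, not for the storage pool. With cgroup v1 cpusets, the kernel returns ENOSPC when a process is moved into a cpuset whose cpuset.cpus or cpuset.mems is empty. A minimal sketch for spotting such cpusets follows; the function name find_empty_cpusets is made up for illustration, and the default path is an assumption taken from the log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of LXC/LXD): list cpuset cgroups under ROOT
# whose cpuset.cpus or cpuset.mems file is empty.
# ROOT defaults to the cgroup v1 cpuset mount point seen in the log.
find_empty_cpusets() {
    root="${1:-/sys/fs/cgroup/cpuset}"
    find "$root" -type d | while IFS= read -r dir; do
        for f in cpuset.cpus cpuset.mems; do
            # Skip directories that do not carry the cpuset files at all.
            [ -f "$dir/$f" ] || continue
            # An empty (or whitespace-only) file means nothing is assigned.
            if [ -z "$(tr -d '[:space:]' < "$dir/$f")" ]; then
                printf '%s is empty\n' "$dir/$f"
            fi
        done
    done
}
```

Calling find_empty_cpusets with no argument after a failed start should show whether the lxc.monitor, lxc.pivot or lxc.payload groups are the ones with nothing assigned.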

What step am I missing, what am I doing wrong?

Thanks!

Please attach the output of lxd init --dump and lxc profile show default

Thanks, outputs are:

# lxd init --dump
config:
  images.auto_update_interval: "0"
networks:
- config:
    ipv4.address: 10.13.90.1/24
    ipv4.nat: "true"
    ipv6.address: fd42:2856:1f43:eb40::1/64
    ipv6.nat: "true"
  description: ""
  managed: true
  name: lxdbr0
  type: bridge
storage_pools:
- config:
    source: tank/lxd
    volatile.initial_source: tank/lxd
    zfs.pool_name: tank/lxd
  description: ""
  name: default
  driver: zfs
profiles:
- config: {}
  description: Bridged networking LXD profile
  devices:
    eth0:
      name: eth0
      nictype: bridged
      parent: br0
      type: nic
  name: bridgeprofile
- config: {}
  description: ""
  devices:
    eth0:
      name: eth0
      nictype: bridged
      parent: br0
      type: nic
    root:
      path: /
      pool: default
      type: disk
  name: default

and

# lxc profile show default
config: {}
description: ""
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br0
    type: nic
  root:
    path: /
    pool: default
    type: disk
name: default
used_by:
- /1.0/containers/test

Try running:

$ echo 1 > /sys/fs/cgroup/cpuset/cgroup.clone_children
$ lxc launch ubuntu:18.04 test

Still nothing, error is exactly the same as before. I tried launching test2, as test already exists (but I can’t start it).

Please show me the output of $ mount | grep cgroup

# mount | grep cgroup
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset,clone_children)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)

See this comment:

Please note that these commands should be executed after the container is created but before it is started (at least that’s how I understand it).


Thanks, this workaround works:

# echo 0 | sudo tee /sys/fs/cgroup/cpuset//lxc.monitor/cpuset.cpus
0
# echo 0 | sudo tee /sys/fs/cgroup/cpuset//lxc.pivot/cpuset.cpus
0
# lxc start test
Error: Failed to run: /nix/store/16z5hmi8k2dy5bcd16xjshlliy5hgnbm-lxd-3.13-bin/bin/.lxd-wrapped forkstart test /var/lib/lxd/containers /var/log/lxd/test/lxc.conf: 
Try `lxc info --show-log test` for more info

# echo 0 | sudo tee /sys/fs/cgroup/cpuset/lxc.payload/cpuset.cpus
0
# lxc start test

# 
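
For anyone repeating this, the three writes above can be wrapped in one function. This is only a sketch of the same workaround: the name seed_lxc_cpusets and its arguments are made up for illustration, and it assumes the lxc.monitor, lxc.pivot and lxc.payload groups already exist (as noted earlier, they seem to appear only once the container has been created or a start has been attempted).

```shell
#!/bin/sh
# Hypothetical helper: write a CPU list into cpuset.cpus for each LXC
# cgroup that exists under ROOT. Groups that are missing are skipped.
#   $1 = CPU list to assign (default "0", as in the workaround)
#   $2 = cpuset root (default /sys/fs/cgroup/cpuset, an assumption)
seed_lxc_cpusets() {
    cpus="${1:-0}"
    root="${2:-/sys/fs/cgroup/cpuset}"
    for group in lxc.monitor lxc.pivot lxc.payload; do
        target="$root/$group/cpuset.cpus"
        if [ -f "$target" ]; then
            # Same effect as: echo 0 | sudo tee <target>
            printf '%s\n' "$cpus" > "$target" || return 1
        else
            printf 'skipping %s (not present)\n' "$target" >&2
        fi
    done
}
```

Run as root, seed_lxc_cpusets 0 followed by lxc start test would repeat the steps above. It does not make the change permanent: the cgroup filesystem is rebuilt on every boot, so the values have to be written again.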

Now, how do I make this change permanent?

I think the way to go is to fork and clone nixpkgs master somewhere,
then add this location to your Nix path:

export NIX_PATH=mypkgs=/path/to/fork/default.nix:$NIX_PATH

then make your changes, you can have a look at my fork
build it

nix-build  '<mypkgs>' -A lxc

add it to your env

nix-env -f '<mypkgs>' -iA lxc

unfortunately neither the workaround nor the patch works for me which sucks…

As you suggested on IRC:

Try using this patch this way:

  1. Download it: # wget https://github.com/lxc/lxc/commit/b31d62b847a3ee013613795094cce4acc12345ef.patch -O /etc/nixos/cpuset.patch
  2. Override derivation from configuration.nix:

 nixpkgs.config.packageOverrides = super: let self = super.pkgs; in {
   lxc = super.lxc.overrideAttrs (oldAttrs: rec {
     paches = oldAttrs.patches ++ [ /etc/nixos/cpuset.patch ];
   });
 };

Rather than manually downloading it, I’d use fetchpatch and rather than packageOverrides, an overlay:

    nixpkgs.overlays = [
      (self: super: {
        lxc = super.lxc.overrideAttrs (oldAttrs: {
          patches = oldAttrs.patches ++ [
            (self.fetchpatch {
              url = "https://github.com/lxc/lxc/commit/b31d62b847a3ee013613795094cce4acc12345ef.patch";
              sha256 = "0j4ch22l81b20m03l818442ra47hw4k6zyizsn3a0q3gv6mq5h7x";
            })
          ];
        });
      })
    ];

thanks @otwieracz and @JohnAZoidberg for your help!
i settled for a package override with fetchpatch and that worked!

any idea what went wrong with my approach?
would i have had to build/install the lxd package from my nixpkgs fork also?

Thanks very much! @otwieracz you’ve helped a lot but unfortunately packageOverrides doesn’t work for me.
Overlays as @JohnAZoidberg suggested did!


@otwieracz’s packageOverrides contains a typo:
it should be ‘patches’ not ‘paches’
for reference here is mine:

  nixpkgs.config.packageOverrides = super: let self = super.pkgs; in {
    lxc = super.lxc.overrideAttrs (oldAttrs: rec {
      patches = oldAttrs.patches ++ [
        (self.fetchpatch {
          url = "https://github.com/lxc/lxc/commit/b31d62b847a3ee013613795094cce4acc12345ef.patch";
          sha256 = "1jpskr58ih56dakp3hg2yhxgvmn5qidi1vzxw0nak9afbx1yy9d4";
        })
      ];
    });
  };