OK, I gave that a try, but still can’t get it to work. I’ve tried everything I can think of at the moment and need a deeper understanding of LXC to figure this out.
For now I will just go with the simple setup that worked: move /var out of tmpfs and onto its own ZFS dataset.
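That setup is just an ordinary ZFS mount for /var in hardware-configuration.nix, roughly like this (using the rpool/local/var dataset mentioned further down; exact layout may differ on your system):
fileSystems."/var" = {
  device = "rpool/local/var";  # dedicated dataset for /var instead of keeping it on the tmpfs root
  fsType = "zfs";
};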
For future reference, in case anyone else has this problem, here’s a wall of text of what I tried. Don’t feel obligated to read it unless you’re just really curious or are working on the same problem yourself.
- Objective: put all of root / on tmpfs, including /var, while persisting enough of lxc/lxd to rpool/safe/persist so that lxc/lxd continues to work across reboots/rebuilds
- the crucial lxc/lxd locations are:
- /var/lib/lxc
- /var/lib/lxcfs (fuse.lxcfs filesystem for providing host system information to running containers)
- /var/lib/lxd (holds the actual storage pools, containers, and vms)
- /var/log/lxd
- /etc/lxc (system config)
- ~/.config/lxc (unprivileged config)
- Plan:
- persist /var/lib/{lxc,lxd} to /persist/var/lib/{lxc,lxd}
- allow NixOS and lxc/lxd to automatically manage /var/lib/lxcfs (it is auto mounted to device: lxcfs of type: fuse.lxcfs)
- persist /etc/lxc/{default.conf,lxc-usernet,lxc.conf} to /persist/etc/lxc/{default.conf,lxc-usernet,lxc.conf}
Relevant lines in configuration.nix:
{ config, pkgs, ... }:
{
  ...
  virtualisation = {
    containers.enable = true;
    containerd.enable = true;
    libvirtd = {
      enable = true;
      qemuRunAsRoot = false;
    };
    lxd = {
      enable = true;
      zfsSupport = true;
      recommendedSysctlSettings = true;
    };
    lxc = {
      enable = true;
      lxcfs.enable = true;
      systemConfig = ''
        lxc.lxcpath = /var/lib/lxd/containers
        lxc.bdev.zfs.root = rpool/safe/lxd
      '';
    };
  };
  ...
  environment.etc = {
    "lxc/default.conf".source = "/persist/etc/lxc/default.conf";
    "lxc/lxc-usernet".source = "/persist/etc/lxc/lxc-usernet";
    "lxc/lxc.conf".source = "/persist/etc/lxc/lxc.conf";
  };
  ...
  systemd.tmpfiles.rules = [
    "L /var/lib/lxc - - - - /persist/var/lib/lxc"
    "L /var/lib/lxd - - - - /persist/var/lib/lxd"
  ];
  ...
}
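One variant I have not tested, in case LXD is unhappy with /var/lib/lxc and /var/lib/lxd being symlinks: drop the tmpfiles "L" rules and bind-mount the persisted directories instead, so those paths are real directories. Just a sketch:
  # Untested alternative to the tmpfiles symlink rules above:
  # bind-mount the persisted directories so LXD sees real directories.
  fileSystems."/var/lib/lxc" = {
    device = "/persist/var/lib/lxc";
    options = [ "bind" ];
  };
  fileSystems."/var/lib/lxd" = {
    device = "/persist/var/lib/lxd";
    options = [ "bind" ];
  };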
The hardware-configuration.nix for this is pretty standard.
lxd init completes without error and creates the ZFS dataset (rpool/safe/lxd), the lxd storage pool (/var/lib/lxd/storage-pools/lxdpool), and the network bridge (lxdbr0).
But attempting to launch a container results in an error that does not occur when all /var is on disk (on its own ZFS dataset, rpool/local/var).
Full details:
$> sudo lxd init
[sudo] password for bgibson:
Would you like to use LXD clustering? (yes/no) [default=no]:
Do you want to configure a new storage pool? (yes/no) [default=yes]:
Name of the new storage pool [default=default]: lxdpool
Name of the storage backend to use (btrfs, dir, lvm, zfs) [default=zfs]:
Would you like to create a new zfs dataset under rpool/lxd? (yes/no) [default=yes]: no
Create a new ZFS pool? (yes/no) [default=yes]: no
Name of the existing ZFS pool or dataset: rpool/safe/lxd
Would you like to connect to a MAAS server? (yes/no) [default=no]:
Would you like to create a new local network bridge? (yes/no) [default=yes]:
What should the new bridge be called? [default=lxdbr0]:
What IPv4 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]:
What IPv6 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]:
Would you like the LXD server to be available over the network? (yes/no) [default=no]: yes
Address to bind LXD to (not including port) [default=all]:
Port to bind LXD to [default=8443]:
Trust password for new clients:
Again:
Would you like stale cached images to be updated automatically? (yes/no) [default=yes]
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]:
$> sudo lxc launch images:alpine/edge ae-test
Creating ae-test
Starting ae-test
Error: Failed to run: /nix/store/9kp22pvvgn376q6jqhvi8agqwqzbg3a2-lxd-4.14/bin/.lxd-wrapped forkstart ae-test /var/lib/lxd/containers /var/log/lxd/ae-test/lxc.conf:
Try `lxc info --show-log local:ae-test` for more info
$> sudo lxc info --show-log local:ae-test
Name: ae-test
Location: none
Remote: unix://
Architecture: x86_64
Created: 2021/08/10 04:44 UTC
Status: Stopped
Type: container
Profiles: default
Log:
lxc ae-test 20210810044427.200 WARN conf - conf.c:lxc_map_ids:3007 - newuidmap binary is missing
lxc ae-test 20210810044427.201 WARN conf - conf.c:lxc_map_ids:3013 - newgidmap binary is missing
lxc ae-test 20210810044427.203 WARN conf - conf.c:lxc_map_ids:3007 - newuidmap binary is missing
lxc ae-test 20210810044427.203 WARN conf - conf.c:lxc_map_ids:3013 - newgidmap binary is missing
lxc ae-test 20210810044427.204 WARN cgfsng - cgroups/cgfsng.c:fchowmodat:1293 - No such file or directory - Failed to fchownat(43, memory.oom.group, 65536, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
lxc ae-test 20210810044427.249 ERROR conf - conf.c:lxc_setup_rootfs_prepare_root:3437 - Failed to setup rootfs for
lxc ae-test 20210810044427.249 ERROR conf - conf.c:lxc_setup:3600 - Failed to setup rootfs
lxc ae-test 20210810044427.249 ERROR start - start.c:do_start:1265 - Failed to setup container "ae-test"
lxc ae-test 20210810044427.249 ERROR sync - sync.c:sync_wait:36 - An error occurred in another process (expected sequence number 5)
lxc ae-test 20210810044427.256 WARN network - network.c:lxc_delete_network_priv:3621 - Failed to rename interface with index 0 from "eth0" to its initial name "veth2065dbfa"
lxc ae-test 20210810044427.257 ERROR start - start.c:__lxc_start:2073 - Failed to spawn container "ae-test"
lxc ae-test 20210810044427.257 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:868 - Received container state "ABORTING" instead of "RUNNING"
lxc ae-test 20210810044427.257 WARN start - start.c:lxc_abort:1016 - No such process - Failed to send SIGKILL via pidfd 44 for process 7938
lxc ae-test 20210810044427.399 WARN conf - conf.c:lxc_map_ids:3007 - newuidmap binary is missing
lxc ae-test 20210810044427.400 WARN conf - conf.c:lxc_map_ids:3013 - newgidmap binary is missing
lxc 20210810044427.419 ERROR af_unix - af_unix.c:lxc_abstract_unix_recv_fds_iov:207 - Connection reset by peer - Failed to receive response
lxc 20210810044427.420 ERROR commands - commands.c:lxc_cmd_rsp_recv_fds:129 - Failed to receive file descriptors
By comparison, here is what lxd init looks like when all of /var is mounted to disk (rpool/local/var) instead of tmpfs:
> sudo lxd init
[sudo] password for bgibson:
Would you like to use LXD clustering? (yes/no) [default=no]:
Do you want to configure a new storage pool? (yes/no) [default=yes]:
Name of the new storage pool [default=default]: lxdpool
Name of the storage backend to use (dir, lvm, zfs, btrfs) [default=zfs]:
Would you like to create a new zfs dataset under rpool/lxd? (yes/no) [default=yes]: no
Create a new ZFS pool? (yes/no) [default=yes]: no
Name of the existing ZFS pool or dataset: rpool/safe/lxd
Would you like to connect to a MAAS server? (yes/no) [default=no]:
Would you like to create a new local network bridge? (yes/no) [default=yes]:
What should the new bridge be called? [default=lxdbr0]:
What IPv4 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]:
What IPv6 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]:
Would you like the LXD server to be available over the network? (yes/no) [default=no]: yes
Address to bind LXD to (not including port) [default=all]:
Port to bind LXD to [default=8443]:
Trust password for new clients:
Again:
Would you like stale cached images to be updated automatically? (yes/no) [default=yes]
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]:
bgibson@z10pe-d8:~ (*)
> sudo lxc launch images:alpine/edge ae-test
Creating ae-test
Starting ae-test
bgibson@z10pe-d8:~ (*)
> sudo lxc list
+---------+---------+----------------------+----------------------------------------------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+---------+---------+----------------------+----------------------------------------------+-----------+-----------+
| ae-test | RUNNING | 10.227.20.113 (eth0) | fd42:93b1:cd1:18e7:216:3eff:fe6e:b3bd (eth0) | CONTAINER | 0 |
+---------+---------+----------------------+----------------------------------------------+-----------+-----------+