Impermanence, zfs and secrets/vars (clan.lol)

Hey folks,

I’m trying to set up impermanence on my machine. Here is my disko config with the ZFS pool:

{ lib, pkgs, config, ... }:
let
  mirrorBoot = { idx, device }: {
    type = "disk";
    device = device;
    content = {
      type = "gpt";
      partitions = {
        ESP = {
          size = "2G";
          type = "EF00";
          content = {
            type = "filesystem";
            format = "vfat";
            mountpoint = "/boot${idx}";
          };
        };
        zfs = {
          size = "100%";
          content = {
            type = "zfs";
            pool = "zroot";
          };
        };
      };
    };
  };
in
{
  boot.loader.grub = {
    enable = true;
    efiSupport = true;
    efiInstallAsRemovable = true;
    mirroredBoots = [
      {
        path = "/boot0";
        devices = [ "nodev" ];
      }
      {
        path = "/boot1";
        devices = [ "nodev" ];
      }
    ];
  };

  swapDevices = [
    {
      device = "/swapfile";
      size = 8192; # 8 GiB
      randomEncryption = true;
    }
  ];

  disko.devices = {
    disk = {
      x = mirrorBoot {
        idx = "0";
        device = "/dev/disk/by-id/nvme-WD_BLACK_SN850X_1000GB_245261801360";
      };

      y = mirrorBoot {
        idx = "1";
        device = "/dev/disk/by-id/nvme-WD_BLACK_SN850X_1000GB_25097H800196";
      };
    };

    zpool = {
      zroot = {
        type = "zpool";
        mode = "mirror";

        options = {
          ashift = "12";
          autotrim = "on";
        };

        rootFsOptions = {
          canmount = "off";
          checksum = "edonr";
          compression = "zstd";
          dnodesize = "auto";
          mountpoint = "none";
          normalization = "formD";
          relatime = "on";
          "com.sun:auto-snapshot" = "false";
        };

        datasets = {
          "root" = {
            type = "zfs_fs";
            options.mountpoint = "none";
          };

          "root/nixos" = {
            type = "zfs_fs";
            mountpoint = "/";
            postCreateHook = "zfs snapshot zroot/root/nixos@empty";
          };

          "root/nix" = {
            type = "zfs_fs";
            mountpoint = "/nix";
            postCreateHook = "zfs snapshot zroot/root/nix@empty";
          };

          "root/tmp" = {
            type = "zfs_fs";
            mountpoint = "/tmp";
            options.sync = "disabled";
          };

          "root/persist" = {
            type = "zfs_fs";
            mountpoint = "/persist";
            options."com.sun:auto-snapshot" = "true";
          };

          "root/persist/appdata" = {
            type = "zfs_fs";
            mountpoint = "/persist/appdata";
          };

          "root/persist/microvm" = {
            type = "zfs_fs";
            mountpoint = "/persist/microvm";
          };
        };
      };
    };
  };
}

The root dataset is rolled back to the empty snapshot on boot:

boot.initrd.systemd.services.rollback = {
  description = "Rollback root filesystem to a pristine state on boot";
  wantedBy = [
    # "zfs.target"
    "initrd.target"
  ];
  # Run after the pool is imported, but before the root dataset is mounted.
  after = [
    "zfs-import-zroot.service"
  ];
  before = [
    "sysroot.mount"
  ];
  path = with pkgs; [
    zfs
  ];

  unitConfig.DefaultDependencies = "no";
  serviceConfig.Type = "oneshot";
  script = ''
    zfs rollback -r zroot/root/nixos@empty && echo "  >> >> rollback complete << <<"
  '';
};

And this is the persistence config:

environment.persistence."/persist" = {
  directories = [
    "/etc/nixos"
    "/var/lib/nixos"
    "/var/lib/docker"
    "/var/lib/tailscale"
  ];

  files = [
    "/etc/machine-id"
  ];
};

What’s happening now: clan.lol regenerates the OpenSSH host keys on every reboot. The sops templates are also not working properly: they are gone after a reboot, and I have to manually “switch” the config again for them to reappear.

I guess I’m hitting some weird ordering issue, or maybe I’m not persisting something that I should be?


[root@hommy:~]# sshd -T | grep hostkey
hostkeyagent none
hostkeyalgorithms ssh-ed25519-cert-v01@openssh.com,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,sk-ssh-ed25519-cert-v01@openssh.com,sk-ecdsa-sha2-nistp256-cert-v01@openssh.com,rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,ssh-ed25519,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ssh-ed25519@openssh.com,sk-ecdsa-sha2-nistp256@openssh.com,rsa-sha2-512,rsa-sha2-256
hostkey /run/secrets/vars/openssh/ssh.id_ed25519
hostkey /run/secrets/vars/openssh/ssh.id_ed25519

So the host key is different each time: the template folder is gone, and the vars are regenerated.

[root@hommy:~]# ls -la /run/secrets/rendered/
total 12
drwxr-x--x 2 root keys   0 Dec 31 13:17 .
drwxr-x--x 4 root keys   0 Dec 31 13:17 ..
-rw------- 1 root root 123 Dec 31 13:17 caddy.env
-rw------- 1 root root 134 Dec 31 13:17 gatus.env
-rw------- 1 root root 188 Dec 31 13:17 lldap.env

[root@hommy:~]# ls -la /run/secrets/vars/
total 0
drwxr-x--x 5 root keys 0 Dec 31 13:17 .
drwxr-x--x 4 root keys 0 Dec 31 13:17 ..
drwxr-x--x 2 root keys 0 Dec 31 13:17 borgbackup
drwxr-x--x 2 root keys 0 Dec 31 13:17 cloudflare-tunnel
drwxr-x--x 2 root keys 0 Dec 31 13:17 openssh

Here is the boot log:

Dec 31 15:40:40 hommy kernel: rndis_host 7-1.1:2.0 enp73s0f3u1u1c2: renamed from usb0
Dec 31 15:40:40 hommy systemd-networkd[643]: usb0: Interface name change detected, renamed to enp73s0f3u1u1c2.
Dec 31 15:40:40 hommy sshd[684]: Server listening on 0.0.0.0 port 2222.
Dec 31 15:40:40 hommy sshd[684]: Server listening on :: port 2222.
Dec 31 15:40:40 hommy systemd-networkd[643]: enp73s0f3u1u1c2: Configuring with /etc/systemd/network/40-enp73s0f3u1u1c2.network.
Dec 31 15:40:40 hommy systemd-networkd[643]: enp73s0f3u1u1c2: Link UP
Dec 31 15:40:40 hommy systemd-networkd[643]: enp73s0f3u1u1c2: Gained carrier
Dec 31 15:40:40 hommy systemd[1]: Found device HGST_HUH721010ALE600 1.
Dec 31 15:40:40 hommy kernel: usb 7-1.2: new high-speed USB device number 4 using xhci_hcd
Dec 31 15:40:40 hommy systemd[1]: Found device TOSHIBA_HDWG51JUZSVA 1.
Dec 31 15:40:40 hommy systemd[1]: Found device TOSHIBA_HDWG51J 1.
Dec 31 15:40:40 hommy systemd[1]: Found device HGST_HUH721010ALE600 1.
Dec 31 15:40:40 hommy systemd[1]: Starting Cryptography Setup for luks-disk1...
Dec 31 15:40:40 hommy systemd[1]: Starting Cryptography Setup for luks-disk2...
Dec 31 15:40:40 hommy systemd[1]: Starting Cryptography Setup for luks-disk3...
Dec 31 15:40:40 hommy systemd[1]: Starting Cryptography Setup for luks-disk4...
Dec 31 15:40:40 hommy kernel: usb 7-1.2: New USB device found, idVendor=0557, idProduct=9241, bcdDevice= 3.18
Dec 31 15:40:40 hommy kernel: usb 7-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Dec 31 15:40:40 hommy kernel: usb 7-1.2: Product: SMCI HID KM
Dec 31 15:40:40 hommy kernel: usb 7-1.2: Manufacturer: Linux 3.18.0 with ast_vhub
Dec 31 15:40:40 hommy systemd[1]: Starting Cryptography Setup for luks-disk5...
Dec 31 15:40:40 hommy systemd[1]: Starting Import ZFS pool "zroot"...
Dec 31 15:40:40 hommy systemd-cryptsetup[689]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/disk/by-id/ata-TOSHIBA_HDWG51JUZSVA_1440A013FQ3H-part1.
Dec 31 15:40:40 hommy systemd-cryptsetup[692]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/disk/by-id/ata-HGST_HUH721010ALE600_JEKHBM0Z-part1.
Dec 31 15:40:40 hommy systemd-cryptsetup[691]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/disk/by-id/ata-TOSHIBA_HDWG51J_Z2H0A0J7FQ3H-part1.
Dec 31 15:40:40 hommy systemd-cryptsetup[690]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/disk/by-id/ata-TOSHIBA_HDWG51J_9350A00KFQ3H-part1.
Dec 31 15:40:40 hommy systemd-cryptsetup[693]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/disk/by-id/ata-HGST_HUH721010ALE600_16G6901Z-part1.
Dec 31 15:40:40 hommy kernel: hid: raw HID events driver (C) Jiri Kosina
Dec 31 15:40:40 hommy kernel: usbcore: registered new interface driver usbhid
Dec 31 15:40:40 hommy kernel: usbhid: USB HID core driver
Dec 31 15:40:40 hommy kernel: input: Linux 3.18.0 with ast_vhub SMCI HID KM as /devices/pci0000:40/0000:40:08.1/0000:49:00.3/usb7/7-1/7-1.2/7-1.2:1.0/0003:0557:9241.0001/input/input0
Dec 31 15:40:40 hommy kernel: hid-generic 0003:0557:9241.0001: input,hidraw0: USB HID v1.00 Keyboard [Linux 3.18.0 with ast_vhub SMCI HID KM] on usb-0000:49:00.3-1.2/input0
Dec 31 15:40:40 hommy kernel: input: Linux 3.18.0 with ast_vhub SMCI HID KM as /devices/pci0000:40/0000:40:08.1/0000:49:00.3/usb7/7-1/7-1.2/7-1.2:1.1/0003:0557:9241.0002/input/input1
Dec 31 15:40:40 hommy kernel: hid-generic 0003:0557:9241.0002: input,hidraw1: USB HID v1.00 Mouse [Linux 3.18.0 with ast_vhub SMCI HID KM] on usb-0000:49:00.3-1.2/input1
Dec 31 15:40:40 hommy kernel: scsi 18:0:0:0: Direct-Access     JetFlash Transcend 16GB   8.07 PQ: 0 ANSI: 4
Dec 31 15:40:40 hommy kernel: sd 18:0:0:0: [sdg] 30617600 512-byte logical blocks: (15.7 GB/14.6 GiB)
Dec 31 15:40:40 hommy kernel: sd 18:0:0:0: [sdg] Write Protect is off
Dec 31 15:40:40 hommy kernel: sd 18:0:0:0: [sdg] Mode Sense: 23 00 00 00
Dec 31 15:40:40 hommy kernel: sd 18:0:0:0: [sdg] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Dec 31 15:40:40 hommy kernel:  sdg: sdg1 sdg2
Dec 31 15:40:40 hommy kernel: sd 18:0:0:0: [sdg] Attached SCSI removable disk
Dec 31 15:40:40 hommy zfs-import-zroot-start[698]: importing ZFS pool "zroot"...Successfully imported zroot
Dec 31 15:40:40 hommy systemd[1]: Finished Import ZFS pool "zroot".
Dec 31 15:40:40 hommy systemd[1]: Reached target ZFS pool import target.
Dec 31 15:40:40 hommy systemd[1]: Reached target ZFS startup target.
Dec 31 15:40:40 hommy systemd[1]: Starting create-needed-for-boot-dirs.service...
Dec 31 15:40:40 hommy systemd[1]: Starting Rollback root filesystem to a pristine state on boot...
Dec 31 15:40:40 hommy systemd[1]: persist\x2dtmp\x2dmnt-persist.mount: Deactivated successfully.
Dec 31 15:40:40 hommy systemd[1]: create-needed-for-boot-dirs.service: Deactivated successfully.
Dec 31 15:40:40 hommy systemd[1]: Finished create-needed-for-boot-dirs.service.
Dec 31 15:40:41 hommy rollback-start[1278]:   >> >> rollback complete << <<
Dec 31 15:40:41 hommy systemd[1]: rollback.service: Deactivated successfully.
Dec 31 15:40:41 hommy systemd[1]: Finished Rollback root filesystem to a pristine state on boot.
Dec 31 15:40:41 hommy systemd[1]: Mounting /sysroot...
Dec 31 15:40:41 hommy systemd[1]: Mounted /sysroot.
Dec 31 15:40:41 hommy systemd[1]: Reached target Initrd Root File System.
Dec 31 15:40:41 hommy systemd[1]: Mounting /sysroot/nix...
Dec 31 15:40:41 hommy systemd[1]: Mounting /sysroot/persist...
Dec 31 15:40:41 hommy systemd[1]: Mounting /sysroot/run...
Dec 31 15:40:41 hommy systemd[1]: Starting Mountpoints Configured in the Real Root...
Dec 31 15:40:41 hommy systemd-sysroot-fstab-check[1447]: /sysroot should be mounted in the initrd, will request daemon-reload.
Dec 31 15:40:41 hommy systemd[1]: Mounted /sysroot/run.
Dec 31 15:40:41 hommy systemd[1]: Reload requested from client PID 1447 ('systemd-sysroot') (unit initrd-parse-etc.service)...
Dec 31 15:40:41 hommy systemd[1]: Reloading...
Dec 31 15:40:41 hommy systemd-networkd[643]: enp73s0f3u1u1c2: Gained IPv6LL
Dec 31 15:40:41 hommy systemd[1]: Reloading finished in 363 ms.
Dec 31 15:40:41 hommy systemd-sysroot-fstab-check[1447]: Requesting initrd-fs.target/start/replace...
Dec 31 15:40:41 hommy systemd[1]: Mounted /sysroot/nix.
Dec 31 15:40:41 hommy systemd[1]: Mounted /sysroot/persist.
Dec 31 15:40:41 hommy systemd-sysroot-fstab-check[1447]: Requesting swap.target/start/replace...
Dec 31 15:40:41 hommy systemd[1]: Starting Find NixOS closure...

The impermanence stuff should have nothing to do with the sops-nix stuff. The sops-nix files are populated during boot; they’re not really stateful in a way that your rollback service would affect. On most systems, sops-nix should populate those files during activation, whose logs you can see with journalctl -b0 -u initrd-nixos-activation.service (on systems using sysusers or userborn, sops-nix does this in sops-install-secrets.service in stage 2 instead).

Turns out I need to persist `/var/lib/sops-nix` and mark it as needed for boot:

fileSystems."/var/lib/sops-nix".neededForBoot = true;

directories = [
  "/etc/nixos"
  "/var/lib/nixos"
  "/var/lib/docker"
  "/var/lib/tailscale"
  "/var/lib/sops-nix"
];
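
For anyone landing here later, this is a sketch of how the two pieces above fit together in one module. My understanding (an assumption, I have not traced it through the clan or sops-nix sources) is that the key sops-nix decrypts with lives under `/var/lib/sops-nix`, so the bind mount has to exist before secrets are installed early in boot; without `neededForBoot` impermanence only sets it up later. The `fileSystems."/persist"` line is also an assumption on my part: the boot log shows `/sysroot/persist` being mounted in the initrd, so something in my config already marks it, but I spell it out here.

```nix
{ ... }:
{
  # Assumption: the persistent dataset itself must be available in the initrd.
  # disko generates the fileSystems entry; this only adds the flag to it.
  fileSystems."/persist".neededForBoot = true;

  environment.persistence."/persist" = {
    directories = [
      "/etc/nixos"
      "/var/lib/nixos"
      "/var/lib/docker"
      "/var/lib/tailscale"
      # New: keep the sops-nix state across the rollback of zroot/root/nixos.
      "/var/lib/sops-nix"
    ];

    files = [
      "/etc/machine-id"
    ];
  };

  # impermanence turns each persisted directory into a bind mount, i.e. a
  # fileSystems entry, so the flag can be set on that entry directly.
  fileSystems."/var/lib/sops-nix".neededForBoot = true;

  # Untested alternative that I have seen suggested for impermanence setups:
  # point sops-nix at the key on the persistent dataset instead of relying on
  # the bind mount. The file name below is hypothetical; check where your key
  # actually lives before using it.
  # sops.age.keyFile = "/persist/var/lib/sops-nix/key.txt";
}
```

After a rebuild and reboot, `ls -la /run/secrets/rendered/` and `sshd -T | grep hostkey` are the same checks I used above to confirm that the templates survive and the host key stays stable.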