Impermanence and hibernation

Hi there,

I’ve installed NixOS + Impermanence (using btrfs snapshots) on my main desktop and after initial setup it works pretty much flawlessly.

I’m now in the planning stages of moving my laptop to basically the same setup, and one question came up regarding Impermanence + hibernation. During the week I like to hibernate my laptop and only do one update + full reboot at the end of the week. The way I understand the root filesystem reset suggested in the Impermanence repo, it will reset root also when resuming from hibernation, correct?

{
  inputs,
  lib,
  ...
}: {
  imports = [
    inputs.impermanence.nixosModules.impermanence
  ];

  boot.initrd.postDeviceCommands = lib.mkAfter ''
    mkdir /btrfs_tmp
    mount /dev/mapper/cryptroot /btrfs_tmp
    if [[ -e /btrfs_tmp/root ]]; then
        mkdir -p /btrfs_tmp/old_roots
        timestamp=$(date --date="@$(stat -c %Y /btrfs_tmp/root)" "+%Y-%m-%d_%H:%M:%S")
        mv /btrfs_tmp/root "/btrfs_tmp/old_roots/$timestamp"
    fi

    delete_subvolume_recursively() {
        IFS=$'\n'
        for i in $(btrfs subvolume list -o "$1" | cut -f 9- -d ' '); do
            delete_subvolume_recursively "/btrfs_tmp/$i"
        done
        btrfs subvolume delete "$1"
    }

    for i in $(find /btrfs_tmp/old_roots/ -maxdepth 1 -mtime +30); do
        delete_subvolume_recursively "$i"
    done

    btrfs subvolume create /btrfs_tmp/root
    umount /btrfs_tmp
  '';
}
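As an aside, two of the shell idioms in that script can be checked in isolation. This is just a sketch: the `btrfs subvolume list` line below is a representative example of the output format, and plain directories stand in for btrfs subvolumes in the pruning check.

```shell
# 1. Path extraction: `btrfs subvolume list -o` prints lines like the one
#    below, and the subvolume path is everything from the 9th
#    space-separated field onward (so paths containing spaces survive).
line='ID 257 gen 10 top level 5 path root/srv'
echo "$line" | cut -f 9- -d ' '   # prints: root/srv

# 2. The 30-day pruning: `-mtime +30` matches entries whose mtime is more
#    than 30 full days in the past. Plain directories stand in for the
#    old root subvolumes here.
tmp=$(mktemp -d)
mkdir -p "$tmp/old_roots/fresh" "$tmp/old_roots/stale"
touch -d "40 days ago" "$tmp/old_roots/stale"
find "$tmp/old_roots/" -maxdepth 1 -mtime +30   # prints only .../stale
rm -rf "$tmp"
```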

Assuming my understanding is correct, I would like to avoid the reset on resume to prevent any weird issues for running applications with open files. In that case, what would be an alternative way to reset on reboot but not on resume?

Thanks :slight_smile:

Have you figured out the problem yet?

I’ve switched to a systemd-based initrd with a unit doing the rollback, and hibernation works as expected:

{
  boot.initrd.systemd.services.recreate-root = {
    description = "Rolling over and creating new filesystem root";

    wantedBy = [ "initrd.target" ];
    requires = [ "initrd-root-device.target" ];
    after = [ "initrd-root-device.target" ];
    before = [
      "sysroot.mount"
      "create-needed-for-boot-dirs.service"
    ];

    unitConfig.DefaultDependencies = "no";
    serviceConfig.Type = "oneshot";

    script = ''
      mkdir /btrfs_tmp
      mount /dev/mapper/cryptroot /btrfs_tmp
      if [[ -e /btrfs_tmp/root ]]; then
        mkdir -p /btrfs_tmp/old_roots
        timestamp=$(date --date="@$(stat -c %Y /btrfs_tmp/root)" "+%Y-%m-%d_%H:%M:%S")
        mv /btrfs_tmp/root "/btrfs_tmp/old_roots/$timestamp"
      fi

      delete_subvolume_recursively() {
        IFS=$'\n'
        for i in $(btrfs subvolume list -o "$1" | cut -f 9- -d ' '); do
          delete_subvolume_recursively "/btrfs_tmp/$i"
        done
        btrfs subvolume delete "$1"
      }

      for i in $(find /btrfs_tmp/old_roots/ -maxdepth 1 -mtime +30); do
        delete_subvolume_recursively "$i"
      done

      btrfs subvolume create /btrfs_tmp/root
      umount /btrfs_tmp
    '';
  };
}
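As an aside, the `old_roots` naming in that script can be checked in isolation. A minimal sketch with a pinned mtime, and TZ fixed to UTC so the output is deterministic:

```shell
# Recreate the timestamp naming from the script: take a file's mtime with
# stat and format it with date. The mtime is pinned here instead of being
# whatever the real root subvolume has.
tmp=$(mktemp -d)
touch -d "2023-11-14 22:13:20 UTC" "$tmp/root"
epoch=$(stat -c %Y "$tmp/root")
TZ=UTC date --date="@$epoch" "+%Y-%m-%d_%H:%M:%S"   # prints: 2023-11-14_22:13:20
rm -rf "$tmp"
```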

@lucassong3000 Nowadays the answer is to use postResumeCommands instead of postDeviceCommands. Or do something with systemd initrd like @tschan did.
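A minimal sketch of that option, assuming the scripted (non-systemd) initrd and the same device name as in the thread; the body is the rollback script from above, unchanged:

```nix
{ lib, ... }: {
  # Runs after the resume attempt, so a resume from hibernation
  # never sees a wiped root.
  boot.initrd.postResumeCommands = lib.mkAfter ''
    mkdir /btrfs_tmp
    mount /dev/mapper/cryptroot /btrfs_tmp
    # ...rollback script as above...
    umount /btrfs_tmp
  '';
}
```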

But this doesn’t look quite right to me. The systemd-hibernate-resume.service unit isn’t ordered before initrd-root-device.target. You probably want to order after local-fs-pre.target (or just directly after systemd-hibernate-resume.service) to get it right.
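Concretely, for the systemd-initrd route, a sketch of just the ordering change being suggested (the rest of the recreate-root unit stays as above):

```nix
{
  boot.initrd.systemd.services.recreate-root = {
    # ...unit as above, with local-fs-pre.target added to the ordering:
    after = [
      "initrd-root-device.target"
      # systemd-hibernate-resume.service is ordered before this target,
      # so the rollback can no longer race the resume from hibernation.
      "local-fs-pre.target"
    ];
  };
}
```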

Not to make matters worse, but I have an Nvidia graphics card that I successfully compiled the driver for. How would that fit into the picture? I mean, at the moment everything works with CPU only, running on Nvidia Optimus.

I think keeping it that way is the only wise choice, because with an Nvidia GPU, sleep won’t recover from a blank screen.

I’ve been using it for close to a year now like this and the service behaves as expected. Resets the root filesystem on cold boot but doesn’t do so on wake from hibernation. I think I’ll leave it as is, never touch a working system and all that.

Edit: I just got a forum achievement for using my first quote, I feel so proud :joy:

I’m telling you, what you have only works because of luck. Simply add "local-fs-pre.target" to the after list and you’ll avoid the system crashing and likely corrupting your file system irrecoverably.


It doesn’t look like the recreate-root unit gets executed on resume at all right now. What would adding local-fs-pre.target to the after list accomplish exactly? Honest question, I’m not very familiar with how systemd executes which units in which order on hibernate and then resume. Looking at this, it seems like I would have to opt the unit into being executed on resume by adding a WantedBy=suspend.target. I’m not doing that, though, so is there really a problem at all?

That’s for units that should run as the system is going into suspend (and in case there’s any confusion, suspend and hibernate are different; suspend just sleeps the hardware, while hibernate saves the RAM to disk and powers the machine off).

Your unit runs in initrd as the system boots up. When the system resumes from hibernation, it goes through an ordinary bootup process, including initrd and things like prompting for the disk encryption password, until the initrd triggers a “resume” operation in the kernel, which freezes userspace, reads the hibernation image from disk, and resumes from the state the system was hibernated from. That’s what systemd-hibernate-resume.service does. If your unit starts before systemd-hibernate-resume.service has finished, you risk corrupting your file system irrecoverably, because the hibernation image on disk expects your file system to be completely untouched since the system was hibernated.

As for how systemd orders units starting, it’s fairly simple. Every scheduled unit (so anything that’s wanted by anything else that’s scheduled, originating with default.target) is started as soon as all of its dependencies are done. So anything that a unit is ordered After has to be done, and any other unit that is ordered Before that unit has to be done. As soon as that condition is met, systemd starts the unit. In your case, those conditions are met effectively simultaneously for systemd-hibernate-resume.service and your recreate-root.service, so they will be started at the same time.

The reason you haven’t seen any problems is because systemd-hibernate-resume.service is extremely fast, so it is unlikely that your unit makes any progress before the kernel freezes all of userspace to resume from the hibernation image. But eventually, something will happen that causes some kind of hiccup and you will have mounted the file system and possibly even begun modifying it before userspace is frozen, causing irrecoverable damage to your file system. This can be prevented by simply adding "local-fs-pre.target" to the after list, because systemd-hibernate-resume.service is ordered before local-fs-pre.target. Even if I’m wrong about something here, the worst that can happen by adding this dependency is that boot gets delayed by a few milliseconds.

That makes sense, thanks for the detailed explanation. I do have one more question, though. The documentation for local-fs-pre.target says the following:

This target unit is automatically ordered before all local mount points marked with auto (see above). It can be used to execute certain units before all local mounts.

So does that mean if I have

{
  after = [ "local-fs-pre.target" ];
  before = [ "sysroot.mount" ];
}

my unit would be started basically “between” them? local-fs-pre.target should be ordered before sysroot.mount, right?

Exactly, and that’s precisely what you want.

Perfect, thanks again! So this is my final unit now:

{
  boot.initrd.systemd.services.recreate-root = {
    description = "Rolling over and creating new filesystem root";

    wantedBy = [ "initrd.target" ];
    requires = [ "initrd-root-device.target" ];
    after = [
      "initrd-root-device.target"
      "local-fs-pre.target"
    ];
    before = [
      "sysroot.mount"
      "create-needed-for-boot-dirs.service"
    ];

    unitConfig.DefaultDependencies = "no";
    serviceConfig.Type = "oneshot";

    script = ''
      mkdir /btrfs_tmp
      mount /dev/mapper/cryptroot /btrfs_tmp
      if [[ -e /btrfs_tmp/root ]]; then
        mkdir -p /btrfs_tmp/old_roots
        timestamp=$(date --date="@$(stat -c %Y /btrfs_tmp/root)" "+%Y-%m-%d_%H:%M:%S")
        mv /btrfs_tmp/root "/btrfs_tmp/old_roots/$timestamp"
      fi

      delete_subvolume_recursively() {
        IFS=$'\n'
        for i in $(btrfs subvolume list -o "$1" | cut -f 9- -d ' '); do
          delete_subvolume_recursively "/btrfs_tmp/$i"
        done
        btrfs subvolume delete "$1"
      }

      for i in $(find /btrfs_tmp/old_roots/ -maxdepth 1 -mtime +30); do
        delete_subvolume_recursively "$i"
      done

      btrfs subvolume create /btrfs_tmp/root
      umount /btrfs_tmp
    '';
  };
}

Wait, does it work with the Nvidia driver too?

The Nvidia driver should have nothing to do with it, I’m pretty sure.

EDIT: Well, other than the possibility that the Nvidia driver may be buggy and may not restore properly after a resume. But I have no idea about the state of Nvidia in that situation, and it wouldn’t be the sort of thing you could really do anything about.

I’m not sure it’s the driver that is buggy, but if, for instance, your Xwayland is rendered by the Nvidia GPU, then you won’t be able to recover from a blank screen after sleep, even if Nvidia power management is disabled in the Nix config.

Which, if you think about it, is strange enough. Or you might just have to let the CPU do almost everything on the system, which really sucks. But hopefully Nvidia on Linux will keep changing for the better, since they open-sourced their kernel module to the Linux community.