NixOS rebuild and unmounting datasets are delayed by zfs_txg_timeout

Does anyone have experience with NixOS on ZFS with a high zfs_txg_timeout (the ZFS module parameter)?
(Yes, I do have a UPS. No, I would rather not set it to a low value.)

The problem is that at the end of nixos-rebuild switch/boot, adding the new generation is delayed by up to zfs_txg_timeout. For example, if zfs_txg_timeout is 60 seconds, then even a very simple rebuild takes 60 seconds longer. It is apparently waiting for changes to get committed to disk, but perhaps isn’t using sync writes to accomplish it (?), so it waits around until ZFS itself decides to commit the current transaction group.

The same problem happens when shutting down. Unmounting the zfs datasets takes around zfs_txg_timeout seconds, which is zfs_txg_timeout seconds more than it needs. Any ideas on what to do with this?

Calling zpool sync (from another terminal) when it gets “stuck” like this allows it to continue (which is proof it waits around until zfs commits current writes). Perhaps zpool sync can be set up to be called every time the configuration is being saved and every time the datasets are unmounting during shutdown? But I don’t know if that is just a crude fix that can be done better some other way.
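
As a rough sketch of that crude fix, a wrapper could keep calling zpool sync in the background while another command runs, so the command never has to wait out a full zfs_txg_timeout. The function name with_periodic_sync and the SYNC_INTERVAL variable are made up for this example; it assumes it runs as root and that zpool is on the PATH.

```shell
#!/bin/sh
# Crude workaround sketch: run a command while nudging ZFS in the background.
# SYNC_INTERVAL is a hypothetical knob introduced for this example.
SYNC_INTERVAL=${SYNC_INTERVAL:-2}

# with_periodic_sync CMD [ARGS...]
# Runs CMD in the foreground and calls `zpool sync` every SYNC_INTERVAL
# seconds until CMD exits. Returns CMD's exit status.
with_periodic_sync() {
    (
        while :; do
            sleep "$SYNC_INTERVAL"
            zpool sync
        done
    ) >/dev/null 2>&1 &
    syncer=$!

    "$@"
    cmd_status=$?

    # Stop the background loop; a leftover sleep exits on its own shortly after.
    kill "$syncer" 2>/dev/null || true
    wait "$syncer" 2>/dev/null || true
    return "$cmd_status"
}

# Example (as root):
#   with_periodic_sync nixos-rebuild switch
```

This only papers over the wait. It doesn’t change why the rebuild blocks in the first place.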

I have also been able to determine that the part of nixos-rebuild that is delayed by this is nix-env -p … --set …, and that the sync command, when run as root, is slowed down in a similar way (even if there are no writes to do).
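
For anyone who wants to check whether they are hitting the same thing, a small timing sketch along these lines might help. It assumes OpenZFS on Linux, where the parameter is readable under /sys/module/zfs/parameters; the helper names txg_timeout and elapsed_seconds and the TXG_PARAM variable are invented for the example.

```shell
#!/bin/sh
# Sketch: compare how long a command takes with the current zfs_txg_timeout.
# TXG_PARAM is a hypothetical variable name; the path is where OpenZFS on
# Linux exposes the module parameter.
TXG_PARAM=${TXG_PARAM:-/sys/module/zfs/parameters/zfs_txg_timeout}

# Print the current txg timeout in seconds, or "unknown" if it is not readable.
txg_timeout() {
    if [ -r "$TXG_PARAM" ]; then
        cat "$TXG_PARAM"
    else
        echo unknown
    fi
}

# Run a command (its output is discarded) and print the whole seconds it took.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# When called with a command, time it and show the timeout next to the result.
# Example (as root): sh measure.sh sync
if [ "$#" -gt 0 ]; then
    echo "zfs_txg_timeout: $(txg_timeout)"
    echo "'$*' took $(elapsed_seconds "$@")s"
fi
```

If the reported time stays close to zfs_txg_timeout even when there is nothing to write, that would match the behavior described above.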

I suspect that updating the nix store sqlite database may be the underlying cause in the case of a configuration switch. There is a particular sequence of operations in sqlite for locking and opening the database that results in a sync and then a need to wait for the next txg to close, which takes up to 5s by default and longer in your case:

and it shows up as similar delays for several applications, e.g.

For shutdown, you could add a zpool sync to the relevant systemd unit?

Thanks for showing me the relevant issues. I was having a very hard time trying to find anything mentioning the problem. It’s a shame it is an unsolved problem.

Speaking of the systemd unit, what should I look for? All I have found is “zfs-mount.service” and a bunch of “***.mount” services(?) but I haven’t found anything called “umount” or “unmount”, which is what I’d expect to find.

There is a zfs-sync service. It doesn’t actually call zpool sync; instead it sets a custom property on the root dataset of the pool, which I assume is supposed to trigger a txg close as part of export. You could try adding an explicit zpool sync there.
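
If you try that, the explicit sync could be as small as the sketch below. sync_all_pools is a made-up helper name, and how to hook it into the zfs-sync unit is not shown, since that depends on how the NixOS module defines the service. Plain zpool sync with no arguments already syncs every imported pool; the loop here only adds a log line per pool.

```shell
#!/bin/sh
# Sketch: force a txg commit on every imported pool, e.g. from a shutdown hook.
# sync_all_pools is a hypothetical helper name for this example.
sync_all_pools() {
    # -H drops the header line, -o name prints only the pool names.
    zpool list -H -o name | while read -r pool; do
        echo "syncing $pool"
        zpool sync "$pool"
    done
}

# Example (as root):
#   sync_all_pools
```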