How do you treat the data files in a server NixOS for backup/versioning?

hypersw · October 17, 2019, 2:02am

I really enjoy making small single-task server systems with NixOS. All you need is one well-structured config file which fully describes the system: no ad-hoc modifications by running random commands inside the system, no tricky backups, just render a new system from the config when (or if) you need it anew.

All is great until the server allows configuring it through its Web UI. Sometimes that’s even essential, either for things that don’t map into text very well, or for edits that are too frequent and minor to go through a rebuild every time. The problem is that the system is no longer a pure function of the configuration, and cannot be thrown away and rebuilt from scratch.

You have to back up the state somehow. Obviously you do not want to back up the whole system anyway. What are the good practices for backing up select data folders within an otherwise immutable NixOS system? Let’s assume there are only small text files for now.

I’ve heard that bigger systems in the clouds (usually assembled with Docker) would mount an additional versioned S3 volume for storing the data. In my case that’s overkill for a few config files. More importantly, this moves up and out of the OS level and cannot be set up from within the nix config for the system: you have to involve VM configuration, which is another level of abstraction and another set of scripts. Or manual work. Not nice.

Other ways might be mounting an NFS volume (can be handled in OS config), setting up rsync, or making a git repo in the data folder with automated pull-commit-push-merge operations. The latter feels especially good in that it would get your text files properly versioned in a very usable fashion, but it’s easy to get things wrong when setting up from scratch.
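
To make the git idea a bit more concrete, here is a minimal sketch of how it might look from within the NixOS config. It assumes the data folder is already a git repository with a remote named origin that root can push to over SSH; the path /var/lib/myservice, the unit name data-snapshot, and the committer identity are made-up placeholders. It only does the commit-and-push half, with no pull or merge handling.

```nix
{ pkgs, ... }:
{
  # Hypothetical unit name; the repository itself is assumed to exist already.
  systemd.services.data-snapshot = {
    description = "Commit and push changes in the data folder";
    path = [ pkgs.git pkgs.openssh ];
    startAt = "hourly"; # NixOS generates the matching timer unit
    serviceConfig = {
      Type = "oneshot";
      WorkingDirectory = "/var/lib/myservice"; # placeholder data folder
    };
    script = ''
      git add --all
      # Commit only when something is actually staged.
      git diff --cached --quiet || git \
        -c user.name=snapshot -c user.email=snapshot@localhost \
        commit -m "snapshot $(date -Is)"
      git push origin HEAD
    '';
  };
}
```

If the push fails (say, the remote is unreachable), the unit fails and the next run retries, so local commits pile up until the remote is back.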

So what is the common practice? Are there any existing tools which go well with NixOS for managing the mutable part of its state?

hypersw · November 8, 2021, 3:37am

Came by to drop a note.
My current solution is to mount a directory out of the NAS storage, with sshfs.

So there’s the NAS machine holding all of the persistent data.
This machine has SSHable users for representing these data folders (separate users for important stuff or a user for a group of things not worth isolating), with public keys.
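
If the NAS happened to run NixOS as well (an assumption, it doesn’t have to), one such per-folder user could be sketched like this. The user name appdata, the home path, and the key string are placeholders.

```nix
{
  services.openssh.enable = true;

  # One account per data folder; the name and path are hypothetical.
  users.users.appdata = {
    isNormalUser = true;
    home = "/srv/data/appdata";
    createHome = true;
    openssh.authorizedKeys.keys = [
      # Placeholder: the public key of the client machine goes here.
      "ssh-ed25519 AAAA... root@appserver"
    ];
  };
}
```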

Machines which need this working data only have regular nixos stuff on their own disks (basically, all throw-away). They’d mount data folders through sshfs with fstab (fileSystems in nixos config) under the proper user. sshfs does uid/gid translation (the NAS’s system sees its own user, and the mounted stuff appears as if owned by the local user), can recover/reconnect, and works well enough.
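
A minimal sketch of what such a fileSystems entry might look like. The host name, paths, key location, and uid/gid numbers are placeholders, and the keepalive values are just plausible picks rather than the exact settings used here.

```nix
{ pkgs, ... }:
{
  # Makes the sshfs mount helper available to mount(8).
  system.fsPackages = [ pkgs.sshfs ];

  fileSystems."/var/lib/myservice" = {
    device = "appdata@nas.example:/srv/data/appdata"; # placeholder user, host, path
    fsType = "sshfs";
    options = [
      "_netdev"             # treat as a network filesystem
      "x-systemd.automount" # mount on first access
      "allow_other"         # let non-root local users see the mount
      "IdentityFile=/root/.ssh/id_ed25519" # placeholder key path
      "idmap=user"          # map the remote login user to the local uid
      "uid=1000"            # placeholder local owner
      "gid=100"
      "reconnect"           # re-establish the session after a drop
      "ServerAliveInterval=15"
      "ServerAliveCountMax=3"
    ];
  };
}
```

Since the mount runs non-interactively, the NAS host key has to be known to root beforehand (for example through programs.ssh.knownHosts), otherwise ssh has nobody to ask and the mount just fails.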

It was a bit tricky to set up because the recovery options have awful default values, and there are no diagnostics when running in fstab mode: it just fails silently.
Also, for many services you need to make them wait for the remote mounts (as in systemd.services.servicename.after = lib.mkAfter [ "remote-fs.target" ]), otherwise they might start before their data folder is available.
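
Spelled out as a config fragment, the ordering could look like the sketch below. The RequiresMountsFor line is an addition not mentioned above: it is a standard systemd unit setting that ties the service to the mount unit for a given path, and /var/lib/servicename is a placeholder for wherever the data folder is mounted.

```nix
{ lib, ... }:
{
  systemd.services.servicename = {
    # Start only after remote filesystems have been brought up.
    after = lib.mkAfter [ "remote-fs.target" ];
    # Additionally require the specific mount backing the data folder
    # (placeholder path).
    unitConfig.RequiresMountsFor = "/var/lib/servicename";
  };
}
```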

Another good candidate is SMB/CIFS, but I was unable to solve all the problems there. File modes sometimes get assigned randomly, and saving a file in a way that involves an atomic move keeps failing. Also, the share would just disappear from time to time without reconnecting. SMB should be faster than SSHFS, though I haven’t run into SSHFS speed as a real problem yet. SSH auth also looks nicer (keys, not passwords, and I believe SMB passwords cannot be set up from nixos config).
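
For comparison, a CIFS mount in the NixOS config might be sketched as below. The share name, mount point, ids, and the credentials file path are placeholders. The credentials= option of mount.cifs reads the username and password from a file, which at least keeps the password out of the config itself, although the file still has to be put in place by some other means. The file_mode and dir_mode options force fixed permissions; this is not a confirmed fix for the atomic-move or disappearing-share problems.

```nix
{ pkgs, ... }:
{
  environment.systemPackages = [ pkgs.cifs-utils ];

  fileSystems."/mnt/share" = {
    device = "//nas.example/share"; # placeholder host and share
    fsType = "cifs";
    options = [
      # File with username=... and password=... lines, readable by root only
      # (placeholder path, created outside of the nix config).
      "credentials=/etc/nixos/smb-secrets"
      "uid=1000"        # placeholder local owner
      "gid=100"
      "file_mode=0640"  # fixed modes instead of server-reported ones
      "dir_mode=0750"
      "_netdev"
      "noauto"
      "x-systemd.automount"
    ];
  };
}
```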

I’d also like to try the GVFS SFTP option for SSH, but haven’t figured out how to fstab it yet.

Git FS must be an interesting option, though my data folders mostly happen to be machine-written non-text files of considerable size, so I was not really interested in versioning them yet. Should give it a try too.

NFS didn’t look so bright. Even though it’s got to be the fastest option, it has no auth at all (in this ZeroTrust era, heh), and adding auth as a separate layer over it kind of undermines its “directness”. It also means a direct match of UIDs/GIDs between the two systems, which I don’t see how to arrange for two totally unrelated and separately managed systems. Do you have a success story with it?