How do I configure the NFS server to start after the required filesystem mounts are ready?

Hi,
I am using NixOS on a simple home server with some NFS shares configured. On bootup, the nfs-server.service systemd unit attempts to start before the required filesystems are mounted, which causes the NFS server to fail to start, and I have to start it manually (sudo systemctl start nfs-server.service).

How do I configure nfs-server.service to start after the required mounts are ready?

Relevant configuration

  fileSystems."/storage" = {
    device = "/dev/disk/by-uuid/xxxx";
    fsType = "btrfs";
    options = [ "subvol=@storage" "noatime" "nofail" "compress=zstd" ];
  };

  services.nfs.server = {
    enable = true;
    # Fixed ports for firewall.
    statdPort = 4000;
    lockdPort = 4001;
    mountdPort = 4002;

    exports = ''
      /storage/Shared    *(ro,insecure,all_squash)
    '';
  };

Failed NFS server unit status

On a fresh boot which exhibits this problem:

> systemctl status nfs-server.service
× nfs-server.service - NFS server and services
     Loaded: loaded (/etc/systemd/system/nfs-server.service; enabled; vendor preset: enabled)
    Drop-In: /nix/store/ygcbm4dnwns4hzkbyx3iqi74r7h9ys1k-system-units/nfs-server.service.d
             └─overrides.conf
     Active: failed (Result: exit-code) since Mon 2022-10-17 12:56:32 BST; 1min 30s ago
    Process: 1742 ExecStartPre=/nix/store/qw86hay4s0f8r0p98n074p8n3mb14hk3-nfs-utils-2.5.1/bin/exportfs -r (code=exited, status=1/FAILURE)
    Process: 1743 ExecStopPost=/nix/store/qw86hay4s0f8r0p98n074p8n3mb14hk3-nfs-utils-2.5.1/bin/exportfs -au (code=exited, status=0/SUCCESS)
    Process: 1744 ExecStopPost=/nix/store/qw86hay4s0f8r0p98n074p8n3mb14hk3-nfs-utils-2.5.1/bin/exportfs -f (code=exited, status=0/SUCCESS)
         IP: 0B in, 0B out
        CPU: 3ms

Oct 17 12:56:32 box systemd[1]: Starting NFS server and services...
Oct 17 12:56:32 box exportfs[1742]: exportfs: /etc/exports [1]: Neither 'subtree_check' or 'no_subtree_check' specified for export "*:/storage/Shared".
Oct 17 12:56:32 box exportfs[1742]:   Assuming default behaviour ('no_subtree_check').
Oct 17 12:56:32 box exportfs[1742]:   NOTE: this default has changed since nfs-utils version 1.0.x
Oct 17 12:56:32 box exportfs[1742]: exportfs: Failed to stat /storage/Shared: No such file or directory
Oct 17 12:56:32 box systemd[1]: nfs-server.service: Control process exited, code=exited, status=1/FAILURE
Oct 17 12:56:32 box systemd[1]: nfs-server.service: Failed with result 'exit-code'.
Oct 17 12:56:32 box systemd[1]: Stopped NFS server and services.

Checking the obvious first: does /storage/Shared exist? If yes, I’d check which user runs nfsd and whether that user has read access.

/storage/Shared does exist, but only after /storage is mounted. /storage is mounted automatically, but that seems to take a moment and usually completes after NFS has already attempted to start.

If I log in and manually start NFS (sudo systemctl start nfs-server.service) then it starts just fine, as by that time /storage has finished mounting.

Some boot logs

-- Boot 8175d13f68634dab8fd594af761885ac --
Oct 17 12:56:18 box systemd[1]: Mounting /storage...
Oct 17 12:56:32 box systemd[1]: Starting NFS server and services...
Oct 17 12:56:32 box exportfs[1742]: exportfs: /etc/exports [1]: Neither 'subtree_check' or 'no_subtree_check' specified for export "*:/storage/Shared".
Oct 17 12:56:32 box exportfs[1742]:   Assuming default behaviour ('no_subtree_check').
Oct 17 12:56:32 box exportfs[1742]:   NOTE: this default has changed since nfs-utils version 1.0.x
Oct 17 12:56:32 box exportfs[1742]: exportfs: Failed to stat /storage/Shared: No such file or directory
Oct 17 12:56:32 box systemd[1]: nfs-server.service: Control process exited, code=exited, status=1/FAILURE
Oct 17 12:56:32 box systemd[1]: nfs-server.service: Failed with result 'exit-code'.
Oct 17 12:56:32 box systemd[1]: Stopped NFS server and services.
Oct 17 12:56:45 box systemd[1]: Mounted /storage.

I found a way to manually add a dependency between the systemd units.

  systemd.services.nfs-server = {
    after = [ "storage.mount" ];
  };
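
A slightly stronger variant (my own untested sketch, not what I’m actually running) would also add a requires entry, so the service fails along with the mount instead of starting against an empty directory:

  systemd.services.nfs-server = {
    # Ordering only: wait until storage.mount has finished activating
    # (successfully or not) before starting NFS.
    after = [ "storage.mount" ];
    # Hard dependency: if the mount fails, don't start NFS at all.
    requires = [ "storage.mount" ];
  };

(The boot log below is with just the after= version.)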

Now, my boot sequence looks much better:

> sudo journalctl --boot --unit nfs-server.service --unit storage.mount
Oct 17 16:09:19 box systemd[1]: Mounting /storage...
Oct 17 16:09:47 box systemd[1]: Mounted /storage.
Oct 17 16:09:47 box systemd[1]: Starting NFS server and services...
Oct 17 16:09:47 box exportfs[1778]: exportfs: /etc/exports [1]: Neither 'subtree_check' or 'no_subtree_check' specified for export "*:/storage/Shared".
Oct 17 16:09:47 box exportfs[1778]:   Assuming default behaviour ('no_subtree_check').
Oct 17 16:09:47 box exportfs[1778]:   NOTE: this default has changed since nfs-utils version 1.0.x
Oct 17 16:09:48 box systemd[1]: Finished NFS server and services.

Problem Solved!


The solution I have feels a little fragile. I’ve noticed that nixos-rebuild succeeds even if I get the name of the nfs-server systemd unit wrong, so I assume my fix will break if the services.nfs.server logic is ever changed and those systemd units renamed.

Is there a way to write my fix in a way that doesn’t depend on the specific name of the nfs-server systemd unit?


Hm, this seems odd. Looking at the nfs-server.service file from nfs-utils, I see After=local-fs.target, which means it really shouldn’t try to start until after all FSes are mounted.

Oh, I see. It’s because of nofail. From man systemd.mount:

nofail

With nofail, this mount will be only wanted, not required, by local-fs.target or remote-fs.target. Moreover the mount unit is not ordered before these target units. This means that the boot will continue without waiting for the mount unit and regardless whether the mount point can be mounted successfully.

So nfs-server.service having After=local-fs.target does no good here, because of nofail. You can either remove nofail, or add x-systemd.before=local-fs.target to the options for the FS.
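
Concretely, the second option would look something like this (a sketch based on your fileSystems block above):

  fileSystems."/storage" = {
    device = "/dev/disk/by-uuid/xxxx";
    fsType = "btrfs";
    options = [
      "subvol=@storage"
      "noatime"
      "nofail"
      "compress=zstd"
      # Make local-fs.target wait for this mount to either succeed
      # or fail, despite nofail.
      "x-systemd.before=local-fs.target"
    ];
  };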


Ahhh - that makes some sense…

I added “nofail” back when that filesystem was on a previous machine. On one occasion the filesystem failed to mount and it halted the boot sequence: SSH didn’t start and I had to faff about getting access again (find a monitor and keyboard!!). Having “nofail” allows the system to boot even if that filesystem is broken, letting me SSH in and fix it.

Ideally, I’d want NFS and some other services to wait for this filesystem to be available… but other services like Tailscale and ssh not to wait…

If I add x-systemd.before=local-fs.target to the options for the FS - do you know what will happen if the file system fails to mount?

I’m fairly sure the boot will still continue even with x-systemd.before=local-fs.target: nofail removes the Requires dependency on the FS from local-fs.target, so when the FS fails to mount, local-fs.target will still succeed. The option does, however, make local-fs.target wait for the FS to either mount or fail to mount.


Awesome, thank you!

I’ll try this tomorrow.

Using x-systemd.before=local-fs.target as an additional mount option worked perfectly. I also note this from the systemd.mount man page:

x-systemd.before=, x-systemd.after=

In the created mount unit, configures a Before= or After= dependency on another systemd unit, such as a mount unit. The argument should be a unit name or an absolute path to a mount point. This option may be specified more than once. This option is particularly useful for mount point declarations with nofail option that are mounted asynchronously but need to be mounted before or after some unit start, for example, before local-fs.target unit. See Before= and After= in systemd.unit(5) for details.


Yes, but in this case you’re (potentially) back to other dependency problems if the mount ever fails: other services will not start, because they are waiting for local-fs.target.

It would be better to be more specific, much the same way as you did with your original solution, but coming from the other side: use x-systemd.before=, but pointed at the nfs-server service instead.
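
Something like this (a sketch, and note it still assumes the unit really is called nfs-server.service):

  fileSystems."/storage".options = [
    "subvol=@storage"
    "noatime"
    "nofail"
    "compress=zstd"
    # Order only the NFS server after this mount; local-fs.target
    # (and everything behind it) is left alone.
    "x-systemd.before=nfs-server.service"
  ];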

That shouldn’t be the case. Once the mount fails, it will no longer block local-fs.target, and the boot will proceed.


Hm,
I think on balance I’m going to remove the nofail mount option and revert to the default “block services from starting” behaviour for this mount.
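
For the mount itself that just means dropping nofail from the options, e.g.:

  fileSystems."/storage" = {
    device = "/dev/disk/by-uuid/xxxx";
    fsType = "btrfs";
    # Without nofail, a failed mount blocks local-fs.target again, so
    # dependent services (including NFS) won't start against an
    # unmounted directory.
    options = [ "subvol=@storage" "noatime" "compress=zstd" ];
  };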

The exception I need is probably best applied to the SSH service, not to the mount point or services that need the mount point (as that’s most of them).
