Feedback and advice on setting up MergerFS + SnapRAID in NixOS

Hi! I apologize in advance for the length; I wanted to provide what I felt was adequate context. I’m relatively new to NixOS.

In preparation for revamping my bare-metal home server, I’ve been building out a few config files in a VM. I think I’m in the final stages and am ironing out the file that will handle part of my storage configuration.

I’ll probably add it as storage.nix, since hardware-configuration.nix says it shouldn’t be modified. However, for now, it’s all in hardware-configuration.nix.
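For readers following along, splitting the hand-written parts into their own file could look roughly like the sketch below. It assumes storage.nix sits next to configuration.nix in /etc/nixos; the file name comes from the post, the rest is illustrative.

```nix
# /etc/nixos/configuration.nix (excerpt)
{ config, lib, pkgs, ... }:

{
  imports = [
    ./hardware-configuration.nix  # generated by nixos-generate-config; left untouched
    ./storage.nix                 # hand-written MergerFS + SnapRAID settings (assumed location)
  ];
}
```

Keeping the generated file untouched means a later `nixos-generate-config` run cannot overwrite the storage settings.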

For context, I have 5 drives, 3 of which will be part of my mass storage, and relevant to this post:

  • 1 × WD 18 TB HDD, for parity
  • 2 × Seagate 14 TB HDD, for data
  • The other 2 are a couple NVMe SSDs—1 for boot, 1 for root & home

I hope to have everything encrypted with LUKS, but am open to not doing so if it’ll be a bad idea. I just know I might have sensitive data (like photos) on the HDDs.
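As a rough, untested sketch of what LUKS on one of the data HDDs might look like: the mapper name `crypt-data01` and the UUID placeholder are hypothetical and not taken from the configs in this post.

```nix
{
  # Hypothetical mapper name; the UUID is a placeholder for the LUKS container's UUID.
  boot.initrd.luks.devices."crypt-data01".device =
    "/dev/disk/by-uuid/LUKS-CONTAINER-UUID";

  # The file system then lives on the unlocked mapper device, not on the raw disk.
  fileSystems."/mnt/disks/disk01" = {
    device = "/dev/mapper/crypt-data01";
    fsType = "xfs";
  };
}
```

Devices listed under `boot.initrd.luks.devices` are unlocked during early boot, so expect a passphrase prompt there unless a key file is configured.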

Below, you’ll see my config, as it exists right now. Everything before ## Drives and after ".AppleDB" was generated by the NixOS install in the VM. Everything in-between was written by me.

Below that, you’ll see what I think the config might look like when all’s said and done.

hardware-configuration.nix as it currently exists:

# Do not modify this file!  It was generated by ‘nixos-generate-config’
# and may be overwritten by future invocations.  Please make changes
# to /etc/nixos/configuration.nix instead.
{ config, lib, pkgs, modulesPath, ... }:

{
  imports =
    [ (modulesPath + "/profiles/qemu-guest.nix")
    ];

  boot.initrd.availableKernelModules = [ "ahci" "xhci_pci" "virtio_pci" "sr_mod" "virtio_blk" ];
  boot.initrd.kernelModules = [ ];
  boot.kernelModules = [ "kvm-amd" ];
  boot.extraModulePackages = [ ];

  fileSystems."/" =
    { device = "/dev/disk/by-uuid/36ea268c-fbec-45b8-8a46-bd29d8666171";
      fsType = "ext4";
    };

  boot.initrd.luks.devices."luks-22dc2fb9-e4f6-4985-8456-de7fee20b938".device = "/dev/disk/by-uuid/22dc2fb9-e4f6-4985-8456-de7fee20b938";

  swapDevices =
    [ { device = "/dev/disk/by-uuid/d7c62d37-fd47-4339-88a7-1d9ef5fd377b"; }
    ];

  ## Drives
  ############
  # NVMe drive(s) for Boot (Hope to pool in the future, but probably won't.)
  ## nvme-Kingston_SSD_KC3000_1TB_SERIAL
  #
  # NVMe drive(s) for Root and Home (Hope to pool in the future, but probably won't.)
  ## nvme-Kingston_SSD_KC3000_2TB_SERIAL
  #
  # Parity
  ## Parity01 - ata-WDBAMA0180HBK-NESN-XA_SERIAL (DCM: WGBPZCM)
  #
  # Data pool
  ## Disk01 - ata-STKP14000400-3EGAP6-571_SERIAL (DOM: 10/2023)
  ## Disk02 - ata-STKP14000400-3EGAP6-570_SERIAL (DOM: 03/2023)

  # Media storage disks, pooled by MergerFS
  fileSystems."/mnt/jbod" =
    { device = "/mnt/disks/disk*";
      fsType = "mergerfs";
      options = ["defaults" "minfreespace=250G" "fsname=mergerfs-jbod"];
    };

  fileSystems."/mnt/disks/disk01" =
    { device = "/dev/disk/by-id/ata-STKP14000400-3EGAP6-571_SERIAL";
      fsType = "xfs";
    };

  fileSystems."/mnt/disks/disk02" =
    { device = "/dev/disk/by-id/ata-STKP14000400-3EGAP6-570_SERIAL";
      fsType = "xfs";
    };

  # SnapRAID
  services.snapraid = {
    enable = true;
    #extraConfig = ''
    #  nohidden
    #  blocksize 256
    #  hashsize 16
    #  autosave 500
    #  pool /pool
    #'';
    parityFiles = [
      # Defines the file(s) to use as parity storage
      # It must NOT be in a data disk
      # Format: "FILE_PATH"
      "/mnt/disks/parity01/snapraid.parity"
    ];
    contentFiles = [
      # Defines the files to use as content list
      # You can use multiple specification to store more copies
      # You must have at least one copy for each parity file plus one. Some more don't
      # hurt
      # They can be in the disks used for data, parity or boot,
      # but each file must be in a different disk
      # Format: "FILE_PATH"
      "/var/snapraid.content"
      "/mnt/disks/parity01/.snapraid.content"
      "/mnt/disks/disk01/.snapraid.content"
      "/mnt/disks/disk02/.snapraid.content"
    ];
    dataDisks = {
      # Defines the data disks to use
      # The order is relevant for parity, do not change it
      # Format: "DISK_NAME DISK_MOUNT_POINT"
      d01 = "/mnt/disks/disk01/";
      d02 = "/mnt/disks/disk02/";
    };
    #touchBeforeSync = true; # Whether `snapraid touch` should be run before `snapraid sync`. Default: true.
    sync.interval = "03:00";
    scrub.interval = "weekly";
    #scrub.plan = 8; # Percent of the array that should be checked by `snapraid scrub`. Default: 8.
    #scrub.olderThan = 10; # Number of days since data was last scrubbed before it can be scrubbed again. Default: 10
    exclude = [
      # Defines files and directories to exclude
      # Remember that all the paths are relative at the mount points
      # Format: "FILE"
      # Format: "DIR/"
      # Format: "/PATH/FILE"
      # Format: "/PATH/DIR/"
      "*.unrecoverable"
      "/tmp/"
      "/lost+found/"
      "*.!sync"
      ".AppleDouble"
      "._AppleDouble"
      ".DS_Store"
      "._.DS_Store"
      ".Thumbs.db"
      ".fseventsd"
      ".Spotlight-V100"
      ".TemporaryItems"
      ".Trashes"
      ".AppleDB"
    ];
  };

  # Enables DHCP on each ethernet and wireless interface. In case of scripted networking
  # (the default) this is the recommended approach. When using systemd-networkd it's
  # still possible to use this option, but it's recommended to use it in conjunction
  # with explicit per-interface declarations with `networking.interfaces.<interface>.useDHCP`.
  networking.useDHCP = lib.mkDefault true;
  # networking.interfaces.enp1s0.useDHCP = lib.mkDefault true;

  nixpkgs.hostPlatform = lib.mkDefault "x86_64-linux";
}

What I think it might look like in the end. Again, the parts not auto-generated will be in their own config file.

# Do not modify this file!  It was generated by ‘nixos-generate-config’
# and may be overwritten by future invocations.  Please make changes
# to /etc/nixos/configuration.nix instead.
{ config, lib, pkgs, modulesPath, ... }:

{
  # Removed `imports` and `boot.` for relevance / brevity.

  fileSystems."/" =
    { device = "/dev/disk/by-uuid/UUID";
      fsType = "btrfs"; # Because I can't be bothered with ZFS right now.
    };

  fileSystems."/home" =
    { device = "/dev/disk/by-uuid/UUID";
      fsType = "btrfs"; # Because I can't be bothered with ZFS right now.
    };

  fileSystems."/boot" =
    { device = "/dev/disk/by-uuid/2AD5-541F";
      fsType = "vfat";
    };

  boot.initrd.luks.devices."luks-UUID".device = "/dev/disk/by-uuid/UUID";

  # Need to figure out how to set swappiness.
  swapDevices = [ {
    device = "/var/lib/swapfile";
    size = 16*1024;
  } ];

  ## Drives
  ############
  # NVMe drive(s) for Boot (Hope to pool in the future, but probably won't.)
  ## nvme-Kingston_SSD_KC3000_1TB_SERIAL
  #
  # NVMe drive(s) for Root and Home (Hope to pool in the future, but probably won't.)
  ## nvme-Kingston_SSD_KC3000_2TB_SERIAL
  #
  # Parity
  ## Parity01 - ata-WDBAMA0180HBK-NESN-XA_SERIAL (DCM: WGBPZCM)
  #
  # Data pool
  ## Disk01 - ata-STKP14000400-3EGAP6-571_SERIAL (DOM: 10/2023)
  ## Disk02 - ata-STKP14000400-3EGAP6-570_SERIAL (DOM: 03/2023)

  # Media storage disks, pooled by MergerFS
  fileSystems."/mnt/jbod" =
    { device = "/mnt/disks/disk*";
      fsType = "mergerfs";
      options = ["defaults" "minfreespace=250G" "fsname=mergerfs-jbod"];
    };

  fileSystems."/mnt/disks/disk01" =
    { device = "/dev/disk/by-id/ata-STKP14000400-3EGAP6-571_SERIAL";
      fsType = "xfs";
    };

  fileSystems."/mnt/disks/disk02" =
    { device = "/dev/disk/by-id/ata-STKP14000400-3EGAP6-570_SERIAL";
      fsType = "xfs";
    };

  # SnapRAID
  services.snapraid = {
    enable = true;
    #extraConfig = ''
    #  nohidden
    #  blocksize 256
    #  hashsize 16
    #  autosave 500
    #  pool /pool
    #'';
    parityFiles = [
      # Defines the file(s) to use as parity storage
      # It must NOT be in a data disk
      # Format: "FILE_PATH"
      "/mnt/disks/parity01/snapraid.parity"
    ];
    contentFiles = [
      # Defines the files to use as content list
      # You can use multiple specification to store more copies
      # You must have at least one copy for each parity file plus one. Some more don't
      # hurt
      # They can be in the disks used for data, parity or boot,
      # but each file must be in a different disk
      # Format: "FILE_PATH"
      "/var/snapraid.content"
      "/mnt/disks/parity01/.snapraid.content"
      "/mnt/disks/disk01/.snapraid.content"
      "/mnt/disks/disk02/.snapraid.content"
    ];
    dataDisks = {
      # Defines the data disks to use
      # The order is relevant for parity, do not change it
      # Format: "DISK_NAME DISK_MOUNT_POINT"
      d01 = "/mnt/disks/disk01/";
      d02 = "/mnt/disks/disk02/";
    };
    #touchBeforeSync = true; # Whether `snapraid touch` should be run before `snapraid sync`. Default: true.
    sync.interval = "03:00";
    scrub.interval = "weekly";
    #scrub.plan = 8; # Percent of the array that should be checked by `snapraid scrub`. Default: 8.
    #scrub.olderThan = 10; # Number of days since data was last scrubbed before it can be scrubbed again. Default: 10
    exclude = [
      # Defines files and directories to exclude
      # Remember that all the paths are relative at the mount points
      # Format: "FILE"
      # Format: "DIR/"
      # Format: "/PATH/FILE"
      # Format: "/PATH/DIR/"
      "*.unrecoverable"
      "/tmp/"
      "/lost+found/"
      "*.!sync"
      ".AppleDouble"
      "._AppleDouble"
      ".DS_Store"
      "._.DS_Store"
      ".Thumbs.db"
      ".fseventsd"
      ".Spotlight-V100"
      ".TemporaryItems"
      ".Trashes"
      ".AppleDB"
    ];
  };

  # Enables DHCP on each ethernet and wireless interface. In case of scripted networking
  # (the default) this is the recommended approach. When using systemd-networkd it's
  # still possible to use this option, but it's recommended to use it in conjunction
  # with explicit per-interface declarations with `networking.interfaces.<interface>.useDHCP`.
  networking.useDHCP = lib.mkDefault true;
  # networking.interfaces.enp1s0.useDHCP = lib.mkDefault true;

  nixpkgs.hostPlatform = lib.mkDefault "x86_64-linux";
}

Any feedback you can provide on my config may be helpful. I feel like I’m almost at the finish line, where prep is concerned, but the configuration of my filesystem(s) feels like one of the last major hurdles before I can install and move on to declaring Home Manager and Docker/Podman.

Some resources I used:

  1. Parts of ironicbadger’s Nix config(s) - GitHub
  2. Combining Different Sized Drives with mergerfs and SnapRAID - Self Hosted Home
  3. snapraid - NixOS options
  4. Do we really need swap on modern systems? - Hacker News

This is rather tangential, but regarding swap, a relatively small amount is usually a good idea; see “In defence of swap: common misconceptions”.


Funny you link this. I saw it when I was looking into whether or not I should add Swap.

I ultimately decided to add it, just in case the system needs it for something. It’s highly unlikely I’ll use up all the space on my NVMe drive, so I can spare a few GB if it’ll help my system remain responsive.


Your mergerfs file system probably needs its options to contain x-systemd.requires-mounts-for=/mnt/disks/disk01 for each of the underlying file systems that it relies on. Otherwise it may attempt to mount before those have been.
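A minimal, untested sketch of that suggestion, reusing the mount points from the original post. `x-systemd.requires-mounts-for=` is a systemd mount option and can be given once per path:

```nix
{
  fileSystems."/mnt/jbod" = {
    device = "/mnt/disks/disk*";
    fsType = "mergerfs";
    options = [
      "defaults"
      "minfreespace=250G"
      "fsname=mergerfs-jbod"
      # One entry per underlying file system that must be mounted first.
      "x-systemd.requires-mounts-for=/mnt/disks/disk01"
      "x-systemd.requires-mounts-for=/mnt/disks/disk02"
    ];
  };
}
```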


Ah! That’s a good point! I believe that’s addressed in the Mount order section of the Filesystem page of the NixOS wiki.

Ah yeah, I forgot for a second that we have fileSystems.<name>.depends.


Hahaha! It happens. I updated it to:

  # Media storage disks, pooled by MergerFS
  fileSystems."/mnt/jbod" =
    { depends = [
      # The `disk*` file systems must be mounted before the pool.
      "/mnt/disks/disk01"
      "/mnt/disks/disk02"
      ];
      device = "/mnt/disks/disk*";
      fsType = "mergerfs";
      options = ["defaults" "minfreespace=250G" "fsname=mergerfs-jbod"];
    };

  fileSystems."/mnt/disks/disk01" =
    { device = "/dev/disk/by-id/ata-STKP14000400-3EGAP6-571_SERIAL";
      fsType = "xfs";
    };

  fileSystems."/mnt/disks/disk02" =
    { device = "/dev/disk/by-id/ata-STKP14000400-3EGAP6-570_SERIAL";
      fsType = "xfs";
    };

I like the depends tip! Thanks for sharing that.

Overall this looks like a solid config. Do you have any specific questions about anything here or were you just looking for a generalised feedback loop?

No problem! Glad I could return some value.

Thanks! For the most part, I just wanted some eyes on it, and general feedback, but I do also have a few questions:

  1. Would I add, and build, the MergerFS + SnapRAID configs immediately after first boot, post-NixOS install, or is there anything special I need to do before building?
  2. In the config, where by-id is used, how does one get or set the ID? I think getting the UUID is easy enough, but I’m not familiar with the ID.
  3. In the case of /boot’s by-uuid, is 2AD5-541F just a shortened version of the UUID, the first & last 4 digits? If so, could I use a shortened version for all devices?
  4. Do you know how to declare swappiness, i.e. how readily, and under which circumstances, swap is used?

I think that’s all my questions for now.
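On question 4, one common way to set swappiness declaratively is through the kernel sysctl option. A small sketch follows; the value 10 is only an illustrative choice, not a recommendation from this thread.

```nix
{
  # The kernel default is 60; lower values make the kernel less eager to swap.
  # 10 is an arbitrary example value.
  boot.kernel.sysctl."vm.swappiness" = 10;
}
```

After a rebuild, the active value can be checked with `cat /proc/sys/vm/swappiness`.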

My system is now live, so I figured I’d share the config I ended up using, called disks.nix, which is imported through configuration.nix. Also, marking this solved. Big thanks to everyone who helped me out.

{ config, lib, pkgs, modulesPath, ... }:

{

  # Media storage disks, pooled by MergerFS
  fileSystems."/mnt/disks/data01" =
    { device = "/dev/disk/by-label/data01";
      fsType = "xfs";
    };

  fileSystems."/mnt/disks/data02" =
    { device = "/dev/disk/by-label/data02";
      fsType = "xfs";
    };
    
  fileSystems."/mnt/jbod" =
    { depends = [
      # The `data*` file systems must be mounted before the pool.
      "/mnt/disks/data01"
      "/mnt/disks/data02"
      ];
      device = "/mnt/disks/data*";
      fsType = "mergerfs";
      options = ["defaults" "minfreespace=250G" "fsname=mergerfs-jbod"];
    };
    
  fileSystems."/mnt/disks/parity01" =
    { depends = [
      # The parity disk is mounted after the MergerFS pool.
      "/mnt/jbod"
      ];
      device = "/dev/disk/by-label/parity01";
      fsType = "xfs";
    };  

  # SnapRAID
  services.snapraid = {
    enable = true;
    #extraConfig = ''
    #  nohidden
    #  blocksize 256
    #  hashsize 16
    #  autosave 500
    #  pool /pool
    #'';
    parityFiles = [
      # Defines the file(s) to use as parity storage
      # It must NOT be in a data disk
      # Format: "FILE_PATH"
      "/mnt/disks/parity01/snapraid.parity"
    ];
    contentFiles = [
      # Defines the files to use as content list.
      # You can use multiple specification to store more copies.
      # You must have at least one copy for each parity file plus one. Some more don't hurt.
      # They can be in the disks used for data, parity or boot,
      # but each file must be in a different disk.
      # Format: "FILE_PATH"
      "/var/snapraid.content"
      "/mnt/disks/parity01/.snapraid.content"
      "/mnt/disks/data01/.snapraid.content"
      "/mnt/disks/data02/.snapraid.content"
    ];
    dataDisks = {
      # Defines the data disks to use
      # The order is relevant for parity, do not change it
      # Format: "DISK_NAME DISK_MOUNT_POINT"
      d01 = "/mnt/disks/data01/";
      d02 = "/mnt/disks/data02/";
    };
    #touchBeforeSync = true; # Whether `snapraid touch` should be run before `snapraid sync`. Default: true.
    sync.interval = "03:00";
    scrub.interval = "weekly";
    #scrub.plan = 8; # Percent of the array that should be checked by `snapraid scrub`. Default: 8.
    #scrub.olderThan = 10; # Number of days since data was last scrubbed before it can be scrubbed again. Default: 10
    exclude = [
      # Defines files and directories to exclude
      # Remember that all the paths are relative at the mount points
      # Format: "FILE"
      # Format: "DIR/"
      # Format: "/PATH/FILE"
      # Format: "/PATH/DIR/"
      "*.unrecoverable"
      "/tmp/"
      "/lost+found/"
      "*.!sync"
      ".AppleDouble"
      "._AppleDouble"
      ".DS_Store"
      "._.DS_Store"
      ".Thumbs.db"
      ".fseventsd"
      ".Spotlight-V100"
      ".TemporaryItems"
      ".Trashes"
      ".AppleDB"
    ];
  };
}

I may have spoken too soon. Everything is mounted as expected, but I don’t have permission to access the MergerFS volume defined below. I only just tried to copy files to it.

fileSystems."/mnt/jbod" =
    { depends = [
      # The `data*` file systems must be mounted before the pool.
      "/mnt/disks/data01"
      "/mnt/disks/data02"
      ];
      device = "/mnt/disks/data*";
      fsType = "mergerfs";
      options = ["defaults" "minfreespace=250G" "fsname=mergerfs-jbod"];
    };

When I check the perms, it says the Owner is the System administrator, and the Group is root.

I’m not certain what I did wrong, or what the correct way is to go about fixing it and getting those permissions. My immediate thought is to use chown, but that seems incorrect, as if it might cause problems or be impermanent.

I saw something that mentioned I should have ownership of the underlying disks, which are data01 and data02. Do I, perhaps, need to mount them elsewhere? Like at /run/media/<user>/data01, for example?

I tried changing the mount point to inside my user’s directory, but that didn’t help:

fileSystems."/home/<user>/JBOD" =
    { depends = [
      # The `data*` file systems must be mounted before the pool.
      "/mnt/disks/data01"
      "/mnt/disks/data02"
      ];
      device = "/mnt/disks/data*";
      fsType = "mergerfs";
      options = ["defaults" "minfreespace=250G" "fsname=mergerfs-jbod-in-home" "allow_other" "use_ino"];
    };

Problem solved, using chown <user>:users /mnt/disks/data* && chmod 0760 /mnt/disks/data*. The user needs to have ownership of the disks that make up the pool.
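Ownership and mode of each disk's root directory are stored on the XFS file system itself, so a one-off chown should survive reboots. For anyone who would rather declare it in the config, a possible sketch using systemd-tmpfiles is below. It is untested here, and `alice` is a placeholder user name, not one from this thread.

```nix
{
  # "alice" is a placeholder; substitute the real account.
  # Type `z` adjusts mode and ownership of an existing path without creating it.
  # Mode 0760 mirrors the chmod above; group members would also need the
  # execute bit (e.g. 0770) to enter the directories.
  systemd.tmpfiles.rules = [
    "z /mnt/disks/data01 0760 alice users -"
    "z /mnt/disks/data02 0760 alice users -"
  ];
}
```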


Was about to reply with exactly your solution. Nice!