Hey all. I’ve migrated a host from a Ubuntu system configuration into a NixOS configuration, and along the way I noticed that a typical behavior in a Ubuntu system was missing – my software raid devices weren’t being scrubbed on a regular monthly basis to check for errors.
I created a NixOS module to address this (NixOS module to add mdcheck capabilities, which automatically scrubs swraid md devices on a fixed schedule; eventually I'd like to upstream this to nixpkgs but just shoved it here for now... · GitHub ), but before I made an attempt to put this into a nixpkgs PR I had a few general questions that I thought would be better discussed here.
I’ve named the configuration added hardware.raid.swraid.monitor
– is this a good naming scheme?
enable
defaults to false – this maintains compatibility with existing NixOS systems, but, has some risks of being an unsafe configuration since regular raid scrubbing is a somewhat standard safety practice. Should this be false
?
I removed some existing systemd units that are packaged by the mdadm
package, but aren’t usable because they don’t subst in the right mdcheck/mdadm paths, and instead recreated the units via systemd.timers
and systemd.services
to allow configurability. Is this a good direction, or, would it be better to use the upstream unit files and tweak them?
This doesn’t run the mdmonitor
systemd service which has a couple of functionalities; (a) it emails or runs a program on a device event, (b) it swaps spare devices between raid arrays if appropriate on the event of a failure. I didn’t want to include it because getting the functionality (b) requires enabling the functionality (a), which would require explicit configuration for “what to do”. Is leaving this out pretty reasonable? It could be enabled pretty easily by another module.
Aside from making the mdadm package changes by an overlay, is the rest of this somewhat appropriate to include in a nixpkgs PR? Any thoughts, concerns?
2 Likes
It’s probably better to just open the PR and allow for this discussion there.
1 Like
When you do open the PR, please ping me in there (same id as here) as I have a local module that might be of interest.
Alright, I’ve opened a PR and can take the discussion over there. Thanks.
NixOS:master
← mfenniak:mdcheck
opened 12:27AM - 06 Dec 22 UTC
###### Description of changes
Adds new options `hardware.raid.swraid.monitor … ` which, when enabled, ports a systemd timer from the mdadm package to NixOS that runs regularly scheduled scrubs on swraid md* devices.
To support the new NixOS module, this PR adjusts the mdadm package to: (a) package the mdcheck script, (b) remove the upstream systemd units (mdstart_*, mdmonitor*) that are not functional in NixOS.
Questions/uncertainties with this (raised in https://discourse.nixos.org/t/nixos-mdadm-mdcheck/23716):
- I’ve named the configuration added hardware.raid.swraid.monitor – is this a good naming scheme?
- `enable` defaults to false – this maintains compatibility with existing NixOS systems, but, has some risks of being an unsafe configuration since regular raid scrubbing is a somewhat standard safety practice. Should this be true, false, or maybe true-if-stateVersion>="23.05" so that it becomes the default on new systems?
- I removed some existing systemd units that are packaged by the mdadm package, but aren’t usable because they don’t subst in the right mdcheck/mdadm paths, and instead recreated the units via systemd.timers and systemd.services to allow configurability. Is this a good direction, or, would it be better to use the upstream unit files and tweak them?
- This doesn’t run the mdmonitor systemd service which has a couple of functionalities; (a) it emails or runs a program on a device event, (b) it swaps spare devices between raid arrays if appropriate on the event of a failure. I didn’t want to include it because getting the functionality (b) requires enabling the functionality (a), which would require explicit configuration for “what to do”. Is leaving this out pretty reasonable? It could be enabled pretty easily by another module, or enhanced in the future.
###### Things done
- Built on platform(s)
- [x] x86_64-linux
- [ ] aarch64-linux
- [ ] x86_64-darwin
- [ ] aarch64-darwin
- [ ] For non-Linux: Is `sandbox = true` set in `nix.conf`? (See [Nix manual](https://nixos.org/manual/nix/stable/command-ref/conf-file.html))
- [x] Tested, as applicable:
- [NixOS test(s)](https://nixos.org/manual/nixos/unstable/index.html#sec-nixos-tests) (look inside [nixos/tests](https://github.com/NixOS/nixpkgs/blob/master/nixos/tests))
- and/or [package tests](https://nixos.org/manual/nixpkgs/unstable/#sec-package-tests)
- or, for functions and "core" functionality, tests in [lib/tests](https://github.com/NixOS/nixpkgs/blob/master/lib/tests) or [pkgs/test](https://github.com/NixOS/nixpkgs/blob/master/pkgs/test)
- made sure NixOS tests are [linked](https://nixos.org/manual/nixpkgs/unstable/#ssec-nixos-tests-linking) to the relevant packages
- [ ] Tested compilation of all packages that depend on this change using `nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"`. Note: all changes have to be committed, also see [nixpkgs-review usage](https://github.com/Mic92/nixpkgs-review#usage)
- [x] Tested basic functionality of all binary files (usually in `./result/bin/`)
- [23.05 Release Notes (or backporting 22.11 Release notes)](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#generating-2305-release-notes)
- [ ] (Package updates) Added a release notes entry if the change is major or breaking
- [ ] (Module updates) Added a release notes entry if the change is significant
- [ ] (Module addition) Added a release notes entry if adding a new NixOS module
- [ ] (Release notes changes) Ran `nixos/doc/manual/md-to-db.sh` to update generated release notes
- [x] Fits [CONTRIBUTING.md](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md).
<!--
To help with the large amounts of pull requests, we would appreciate your
reviews of other pull requests, especially simple package updates. Just leave a
comment describing what you have tested in the relevant package/service.
Reviewing helps to reduce the average time-to-merge for everyone.
Thanks a lot if you do!
List of open PRs: https://github.com/NixOS/nixpkgs/pulls
Reviewing guidelines: https://nixos.org/manual/nixpkgs/unstable/#chap-reviewing-contributions
-->
1 Like