My pull request to implement reload triggers has now landed on unstable which is a perfect opportunity to explain what I did there and why I think it improves NixOS.
For people interested in the entire switching process, documentation around it has also been provided in the manual.
If you maintain (or use) a module that you think would benefit from reloads, set
stopIfChanged=true if present and add
reloadTriggers= which will cause the unit to be reloaded instead of restarted.
To understand why the old behavior worked well but not perfectly it’s good to look at how it worked exactly. When switching systems, most units are restarted or reloaded when they changed. There were (and still are) a lot of options to tune this behavior to fit the specific needs of all units.
But how did the script figure out whether a unit changed? The implementation was simple and very effective. The script just resolved the symlink of the unit file (resulting in a store path) and compared the old and new values. If anything about the unit changed, it would get assigned a new store path and the values would differ. Is this a good solution to the problem? In my opinion it’s a great solution because it’s very simple and robust and I really love it. For restarts. When reloading units, the logic quickly falls apart.
So why does the logic doesn’t work when reloading units? When reloading a unit,
systemctl reload my-unit.service is executed which causes systemd to execute the command specified in
ExecReload=. One of the more common things to do here is to send a
SIGHUP to the main process which often causes it to reload its configuration file.
When changing anything about the unit and
reloadIfChanged=true is set, the unit is reloaded no matter why it changed. This is bad because all unit options only get applied when restarting the unit (at least for
.services units). This is already bad for adding sandboxing options like
ProtectSystem= but it gets even worse for package updates. When the store path of the package changes (and hence the store path of the unit), the unit gets reloaded and the old code still runs while being asked to reload the configuration using
ExecReload=. This is horrible for updates because people expect NixOS to do the right thing for package updates and will often not expect to have to restart units manually afterwards.
So what can we do about it? We could just drop the entire reload logic but that’s not a real solution because reloading is a really useful feature that is worth having. Some way is needed to differentiate between reloads and restarts. Luckily I added a proper (hopefully 100% compatible) systemd unit parser in my previous PR so the only thing left to do was to use what I had prepared. I replaced the logic to parse the unit files and compare the contents of the units instead of the unit paths themselves. With this method it was trivial to decide whether
X-Reload-Triggers= (a new option) changed in the unit or anything else. This also helps to prevent useless restarts from comments added to unit files and changed order of unit options (complains I never heard from anyone). The added complexity due to the new parser is justified by the proper reload logic in my opinion.
I took care of the priority between restarts and reloads. When a reload trigger changes and anything else about the unit, it’s restarted and not reloaded. There is also an option in the systemd module to propagate the reload triggers:
The only thing left to do (which I did not do because I’m not familiar with every module in NixOS ) is to remove
reloadIfChanged=true from all services and to add
reloadTriggers pointing to a list of (mostly) derivation that will cause the unit to reload instead of restarting.
I really enjoyed working on this feature and I plan to continue my efforts in this area by implementing restarts of
.socket units rather soon (and some automount stuff popped up as well…). I added a lot of test cases for the new behavior and also extended the documentation quite a bit. Since I’m not a regular manual writer, follow-up PRs are highly appreciated.
There are two more things to do: Migrate current modules from
reloadIfChanged and getting rid of the
reloadIfChanged option entirely because I don’t see any use case where it would never break anything. For the first task there is currently no coordinated effort. The second task is probably best to do after the 22.05 release so people with out-of-tree modules can have a soft switch to the new behavior.