Notification on systemd service failures

I deploy my server with NixOps and I want to be notified about systemd service failures. Therefore I tried to extend every systemd service with onFailure. Currently I have something along the lines of

{ config, lib, pkgs, ... }:

{
  config = lib.setAttrByPath [ "systemd" "services" ]
    (lib.genAttrs (lib.attrNames config.systemd.services)
      (serviceName: lib.setAttrByPath [ "serviceConfig" "onFailure" ]  [ "email@%n.service" ]));
}

However, that leads to infinite recursion. Any ideas?


Just another idea: systemd publishes state changes of its units over D-Bus. You could listen to these signals and trigger something whenever a service enters the failed state.

Nixpkgs ships the pystemd Python package, which makes this easy: https://github.com/facebookincubator/pystemd/blob/a51e19e1abe65498f2d74aa540c7016716c3e846/examples/monitor_all_units_from_signal.py

Or you could do it with busctl and some bash:

sudo busctl monitor org.freedesktop.systemd1 --json=short | jq 'select(.member=="PropertiesChanged")'

You then just need one service that monitors all of them.
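As a rough sketch of what that one monitoring service could run, here is a small Python filter over the same busctl output. The function names are made up for illustration, and the JSON layout is an assumption: I expect each PropertiesChanged message to carry its changed properties in payload.data[1], with values wrapped as {"type": ..., "data": ...}. Check that against real output on your machine before relying on it.

```python
import json
import subprocess
from typing import Iterable, Iterator, Optional


def failed_unit_path(message: dict) -> Optional[str]:
    """Return the D-Bus object path if this message reports ActiveState=failed."""
    if message.get("member") != "PropertiesChanged":
        return None
    data = (message.get("payload") or {}).get("data") or []
    # Assumed layout: [interface name, {property: {"type", "data"}}, invalidated]
    if len(data) < 2 or not isinstance(data[1], dict):
        return None
    state = data[1].get("ActiveState")
    if isinstance(state, dict) and state.get("data") == "failed":
        return message.get("path")
    return None


def failed_units(lines: Iterable[str]) -> Iterator[str]:
    """Yield unit object paths from a stream of one-JSON-object-per-line text."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a complete JSON object
        if not isinstance(message, dict):
            continue
        path = failed_unit_path(message)
        if path:
            yield path


def main() -> None:
    # Same command as above, without jq; needs enough privileges to monitor the bus.
    proc = subprocess.Popen(
        ["busctl", "monitor", "org.freedesktop.systemd1", "--json=short"],
        stdout=subprocess.PIPE,
        text=True,
    )
    assert proc.stdout is not None
    for path in failed_units(proc.stdout):
        # Replace this print with whatever notification you want to send.
        print(f"unit failed: {path}", flush=True)
```

Calling main() from the service's start script would print one line per failure. The path is the escaped D-Bus object path (something like /org/freedesktop/systemd1/unit/foo_2eservice), so it still has to be mapped back to a unit name.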

However, I'm also curious how we can change your NixOS config so it doesn't recurse infinitely. I'll have a better look later.