I deploy my server with NixOps and I want to be notified about systemd service failures. Therefore I tried to extend every systemd service with onFailure. Currently I have something along the lines of
Just another idea: systemd publishes unit state changes over D-Bus. You could listen for these events and trigger something any time a service enters the failed state.
Unit files now support top level dropin directories of the form
<unit_type>.d/ (e.g. service.d/) that may be used to add configuration
that affects all corresponding unit files.
In the stockholm repository there is a module called krebs.on-failure, which essentially links a separate on-failure.plans.<service-name> to the service:
The essential piece of configuration can be found at:
However, it only attaches to explicitly marked services: krebs.on-failure.plans.snapraid-sync.name = "snapraid-sync"; [source]
The idea is that most of the time you know which services are important and where you definitely want to receive a mail once something dies.
Thank you for your suggestions. I learned about some cool stuff. Much appreciated.
For the moment I will just use my solution and explicitly specify the services for which I definitely want to receive notifications.
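For readers who land here, this is a minimal sketch of what explicitly listing services could look like in a NixOS module. It is not the configuration from the original post. The service names in `watched`, the `notify-failure@` template unit and the mail command are all hypothetical placeholders, and it assumes the machine can deliver local mail to root.

```nix
{ lib, pkgs, ... }:

let
  # Hypothetical list of units to watch; replace with your own services.
  watched = [ "nginx" "postgresql" ];

  # Hypothetical notifier: mails the status of the failed unit to root.
  # Assumes a working local mail setup; swap in any command you prefer.
  notify = pkgs.writeShellScript "notify-failure" ''
    unit="$1"
    ${pkgs.systemd}/bin/systemctl status "$unit" \
      | ${pkgs.mailutils}/bin/mail -s "systemd unit $unit failed" root
  '';
in
{
  systemd.services =
    # Attach OnFailure= to each explicitly listed service.
    lib.genAttrs watched (name: {
      onFailure = [ "notify-failure@${name}.service" ];
    })
    // {
      # Template unit; %i expands to the name of the failed service.
      "notify-failure@" = {
        description = "Failure notification for %i";
        serviceConfig = {
          Type = "oneshot";
          ExecStart = "${notify} %i";
        };
      };
    };
}
```

The names in `watched` should be services that are already defined on the machine, since the module system merges the `onFailure` setting into each existing definition. Because the notifier unit is not in the list itself, a failing notification cannot trigger another notification.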