I wonder if the ACME renew service should depend on nss-lookup.target or not.
I run Unbound as my local DNS resolver. During nixos-rebuild, the ACME renew services cannot resolve the Let's Encrypt hosts because the local DNS (Unbound) has not started yet.
Unbound's systemd unit declares that it should start before nss-lookup.target.
So the logical thing would be for the ACME renew services to run after nss-lookup.target.
I can fix this in my own configuration with some hacks, but I wonder if this shouldn't be the default in NixOS?
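For illustration, such a local workaround might look roughly like the fragment below. This is only a sketch: `acme-example.com` is a placeholder for the real unit name (the ACME units appear to be named `acme-<domain>`), and `after`/`wants` are the generic NixOS options for systemd ordering and dependencies, not anything ACME-specific.

```nix
{
  # Placeholder unit name; replace example.com with the certificate's domain.
  systemd.services."acme-example.com" = {
    # Order the renewal after name lookup is expected to be available.
    after = [ "nss-lookup.target" ];
    # Pull the target in so the ordering has something to apply to.
    wants = [ "nss-lookup.target" ];
  };
}
```

Note that `after` only orders units that are started in the same transaction, so this may not help in every situation.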
Chances are high you are one of very few users of this setup, and it’s awesome that you put your idea into the open instead of being satisfied with your hack. Why not submit a PR and include your reasoning? Seems like ideally it is just one line of code. I’ll be happy to review if you reference the PR in this thread. That would be yet another little thing that Just Works™, and your contribution will make NixOS and the world a better place.
However, this still doesn’t fly for me. In the journal at boot, I can clearly see that the ACME renewal is started after Unbound, but it still fails:
Aug 10 04:42:29 henrimenke.com systemd[1]: Started Unbound recursive Domain Name Server.
Aug 10 04:42:29 henrimenke.com systemd[1]: Reached target Host and Network Name Lookups.
Aug 10 04:42:29 henrimenke.com systemd[1]: Starting Renew ACME Certificate for henrimenke.com...
Aug 10 04:42:29 henrimenke.com yz6frbp9w9s150pq449sxva06iyk95fz-acme-start[790]: 2020/08/10 04:42:29 Could not create client: get directory at 'https://acme-v02.api.letsencrypt.org/directory': Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org: no such host
Aug 10 04:42:29 henrimenke.com yz6frbp9w9s150pq449sxva06iyk95fz-acme-start[790]: 2020/08/10 04:42:29 Could not create client: get directory at 'https://acme-v02.api.letsencrypt.org/directory': Get "https://acme-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-v02.api.letsencrypt.org: no such host
Aug 10 04:42:30 henrimenke.com systemd[1]: acme-henrimenke.com.service: Main process exited, code=exited, status=1/FAILURE
Aug 10 04:42:30 henrimenke.com systemd[1]: acme-henrimenke.com.service: Failed with result 'exit-code'.
Aug 10 04:42:30 henrimenke.com systemd[1]: Failed to start Renew ACME Certificate for henrimenke.com.
Unfortunately, that can’t be fixed outside of system boot. The way NixOS currently switches generations doesn’t allow enforcing service dependencies on targets that have already been reached. It is a limitation of systemctl. There was some talk about rewriting our activation script with that limitation in mind, but nobody has put much effort into it yet. @arianvp is one of those I discussed it with in #nixos-systemd (on Freenode).