Highly respected community,
I’m facing a problem with provisioning a fully replicated system. The goal is to replicate a NixOS machine by simply transferring the config to a new system and running nixos-rebuild switch, but that’s not always possible, and sometimes manual intervention is needed. The problem occurs while obtaining certificates via ACME.
Sometimes (this is a known issue) obtaining a certificate fails due to a busy socket or a TCP dial timeout. If you restart the ACME service manually by running systemctl restart acme-$DOMAIN.service, it works just fine.
I’m wondering: is it possible to configure ACME so that Lego retries a few times before failing?
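One possible approach, sketched here as an assumption rather than a confirmed answer, is to let systemd do the retrying instead of Lego, by overriding the generated unit. The certificate name example.org is a placeholder for $DOMAIN, and the delay and limit values are arbitrary. If the ACME module already sets any of these keys, lib.mkForce may be needed.

```nix
{ ... }:
{
  # "example.org" is a placeholder; the unit is acme-$DOMAIN.service.
  systemd.services."acme-example.org" = {
    serviceConfig = {
      # Restart the unit when it exits with an error.
      # systemd permits on-failure for oneshot services since version 244.
      Restart = "on-failure";
      # Wait between attempts so a busy socket or slow DNS can recover.
      RestartSec = "30s";
    };
    unitConfig = {
      # Give up after 5 starts within 10 minutes (illustrative values).
      StartLimitIntervalSec = "10min";
      StartLimitBurst = 5;
    };
  };
}
```

It is probably wise to keep the retry count low, so that repeated failures do not run into the CA’s rate limits.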
Just out of curiosity, what is the known issue?
Sometimes, during initial setup, ACME fails to obtain the certificate due to a busy socket or a TCP lookup timeout. It’s not so much an issue as a peculiarity of the process.
That sounds like a race condition of some kind. Are you running out of some kernel-based resource, like file descriptors?
A copy of the logs might be of interest. Is this related to "ACME renewals fail due to DNS being unavailable during switch" (Issue #85794, NixOS/nixpkgs on GitHub)?
Sorry for the delayed response. I’m not able to reproduce this at the moment. I’ll free up some time from work and try to get the issue reproduced.
Regarding a possible thread race: I had the same suspicion back when I was using ACME to obtain 5 Let’s Encrypt LA certificates. I have since reconfigured ACME to obtain a single wildcard certificate via the DNS-01 challenge instead of HTTP-01, so a thread race is now unlikely: the connection between the server and the Let’s Encrypt infrastructure is established only once, and only to transfer the issued certs.
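For readers who want to try the same setup, a wildcard certificate over DNS-01 might be configured roughly as in the sketch below. This is not the actual config from this system: the domain, e-mail address, DNS provider and secrets path are all placeholders, and the option names are those of recent NixOS releases (older releases use credentialsFile instead of environmentFile).

```nix
{ ... }:
{
  security.acme = {
    acceptTerms = true;
    defaults.email = "admin@example.org"; # placeholder address

    certs."example.org" = {
      # One wildcard cert instead of several per-host certs.
      domain = "*.example.org";
      # Any DNS provider supported by Lego; "cloudflare" is only an assumption here.
      dnsProvider = "cloudflare";
      # File holding the provider's API credentials as environment variables
      # (hypothetical path; keep it out of the Nix store).
      environmentFile = "/var/lib/secrets/acme-dns.env";
    };
  };
}
```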