Resolv.conf different on two same-provider vps with same settings

I have two almost identical VPS at the same provider, with exactly the same networking settings.

  networking = {
    hostName = "myhostA"; # myhostB on the other one (just an example)
    enableIPv6 = true;
    useDHCP = false;
    interfaces.ens3.useDHCP = true;
    wireless.enable = false;
    ...
  };

On “hostA”, however (my mail server), addresses under *.mail.protection.outlook.com resolve differently than on all other hosts I tried (including the other VPS).

Their respective /etc/resolv.conf files are completely different (the broken one from hostA first, then the working one):

# Generated by resolvconf
domain quicksrv.de
nameserver 127.0.0.1
options edns0

# Generated by resolvconf
domain luckysrv.de
nameserver 46.38.225.230
nameserver 46.38.252.230
options edns0

Observe the missing “real” nameserver fields for hostA. This host can still resolve most/many domains correctly, but apparently not the one my mail server needs in this case, leading to errors like this (hostnames slightly redacted):

... postfix/smtp[8704]: warning: no MX host for yyy.zzz has a valid address record
... postfix/smtp[8704]: B0967C00E0: to=<xxx@yyy.zzz>, relay=none, delay=9.2, delays=0.34/0.05/8.8/0, dsn=4.4.3, status=deferred (Host or domain name not found. Name service error for name=yyy.zzz.mail.protection.outlook.com type=A: Host not found, try again)

So the first question is: why does the same network setup on the same VPS provider not yield the same (or at least valid) nameservers from a DHCP query, and how can I debug this better (NixOS-specific)?

As you enabled DHCP, you have to check the logs of the DHCP client. My guess is that it's different between the hosts, for whatever reason.

Ah, of course, should’ve thought of that.

But there is not much of a difference between the two logs, only that the “half-broken” one shows a few more Router Advertisement entries:

Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: Router Advertisement from fe80::4
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: no global addresses for default route
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: Router Advertisement from fe80::1
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: no global addresses for default route
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: Router Advertisement from fe80::7
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: no global addresses for default route
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: Router Advertisement from fe80::3
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: no global addresses for default route
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: Router Advertisement from fe80::6
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: no global addresses for default route
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: Router Advertisement from fe80::5
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: no global addresses for default route
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: Router Advertisement from fe80::2
Apr 13 23:03:40 1nnovps1 dhcpcd[7474]: ens3: no global addresses for default route

where the ok one only has one

Apr 13 22:08:51 raincldvps1 dhcpcd[1299510]: ens3: Router Advertisement from fe80::1
Apr 13 22:08:51 raincldvps1 dhcpcd[1299510]: ens3: no global addresses for default route

Further, there is one (more suspicious) difference, where the first log says:

Apr 13 23:03:44 1nnovps1 dhcpcd[7474]: ens3: ignoring offer of www.xxx.yyy.zzz from 46.38.225.234

but it still goes on to lease the correct IP address, and the rest of the log looks the same as for the working one (although resolv.conf does not).

The dhcpcd.conf files (linked from the Nix store) are identical.

Ok, I found the culprit:

SNM (Simple NixOS Mailserver) defaults to mailserver.localDnsResolver = true, which causes services.resolvconf to be activated and pointed at a local DNS resolver, services.kresd (Knot Resolver). That is why resolv.conf on hostA lists only 127.0.0.1. This resolver apparently doesn’t always work correctly, causing the no MX host ... lookup errors for some domains.

Switching it off results in the error being gone and the resolv.conf looking correct (i.e. with DHCP assigned name servers).
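
In configuration terms, the change is roughly the following. This is a minimal sketch, assuming the SNM module is already imported and all other mailserver.* settings stay as they are:

```nix
{
  # Turn off the local resolver (kresd) that Simple NixOS Mailserver
  # enables by default, so that /etc/resolv.conf keeps the
  # DHCP-assigned nameservers instead of 127.0.0.1.
  mailserver.localDnsResolver = false;
}
```

After a nixos-rebuild switch, /etc/resolv.conf on hostA should list the provider's nameservers again, as it does on the other VPS.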