Hi all,
I’ve been having some issues getting Docker containers to network “properly”. As an example, I’m in the process of transitioning a couple of my server machines over to NixOS from Ubuntu server; to start, I tested out the Pi-hole Docker container, duplicating the setup as closely as I could. This includes:
Ensuring the Docker version is the same (28)
Copying over the Compose file
Duplicating relevant DNS configuration
Putting them on a generic bridge network
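For reference, the NixOS side of this is essentially just the Docker module plus the firewall ports; a sketch (option names are the standard NixOS Docker/firewall options, and the port list assumes Pi-hole’s usual DNS/web UI ports):

```nix
{
  # Enable the Docker daemon via the NixOS module
  virtualisation.docker.enable = true;

  # Pi-hole’s usual ports: DNS over TCP/UDP plus the web UI
  networking.firewall.allowedTCPPorts = [ 53 80 ];
  networking.firewall.allowedUDPPorts = [ 53 ];
}
```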
As it stands right now, everything works correctly, and as far as I can tell from miscellaneous docker container inspect / docker network inspect commands, practically everything is identical. There’s one issue, though: the Ubuntu Pi-hole resolves a pi.hole query to the host machine’s IP address, while the NixOS Pi-hole resolves it to the container’s Docker network IP address. The only way to “fix” this that I’ve found is to set network_mode to “host”, which is not an ideal solution. In every other scenario where I have a containerized web server resolving IP addresses (e.g. a service discovery server used for a microservice setup), similar behavior arises, and “host” mode remains the only fix.
It’s consistently a NixOS issue; the exact same container/Compose setup works perfectly fine on Ubuntu or Arch, but breaks in that particular way on NixOS. I’ve spent a few hours troubleshooting, including attempted solutions like:
Disabling the firewall on NixOS entirely
Enabling IP forwarding with boot.kernel.sysctl."net.ipv4.ip_forward" = "1"; in configuration.nix
Adding --ip-forward=true to the Docker daemon flags via virtualisation.docker.extraOptions
Comparing firewall rules with iptables -L (which lists the filter chains, not routes); no notable differences.
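In configuration.nix terms, those attempts boiled down to roughly the following (a sketch; all three are standard NixOS options):

```nix
{
  # 1. Firewall fully off (it is enabled by default on NixOS)
  networking.firewall.enable = false;

  # 2. Kernel-level IPv4 forwarding
  boot.kernel.sysctl."net.ipv4.ip_forward" = "1";

  # 3. Forwarding flag passed straight to the Docker daemon
  virtualisation.docker.extraOptions = "--ip-forward=true";
}
```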
At this point I’m just banging my head against the wall hoping to brute force a solution that isn’t just enabling “host” mode. I’d love to hear if anyone has any experience with Docker on NixOS and has run into a similar problem. I’m happy to provide any needed info or logs.
Here’s some extra info which may be useful:
The behavior persists across multiple NixOS machines, and as mentioned, across multiple containers/images.
Regarding Pi-hole specifically, running dig pi.hole inside the container resolves to the machine address on Ubuntu, and to the loopback address on NixOS. Running it from outside the container resolves to the machine address and the Docker network address, respectively.
I’ve tested this across 4 machines: two with NixOS, one with Arch, and one with Ubuntu. On both NixOS machines it exhibits the exact same behavior with multiple containers, and the Arch/Ubuntu machines work just fine; all machines are using the exact same compose file and have the same network configuration, and they all have a fairly generic setup.
I’m sure it’s a network problem; the question is what about NixOS specifically (whether it’s the docker module or some other config) is causing the networking problem, as a “stock” setup works just fine on more traditional machines.
To be fair, other distros sometimes open up a lot on networking by default, which can be a security concern.
On NixOS, you’d likely need to enable/configure what you need explicitly.
It’s a very barebones setup. I’ve added/removed some networking changes to both machines in an attempt to troubleshoot, e.g. boot.kernel.sysctl."net.ipv4.ip_forward" = "1"; (an admittedly random attempt at a solution), but these settings are all that remain.
This is admittedly part of why I’m so confused, since I disabled the firewall entirely (I trust my router firewalls for now). Besides, even with the firewall enabled, allowing the Pi-Hole ports still causes the same issue.
At this point, the only configuration difference that I’ve personally interacted with is systemd-resolved on the Ubuntu machine, which it ships with by default.
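If it matters, NixOS can run systemd-resolved too, so that difference should be testable in isolation; I believe the relevant option is just:

```nix
{
  # systemd-resolved is off by default on NixOS; Ubuntu enables it out of the box
  services.resolved.enable = true;
}
```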
I took a closer look to compare to my config. It looks nearly identical, especially the compose file since mine is copied almost 1:1 from the Pi-hole Docker repo. The only real config difference as far as networking goes is the hardcoded DHCP/gateway config; I was hoping to leave as much of that workload on the router, as it’s worked well for the Ubuntu Pi-Hole; I also doubt it would affect the other containers I’ve had issues with, as the goal with those is to spin up a small microservice cluster on other networks where I don’t have control over DHCP.
It feels like a Docker network problem, but the facts that 1) the docker network inspect &lt;pihole network&gt; output is identical across all machines, and 2) they’re all using the exact same Compose file, seem to indicate that it’s not. Quite the baffling issue.
Also, make sure to remove all previous “test” configurations, and maybe reboot to make sure the changes apply correctly for NetworkManager.
I’m assuming your host has “extra” configs that conflict or cause issues, since both NixOS and docker-compose give you a reproducible environment/system (i.e. if you copy-paste all the config, it should work on any instance).
There was a reason I put myself in the sudo group instead of wheel… but admittedly that was ages ago and I don’t recall why. Regardless, on the other machine, I put myself in wheel generically as recommended by the NixOS install guide.
Yes, dnsmasq settings are persistent. The Compose file is copied directly from the Docker Pi-Hole repo; the only changes were removing the DHCP port mappings/config.
That’s the thing; to be doubly sure, I put NixOS on an entirely separate system with the most barebones setup imaginable. We’re talking the lines I showed above and a user’s config added to the generated NixOS config file. The fact that it’s given me the exact same error tells me the problem isn’t necessarily with my existing config, but a lack of config needed to override some module setting.
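For concreteness, the entire delta from the generated config on that test machine was on the order of this (the username is a placeholder):

```nix
{
  virtualisation.docker.enable = true;

  # Placeholder user; on this machine I just followed the install
  # guide and used the wheel group
  users.users.me = {
    isNormalUser = true;
    extraGroups = [ "wheel" "docker" ];
  };
}
```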
In a confusing (and somewhat embarrassing) revelation, I dumped my Pi-hole configs from all machines with Teleporter and ran through them with diff, only to find that I had set dns.reply.host to the host’s IP address on Ubuntu (I don’t recall doing this, but it’s not set by default, so I must have). I could have sworn I didn’t do that on the Arch machine and that dig pi.hole @&lt;Arch machine IP&gt; worked out of the box, but at this point I’m doubting everything.
I’m in the process of debugging the other containers where resolution was breaking. All seems to be working for now on bridge mode; something must have changed since we switched it over to host mode. Very confusing.
Just out of curiosity: in that Pi-hole config you sent, does it automatically resolve pi.hole to the host IP address without dns.reply.host set (or whatever the equivalent was before v6)?
I’m going to tentatively chalk this up as “a series of incredibly ironic and unfortunate coincidences all pointing to the objectively incorrect thing” and mark it resolved, for now.