Docker network isolation not working properly

Hi folks,

I’ve been setting up several Docker Compose services, each in its own compose.yaml stack.

I just discovered that, for some reason, Docker doesn’t isolate containers that belong to different networks: from a container in the service 2 stack (on its own network) I can ping a container in the service 1 stack (on its own network).

Looking at iptables, I see that this is due to incomplete setup of DOCKER-ISOLATION-STAGE-1 and DOCKER-ISOLATION-STAGE-2 chains, for example:

➜  nixos git:(main) ✗ sudo iptables -L DOCKER-ISOLATION-STAGE-1 -v
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER-ISOLATION-STAGE-2  all  --  br-df34f54325ac !br-df34f54325ac  anywhere             anywhere            
   60  7487 RETURN     all  --  any    any     anywhere             anywhere            
➜  nixos git:(main) ✗ sudo iptables -L DOCKER-ISOLATION-STAGE-2 -v
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  any    br-df34f54325ac  anywhere             anywhere            
   15  1688 RETURN     all  --  any    any     anywhere             anywhere 

br-df34f54325ac here corresponds to only one of the networks, but there’s no mention of the other networks…

Does anyone know what might be going on? The only thing I found is this issue about Docker possibly interfering with iptables (not sure how relevant it is): NixOS/nixpkgs issue #111852, “Docker bypasses NixOS firewall exposing ports on the external interface”.

EDIT: Here’s a minimal repro of the issue, using just the bare-bones Docker CLI:

➜ sudo docker network create net1
68c8a95fb351f1313be310a252fc6dfc767b98d4af78aed25b492df296e59074

➜  sudo docker network create net2
f0ca7813ee49cbea5747ee2e37154ff51e96b40a9fe7f7fecd212fb835759cf2

➜  sudo iptables -L DOCKER-ISOLATION-STAGE-1 -v
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER-ISOLATION-STAGE-2  all  --  br-68c8a95fb351 !br-68c8a95fb351  anywhere             anywhere            
 2519  351K RETURN     all  --  any    any     anywhere             anywhere   
         
➜  sudo iptables -L DOCKER-ISOLATION-STAGE-2 -v
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  any    br-68c8a95fb351  anywhere             anywhere            
    5   275 RETURN     all  --  any    any     anywhere             anywhere     
       
➜  sudo docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
32022c4c2f8e   bridge    bridge    local
2e980eafaff8   host      host      local
68c8a95fb351   net1      bridge    local
f0ca7813ee49   net2      bridge    local
b8e8d85b187f   none      null      local

➜ sudo docker run --rm -it --net net1 --name cont1 alpine
/ # ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:AC:12:00:02  
          inet addr:172.18.0.2  Bcast:172.18.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:36 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:4413 (4.3 KiB)  TX bytes:290 (290.0 B)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

➜ sudo docker run --rm -it --net net2 --name cont2 alpine
/ # ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:AC:13:00:02  
          inet addr:172.19.0.2  Bcast:172.19.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:33 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:4214 (4.1 KiB)  TX bytes:290 (290.0 B)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
/ # ping 172.18.0.2 # cont1
PING 172.18.0.2 (172.18.0.2): 56 data bytes
64 bytes from 172.18.0.2: seq=0 ttl=63 time=0.237 ms
64 bytes from 172.18.0.2: seq=1 ttl=63 time=0.155 ms
/ # traceroute 172.18.0.2
traceroute to 172.18.0.2 (172.18.0.2), 30 hops max, 46 byte packets
 1  172.19.0.1 (172.19.0.1)  0.014 ms  0.010 ms  0.003 ms
 2  172.18.0.2 (172.18.0.2)  0.004 ms  0.010 ms  0.004 ms

Notice that br-68c8a95fb351 in the iptables rules corresponds to net1 (network ID 68c8a95fb351), and there’s no rule at all for net2.
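For anyone who wants to check their own setup, here is a rough sketch of how the comparison could be scripted. It assumes bridge interfaces are named br- plus the first 12 characters of the network ID (which matches the output above, but a network created with a custom com.docker.network.bridge.name option would differ). The function names bridge_name and missing_bridges are made up for this example.

```shell
#!/bin/sh
# Derive the bridge interface name from a Docker network ID.
# Assumption: "br-" + first 12 characters of the ID, as seen in the repro.
bridge_name() {
    printf 'br-%.12s\n' "$1"
}

# Read rules in `iptables -S <chain>` format on stdin and print every
# bridge passed as an argument that has no "-i <bridge>" rule.
missing_bridges() {
    rules=$(cat)
    for br in "$@"; do
        printf '%s\n' "$rules" | grep -qF -- "-i $br " || printf '%s\n' "$br"
    done
}

# Possible usage (not run here; the default "bridge" network is skipped
# because it uses docker0 rather than a br-<id> interface):
#
#   bridges=$(sudo docker network ls --filter driver=bridge \
#                 --format '{{.ID}} {{.Name}}' |
#             while read -r id name; do
#                 [ "$name" = bridge ] || bridge_name "$id"
#             done)
#   sudo iptables -S DOCKER-ISOLATION-STAGE-1 | missing_bridges $bridges
```

With the networks from the repro, this should report br-f0ca7813ee49 (net2) as missing from DOCKER-ISOLATION-STAGE-1.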

Thanks!

As a workaround, manually adding these two rules (via networking.firewall.extraCommands) seems to achieve the necessary isolation:

# allow intra-network traffic (bridged, i.e. between containers on the same Docker network)
iptables -A DOCKER-USER -m physdev --physdev-is-bridged -j ACCEPT

# drop inter-network traffic
iptables -A DOCKER-USER -i br-+ -o br-+ -j DROP
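One caveat I’m not fully sure about: if the extraCommands script ends up running more than once (for example on a firewall reload), plain -A would append duplicate rules to DOCKER-USER, since that chain is managed by Docker and not by the NixOS firewall. A minimal sketch of an idempotent variant, using iptables -C to check for an existing rule first, could look like this. The helper names ensure_rule and apply_isolation_rules and the IPTABLES variable are made up for this example.

```shell
#!/bin/sh
# Command used to talk to iptables; overridable for dry runs (hypothetical knob).
IPTABLES=${IPTABLES:-iptables}

# Append a rule to a chain only if an identical rule is not already there.
# `iptables -C` exits 0 when the rule exists and non-zero otherwise.
ensure_rule() {
    chain=$1
    shift
    "$IPTABLES" -C "$chain" "$@" 2>/dev/null || "$IPTABLES" -A "$chain" "$@"
}

# The same two rules as in the workaround, applied at most once each.
apply_isolation_rules() {
    # allow bridged traffic (containers on the same Docker network)
    ensure_rule DOCKER-USER -m physdev --physdev-is-bridged -j ACCEPT
    # drop traffic routed from one Docker bridge to another
    ensure_rule DOCKER-USER -i br-+ -o br-+ -j DROP
}
```

Calling apply_isolation_rules from extraCommands would then replace the two iptables -A lines. This assumes the DOCKER-USER chain already exists when the script runs.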