Simple NixOS Mailserver and outgoing SMTP over VPN

Hello,

I’m running on a VPS that blocks outgoing email, i.e. outgoing TCP connections to port 25. The one exception is when the receiving side uses IPv6, which my VPS provider neglected to block.

What are my options to bypass the restriction?

  1. Set up WireGuard VPN to my home router, and teach the Postfix in Simple NixOS Mailserver to route traffic via it
  2. Set up Postfix on home router - unfortunately, there’s not enough storage space, and if I extend my OpenWRT onto a USB stick, it’ll start overheating and losing WiFi
  3. Set up a Raspberry Pi or an old Android phone or something to run Postfix. That’s an option, but it would mean yet another wall wart, and yet another potential point of failure
  4. … are there any other options I’m not considering?

I noticed that Postfix has an option called smtp_bind_address that lets one choose the source address of outgoing SMTP connections. I suspect that if I set WireGuard’s address (the one attached to wg0 interface) there, Linux will route it over WireGuard’s interface[1], or there could be some non-default way to ask it to route it like this.
Could I get some help with setting up option 1? For that I need to add a new option to SNM. I forked Simple NixOS Mailserver, added this commit to set the option, and then switched my NixOS conf to this fork:

$ git show -U0
...
@@ -11,4 +11,3 @@ in {
-               (builtins.fetchTarball {
-                       url = "https://gitlab.com/simple-nixos-mailserver/nixos-mailserver/-/archive/${release}/nixos-mailserver-${release}.tar.gz";
-                       # This hash needs to be updated
-                       sha256 = "1ngil2shzkf61qxiqw11awyl81cr7ks2kv3r3k243zz7v2xakm5c";
+               (builtins.fetchGit {
+                       url = "https://gitlab.com/cizra/nixos-mailserver.git";
+                       ref = "master";
@@ -226,0 +226 @@ in {
+               smtpBindAddress = "10.100.0.2";  # route outgoing mail through VPN, to bypass Azure's block on outgoing SMTP port 25

This gave me some error about lastModified. I grepped the SNM repo for this word, and deleted the flake configuration that mentioned it. That didn’t help. I verified, the flake bits are now gone from /nix/store/clone-of-my-repo, yet I still get this error:

# nixos-rebuild switch
error: The option `lastModified' does not exist. Definition values:
       - In `/home/elmo/.config/nixos/server.nix': 1688422006
(use '--show-trace' to show detailed location information)
# nixos-rebuild switch --show-trace
error:
       … while evaluating the attribute 'config'

       at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:326:9:

          325|         options = checked options;
          326|         config = checked (removeAttrs config [ "_module" ]);
             |         ^
          327|         _module = checked (config._module);

       error: The option `lastModified' does not exist. Definition values:
       - In `/home/elmo/.config/nixos/server.nix': 1688422006
building Nix...
error:
       … while evaluating the attribute 'config'

       at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:326:9:

          325|         options = checked options;
          326|         config = checked (removeAttrs config [ "_module" ]);
             |         ^
          327|         _module = checked (config._module);

       error: The option `lastModified' does not exist. Definition values:
       - In `/home/elmo/.config/nixos/server.nix': 1688422006
building the system configuration...
error:
       … while evaluating the attribute 'config.system.build.toplevel'

       at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:326:9:

          325|         options = checked options;
          326|         config = checked (removeAttrs config [ "_module" ]);
             |         ^
          327|         _module = checked (config._module);

       error: The option `lastModified' does not exist. Definition values:
       - In `/home/elmo/.config/nixos/server.nix': 1688422006

No files in my NixOS conf mention the word lastModified, and the only change is the switch to my own fork of SNM repo. What am I doing wrong that I’m getting this error?

Also, would this approach of setting the bind address help my case at all?

I hear iptables (and these days nft) can be used to set different firewall rules per user, and Postfix runs as its own user. Should I instead investigate how to accomplish this routing using the firewall?

[1] - though I’ll have to figure out what to set the allowedIps value to. If I set it to an empty list, WireGuard will refuse to route anything. If I set it to 0.0.0.0/0, it’ll set wg as a default route for everything, which I don’t want. As a horrible hack, I might create a systemd service that deletes the new default route, and set it to run after WireGuard comes up. Or something.

No, it would not. The interface/address to which a service is bound does not influence routing.
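
To make that concrete: by default the kernel picks a route by destination only, so binding to the wg0 address changes the source address of the packets but not the interface they leave through. The source address only starts to matter once a policy rule matches on it. A minimal sketch of such a rule, assuming the addresses from the question (10.100.0.2 local, 10.100.0.1 peer) and an arbitrary table number 201; the `plan_source_routing` helper is hypothetical and only prints the commands so they can be reviewed before being run as root:

```shell
# Hypothetical helper: prints (does not run) the commands for source-based routing.
# Arguments: bound source address, gateway inside the tunnel, interface, table number.
plan_source_routing() {
    src="$1"; gw="$2"; dev="$3"; table="$4"
    # Packets whose source address is $src consult table $table before "main".
    echo "ip rule add from $src lookup $table"
    # That table holds a single default route through the tunnel.
    echo "ip route add default via $gw dev $dev table $table"
}

plan_source_routing 10.100.0.2 10.100.0.1 wg0 201
```

Without the rule, a socket bound to 10.100.0.2 would still be routed by the main table.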

You’ll want to create a separate routing table for your VPN. Then the only remaining exercise is ensuring said table is used for traffic originating from Postfix. A firewall rule (the MARK target) based on uid would indeed work, paired with a routing rule to use the VPN table for routing lookups.
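
A rough sketch of those pieces, not a tested configuration. The account name `postfix`, the mark value 0x19, the table number 201 and the `plan_mark_routing` helper are all assumptions for illustration; the helper only prints the commands. The mark is set in the mangle table because the kernel repeats the routing decision when mangle/OUTPUT changes a packet's mark.

```shell
# Hypothetical helper: prints the commands for uid-based policy routing via fwmark.
# Arguments: account name, mark, gateway inside the tunnel, interface, table number.
plan_mark_routing() {
    user="$1"; mark="$2"; gw="$3"; dev="$4"; table="$5"
    # Mark locally generated packets owned by $user (mangle table, OUTPUT hook).
    echo "iptables -t mangle -A OUTPUT -m owner --uid-owner $user -j MARK --set-mark $mark"
    # Marked packets look up the VPN table; a bare number needs no rt_tables entry.
    echo "ip rule add fwmark $mark lookup $table"
    echo "ip route add default via $gw dev $dev table $table"
    # The socket chose its source address before the mark was applied, so the
    # packet typically still carries the eth0 address; rewrite it on the way out.
    echo "iptables -t nat -A POSTROUTING -o $dev -j MASQUERADE"
}

plan_mark_routing postfix 0x19 10.100.0.1 wg0 201
```

The WireGuard peer also has to accept the destinations involved, which is where the allowedIPs question from the footnote comes in.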

Not sure how you’re running WireGuard, but there should be a way around that.

For example, there’s networking.wireguard.interfaces.<name>.allowedIPsAsRoutes in NixOS; for wg-quick, there’s the Table option to tell it to add its routes to a different table.

Consider configuring all of this using networkd instead - apart from supporting everything you need (except iptables) it doesn’t create routes unless you tell it to…

Thank you for your tips, @ius - I abandoned my faulty approach, and tried to set it up as you suggested.

I’m following Netfilter & iproute - marking packets, which has a recipe similar to what I need.
For testing, I’m filtering on the address of some random website, over plain HTTP, so that it generates a redirect to HTTPS instead of dumping a megabyte of HTML on me.

Set up logging:

# iptables -A OUTPUT -p tcp -d 185.154.221.183/30 --dport 80 -m mark ! --mark 0xdeadbeef -j LOG --log-prefix "No mark: "
# iptables -A OUTPUT -p tcp -d 185.154.221.183/30 --dport 80 -m mark --mark 0xdeadbeef -j LOG --log-prefix "With mark: "

Add marking by user:

# iptables -I OUTPUT -m owner --uid-owner postgres -j MARK --set-mark 0xdeadbeef

Make a request, and watch the logs in another tmux pane:

# sudo -u postgres curl -i http://185.154.221.183
HTTP/1.1 302 Found
content-length: 0
location: https://185.154.221.183/
cache-control: no-cache

$ dmesg -w
[307937.069525] With mark: IN= OUT=eth0 SRC=10.1.0.4 DST=185.154.221.183 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=53104 DF PROTO=TCP SPT=54202 DPT=80 WINDOW=64240 RES=0x00 SYN URGP=0 MARK=0xdeadbeef
[307937.080823] With mark: IN= OUT=eth0 SRC=10.1.0.4 DST=185.154.221.183 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=53105 DF PROTO=TCP SPT=54202 DPT=80 WINDOW=502 RES=0x00 ACK URGP=0 MARK=0xdeadbeef
[307937.080862] With mark: IN= OUT=eth0 SRC=10.1.0.4 DST=185.154.221.183 LEN=122 TOS=0x00 PREC=0x00 TTL=64 ID=53106 DF PROTO=TCP SPT=54202 DPT=80 WINDOW=502 RES=0x00 ACK PSH URGP=0 MARK=0xdeadbeef
[307937.091962] With mark: IN= OUT=eth0 SRC=10.1.0.4 DST=185.154.221.183 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=53107 DF PROTO=TCP SPT=54202 DPT=80 WINDOW=502 RES=0x00 ACK URGP=0 MARK=0xdeadbeef
[307937.092153] With mark: IN= OUT=eth0 SRC=10.1.0.4 DST=185.154.221.183 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=53108 DF PROTO=TCP SPT=54202 DPT=80 WINDOW=502 RES=0x00 ACK FIN URGP=0 MARK=0xdeadbeef
[307937.103191] No mark: IN= OUT=eth0 SRC=10.1.0.4 DST=185.154.221.183 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=53109 DF PROTO=TCP SPT=54202 DPT=80 WINDOW=502 RES=0x00 ACK URGP=0

So far, so good - the mark is set appropriately, according to the user.
(Makes me wonder - if the final ACK doesn’t have the mark, thus gets dropped, will it cause connections to linger on the receiving end?)

However, I can’t get the ip route to route them correctly. Here’s what I did:

Create a new routing table:

# echo 201 wg >> /etc/iproute2/rt_tables  # What's the NixOSy way of doing this?

Start using the new table for MARKed traffic (10.100.0.1 is WireGuard’s remote peer; it’s reachable, as evidenced by ping + tcpdump on the remote end):

# ip rule add fwmark 0xdeadbeef table wg
# ip rule ls
0:      from all lookup local
32765:  from all fwmark 0xdeadbeef lookup wg  # lowest number wins, so it should be active
32766:  from all lookup main
32767:  from all lookup default
# ip route add default via 10.100.0.1 dev wg0 table wg
# ip route ls table wg
default via 10.100.0.1 dev wg0

Nothing happens. The traffic just isn’t routed through wg0. Is OUTPUT the right place to put my firewall marking rule? I tried to use PREROUTING as per the article, but PREROUTING doesn’t appear to support owner match:

# iptables -A PREROUTING -t mangle -m owner --uid-owner postgres -j MARK --set-mark 0xdeadbeef
(dmesg) [ 1191.941967] x_tables: ip_tables: owner match: used from hooks PREROUTING, but only valid from OUTPUT/POSTROUTING

I tried to add it to POSTROUTING - there the command succeeds, but (as routing is probably done by now), traffic still doesn’t end up in VPN. I still think filter/OUTPUT should be fine, as it’s before the “reroute check” on https://upload.wikimedia.org/wikipedia/commons/3/37/Netfilter-packet-flow.svg - not that I really understand that graph.
Nevertheless, I added all HTTP traffic (that’s more easily testable than SMTP) to PREROUTING, just to see if it’d work:

# iptables -I PREROUTING -t mangle -p tcp --dport 80 -j MARK --set-mark 0xdeadbeef

Nope, that didn’t affect routing either.

What else could I try?

I don’t see any obvious mistakes either…

You could try ip route get 1.1.1.1 mark 0xdeadbeef dport 25 to see if it returns the expected result.

Otherwise: I recently noticed ip rule also supports matching on uidrange - try that?
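
Something along these lines, as an untested sketch. The `plan_uid_rule` helper and table number 201 are made up for illustration, and `root` is used in the example call only because that account exists everywhere; you would pass the Postfix account instead. It prints the rule rather than applying it.

```shell
# Hypothetical helper: prints an ip rule that routes one user's traffic via a table.
# Arguments: account name, table number.
plan_uid_rule() {
    user="$1"; table="$2"
    uid=$(id -u "$user") || return 1   # resolve the account name to a numeric uid
    # uidrange takes an inclusive START-END range; use the same uid for both ends.
    echo "ip rule add uidrange $uid-$uid lookup $table"
}

plan_uid_rule root 201
```

Because the uid is known when the socket connects, the routing lookup happens with the rule already in effect and no packet mark is involved.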

This works:

# ip route get mark 0x123 1.1.1.1 dport 80
1.1.1.1 via 10.1.0.1 dev eth0 src 10.1.0.4 mark 0x123 uid 0
    cache

# ip route get mark 0xdeadbeef 1.1.1.1 dport 80
1.1.1.1 via 10.100.0.1 dev wg0 table wg src 10.100.0.2 mark 0xdeadbeef uid 0
    cache

yet packets still travel through eth0.

Yes, I noticed that option. It makes traffic go to wg0, but:

  1. this would cause asymmetric routing for responses of incoming connections. I’m afraid it wouldn’t play nice with NATs along the way, and the remote endpoint might also dislike it.
  2. I tried it for an outgoing connection, and though I can see the response packets arriving back through the VPN, somehow they’re not picked up by the program - curl just hangs. I don’t know why.
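
One possible explanation for the hang, not verified here: the replies arrive on wg0, while a route lookup for their source address in the main table points at eth0, so strict reverse-path filtering would drop them after tcpdump has already seen them. This could come from the rp_filter sysctl or from a firewall-level reverse-path check, if one is enabled. A small sketch for reading the sysctl side; the `effective_rp_filter` helper is hypothetical, and it relies on the kernel using the larger of the "all" and per-interface values:

```shell
# Hypothetical helper: maps the two rp_filter sysctl values to the effective mode.
# The kernel applies the larger of conf/all and conf/<interface>:
# 0 = off, 1 = strict, 2 = loose.
effective_rp_filter() {
    all="$1"; iface="$2"
    if [ "$all" -gt "$iface" ]; then max="$all"; else max="$iface"; fi
    case "$max" in
        0) echo off ;;
        1) echo strict ;;
        2) echo loose ;;
        *) echo unknown ;;
    esac
}

# Typical read-only use on the VPS:
#   effective_rp_filter "$(sysctl -n net.ipv4.conf.all.rp_filter)" \
#                       "$(sysctl -n net.ipv4.conf.wg0.rp_filter)"
effective_rp_filter 1 0   # prints "strict"
```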

Let’s abandon this. It’s not important enough to warrant so much effort. I’ll try to see if I can install Postfix on my NAS instead.