sni-router: host-net HAProxy to preserve real client IPs by dolonet · Pull Request #522 · 9seconds/mtg

dolonet · 2026-05-18T11:54:27Z

Follow-up to discussion in #498 — bam80 reported that real client IPs never made it through contrib/sni-router despite all four PROXY-protocol pieces being wired up correctly.

Root cause

When the HAProxy container is on a bridge network and :443 / :80 are published via ports:, the source IP of every inbound connection is rewritten to the bridge gateway before HAProxy sees it:

Docker (default): docker-proxy accepts on the host and re-opens the connection from the bridge gateway.
Docker with userland-proxy: false: kernel DNAT should preserve the source, but on Docker 29 / Fedora the MASQUERADE rewrite (moby/moby#48854) intermittently drops or rewrites traffic.
Podman rootless: slirp4netns / pasta userspace forwarder, no equivalent flag.

In every case HAProxy stamps the gateway address (e.g. 172.x.x.1) into the PROXY v2 header, so mtg and Caddy faithfully log the wrong IP. The fix has to lift HAProxy out of the rewrite path — no amount of backend-side configuration can recover what HAProxy never received.
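To make the failure mode concrete, here is a minimal Python sketch of the PROXY v2 binary header layout (12-byte signature, version/command byte, family/protocol byte, length, then addresses and ports). `build_proxy_v2` and `parse_proxy_v2_source` are hypothetical helpers written for illustration, not code from HAProxy, mtg, or Caddy, and the addresses are made up. The point is only that a backend reads back whatever source address the proxy wrote into the header.

```python
import ipaddress
import struct

# Fixed 12-byte signature that opens every PROXY v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def build_proxy_v2(src: str, src_port: int, dst: str, dst_port: int) -> bytes:
    """Hypothetical helper: encode a PROXY v2 header for a TCP connection."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if src_ip.version != dst_ip.version:
        raise ValueError("source and destination must share an address family")
    # 0x11 = TCP over IPv4, 0x21 = TCP over IPv6.
    fam_proto = 0x11 if src_ip.version == 4 else 0x21
    body = src_ip.packed + dst_ip.packed + struct.pack("!HH", src_port, dst_port)
    # 0x21 = protocol version 2, command PROXY.
    return PP2_SIGNATURE + struct.pack("!BBH", 0x21, fam_proto, len(body)) + body


def parse_proxy_v2_source(header: bytes) -> tuple[str, int]:
    """Hypothetical helper: return the (source address, source port) a backend would log."""
    if header[:12] != PP2_SIGNATURE:
        raise ValueError("not a PROXY v2 header")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", header[12:16])
    if ver_cmd != 0x21:
        raise ValueError("only the v2 PROXY command is handled in this sketch")
    addr_len = {0x11: 4, 0x21: 16}.get(fam_proto)
    if addr_len is None:
        raise ValueError("only TCP over IPv4/IPv6 is handled in this sketch")
    body = header[16:16 + length]
    src = ipaddress.ip_address(body[:addr_len])
    (src_port,) = struct.unpack("!H", body[2 * addr_len:2 * addr_len + 2])
    return str(src), src_port


# The proxy can only encode the peer address of the socket it accepted.
# Here that peer is an (assumed) bridge gateway, not the real client.
header = build_proxy_v2("172.18.0.1", 40000, "172.18.0.2", 443)
print(parse_proxy_v2_source(header))  # ('172.18.0.1', 40000)
```

If the accepted socket already reports the bridge gateway as its peer, that address is what lands in the header, and the backend has nothing else to go on.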

Change

HAProxy → network_mode: host. Binds :443/:80 in the host netns directly. No NAT, no userspace forwarder, no source rewrite. Real client IPs (v4 and v6) propagate end-to-end via PROXY v2.
mtg and Caddy stay on the compose bridge, published on 127.0.0.1 only (127.0.0.1:3128:3128, 127.0.0.1:8080:80, 127.0.0.1:8443:8443). HAProxy reaches them via host loopback.
HAProxy frontends gain explicit IPv6 binds (bind :443,[::]:443 / bind :80,[::]:80). bind *:443 is IPv4-only; the old example accepted IPv6 only on hosts where dual-stack quirks happened to cover for it.
Caddy allow-list gains 127.0.0.1/32 to cover the new loopback hop from HAProxy. The RFC1918 ranges stay for the fronting path (mtg → Caddy on the compose bridge).
README gains a short subsection explaining the host-mode choice and its trade-offs.
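The allow-list item above can be sketched in Python with the standard `ipaddress` module. This only illustrates the matching logic; it is not Caddy's implementation or its configuration syntax. `ALLOWED_PEERS` and `peer_allowed` are hypothetical names, the RFC1918 ranges are written out from the RFC rather than copied from the Caddyfile, and the sample addresses are made up.

```python
import ipaddress

# Assumed allow-list: the new loopback hop from HAProxy plus the RFC1918
# ranges that cover the mtg -> Caddy fronting path on the compose bridge.
ALLOWED_PEERS = [
    ipaddress.ip_network(net)
    for net in ("127.0.0.1/32", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def peer_allowed(peer: str) -> bool:
    """Return True if a PROXY header from this peer address would be trusted."""
    addr = ipaddress.ip_address(peer)
    return any(addr in net for net in ALLOWED_PEERS)


for peer in ("127.0.0.1", "127.0.0.2", "172.18.0.3", "203.0.113.7"):
    print(peer, peer_allowed(peer))
```

With a /32 entry only 127.0.0.1 itself matches on loopback, which is the narrowing from 127.0.0.0/8 that the review fixups describe.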

mtg-config.toml is intentionally unchanged — mtg and Caddy are still on the compose bridge, so fronting can keep host = "web" and resolve over compose-network DNS.

Alternatives considered

{"userland-proxy": false} in /etc/docker/daemon.json. Smaller change (host config, no compose edits) but: (a) requires host-level daemon config that the contrib example can't ship, (b) flaky on Docker 29 (MASQUERADE rewrite), (c) doesn't help Podman rootless at all.
All three services in network_mode: host. Cleaner network model but requires changing mtg-config.toml (fronting host and listen address) and loses compose-network isolation. Bigger change for no functional gain over the current layout.

Trade-offs / platform notes

HAProxy owns the host's :443 and :80. Don't run anything else on those ports on the same host.
Linux host only. On Docker Desktop (macOS/Windows), network_mode: host binds inside the Linux VM, so external clients can't reach the proxy. Out of scope for this contrib example, which is server-deployment-oriented anyway.
With Docker userns-remap, the in-container "root" loses the privilege to bind ports below 1024. README documents the workaround.

Test status

End-to-end validated by @bam80 on Fedora + Docker 29 — both the original revision (real client IPv4/IPv6 visible in mtg and Caddy logs) and the current revision (DOMAIN=localhost run after the review-fixups, see thread).

Branch rebased on master after the 2026-05-20 batch — picks up #514 (image bump to :master, hard dep so the stack actually exposes proxy-protocol-listener end-to-end) and #525 (mtg-config.toml rendered from tracked .example). A fresh checkout now brings up a working stack without manual patching.

Closes #498.

dolonet · 2026-05-18T12:14:26Z

@bam80 friendly ping — could you re-run this on your Fedora + Docker 29 setup before it leaves draft? The shape changed slightly from the version you tested:

backends use 127.0.0.1:... instead of [::1]:... (PROXY v2 still carries the real v6 client IP regardless of the loopback transport, so v6 publishing isn't needed)
HAProxy frontends explicitly bind 0.0.0.0:443 + bind [::]:443 v6only (and same for :80)
Caddy allow includes 127.0.0.1/32 for the new loopback hop

Concretely, what I'd like to confirm:

docker compose up -d from a fresh checkout brings everything up cleanly.
An IPv4 client hitting :443 → its real address in mtg log + Caddy access log when probing the domain.
Same for an IPv6 client (this is the bit I most want validated since I changed the publishing layout vs. your tested version).

If anything breaks I'll iterate. Thanks for the patience on the round-trip.

bam80 · 2026-05-18T22:26:52Z

+    # host's net.ipv6.bindv6only sysctl.  `v6only` on the v6 bind prevents it
+    # from also accepting v4-mapped connections, which would otherwise
+    # conflict with the explicit v4 bind on the same port.
+    bind 0.0.0.0:80
+    bind [::]:80 v6only


Suggested change

-    # host's net.ipv6.bindv6only sysctl.  `v6only` on the v6 bind prevents it
-    # from also accepting v4-mapped connections, which would otherwise
-    # conflict with the explicit v4 bind on the same port.
-    bind 0.0.0.0:80
-    bind [::]:80 v6only
+    # host's net.ipv6.bindv6only sysctl.
+    bind :80,[::]:80

*:, 0.0.0.0: and : are equivalent per the doc.
I don't have v6only in my patch variant (which is pretty much the same) and still haven't noticed any conflicts (with net.ipv6.bindv6only = 0). I'm not sure whether it's allowed in the one-line notation.

You're right that *:, 0.0.0.0: and : are equivalent, and re: v6only — I checked the actual behavior: with SO_REUSEADDR (HAProxy's default) and bindv6only=0, the v6 bind succeeds alongside the v4 bind, and the kernel routes v4 packets to the more-specific AF_INET socket. So both forms produce identical runtime behavior on Linux. My earlier comment overstated the v6only/sysctl interaction — it's not load-bearing, it's self-documentation.

That makes the choice purely stylistic:

Two binds + v6only: spells out why two binds coexist for someone reading the cfg without having to reason about SO_REUSEADDR semantics.

One-liner: shorter; the comment doesn't have to explain v6only because it's not there.

I have a mild preference for the explicit form for a contrib/ example, but you're the one actually running sni-router and closer to the audience copying this config — if you'd rather have the one-liner, I'll switch. Either way I'm fine.

On v6only in comma syntax: HAProxy docs say bind options apply to all sockets on the line, so bind :80,[::]:80 v6only would set IPV6_V6ONLY on the v4 socket too — no-op there, but cosmetically odd. If we go one-liner, I'd drop v6only entirely, as your suggestion does.

Either way, the gate I'd still like to clear before un-drafting is an actual compose up -d with this layout — v4 and v6 client landing in mtg + Caddy logs with real addresses. The bind nit is a quick swap after; that e2e run is the bit I can't reproduce from my side.

Why I would prefer the one-liner: it makes adding new ports easier and looks better, e.g.:

bind :80,[::]:80
bind :8080,[::]:8080

I'm personally using the multi-port configuration. I keep all the ports on one line, but someone else might prefer to just add a new line per port. I don't have a hard preference, though.

I'll test it tomorrow, thanks.

Done in 2a63578 — switched both :80 and :443 blocks to bind :PORT,[::]:PORT, dropped v6only, trimmed the comment to one sentence (nothing about v6only to explain anymore). Multi-port-scaling point taken; future ports can just add another comma-separated line.

I'll test it tomorrow, thanks.

This test was a real pain for me (#525 (comment)), but it seems to work, thanks.

By the way, I still don't understand what the problem was with testing it yourself.
I couldn't test in the normal mode anyway (port 80 is not reachable from outside), so I had to test with DOMAIN=localhost, but that should be enough: both IP versions show up correctly.

The absence of #514 added fuel to the fire too; I banged my head against the wall over that as well.

I reread my excuse about "can't reproduce it on my side", and you're right, it doesn't hold up. The real reason: you were the original tester with an already verified environment, and in my head that turned into "cheaper to ask once more than to spin up a clean machine". But this is exactly the case where "cheaper" means "dumping it on someone else". Running DOMAIN=localhost on any dev VPS is what I should have done myself before asking for a third pass. Noted.

And yes, that was not an "adjacent" problem but a hard dependency: nineseconds/mtg:2 ships without proxy-protocol-listener, so without #514/#480 the stack objectively does not work end-to-end, hence your manual patch during the test. I should have either explicitly chained #514 in the description or included the image bump here. #514 is now in master, so after the rebase the next tester will get a working stack without manual fiddling.

bam80 · 2026-05-18T22:27:13Z

-    bind *:443
+    bind 0.0.0.0:443
+    bind [::]:443 v6only


Switched here too in 2a63578.

bam80 · 2026-05-19T01:14:48Z

+    # Explicit v4 + v6 binds so IPv6 clients are accepted regardless of the
+    # host's net.ipv6.bindv6only sysctl.  `v6only` on the v6 bind prevents it


Note: we could also just use bind [::]:80 v4v6 without separate v4 and v6 binds, but then IPv4 addresses would show up in the logs as ::ffff:1.2.3.4.

Right — that ::ffff:1.2.3.4 noise is exactly why I went with explicit dual binds rather than v4v6. Sticking with bind :PORT,[::]:PORT so v4 stays v4 in PROXY-v2 and downstream logs.
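For readers who do end up with v4-mapped entries such as ::ffff:1.2.3.4 in their logs (for example from a v4v6 listener), here is a small Python sketch of how such addresses relate to plain IPv4 ones. `normalize_client_ip` is a hypothetical helper for illustration and is not part of HAProxy, mtg, or Caddy; it relies only on the standard `ipaddress` module.

```python
import ipaddress


def normalize_client_ip(text: str) -> str:
    """Collapse an IPv4-mapped IPv6 address to its IPv4 form; pass others through."""
    addr = ipaddress.ip_address(text)
    if addr.version == 6 and addr.ipv4_mapped is not None:
        return str(addr.ipv4_mapped)
    return str(addr)


for sample in ("::ffff:1.2.3.4", "1.2.3.4", "2001:db8::1"):
    print(sample, "->", normalize_client_ip(sample))
```

With the explicit dual binds chosen in this PR no such post-processing should be needed, because IPv4 clients are accepted on an IPv4 socket in the first place.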

Switch to one-line `bind :80,[::]:80` and `bind :443,[::]:443` per review feedback in #522. The v6only flag was self-documentation, not load-bearing: with SO_REUSEADDR (HAProxy's default) and bindv6only=0 the kernel routes v4 packets to the more-specific AF_INET socket regardless. Comment trimmed to match — the v6only paragraph is gone because v6only itself is gone. The shorter form also scales more cleanly when adding ports later, e.g. `bind :8080,[::]:8080` on a new line.


Bridge ingress (Docker's docker-proxy userland forwarder, Podman's slirp4netns/pasta) rewrites the source IP of inbound connections on a published port to the bridge gateway address. HAProxy then stamps that gateway address into the PROXY v2 header it forwards to mtg and Caddy, so neither backend ever sees a real client IP.

Move HAProxy into the host netns (network_mode: host) so it binds :443/:80 directly with no NAT in the path. mtg and Caddy stay on the compose bridge and are published on 127.0.0.1 only; HAProxy reaches them via host loopback and PROXY v2 carries the real client IP (v4 or v6) end-to-end.

Also accept IPv6 clients explicitly on the HAProxy frontends: `bind *:443` is IPv4-only and missed v6 clients on hosts where the previous example happened to "work" only because of dual-stack quirks.

Add 127.0.0.0/8 to Caddy's PROXY allow-list to cover the new loopback hop from HAProxy.

README gains a short subsection explaining the host-mode choice and its trade-off (HAProxy occupies host :443/:80).

Diagnosed and tested by @bam80 on Fedora + Docker 29.

Fixes #498.

…rrow Caddy allow)

- Caddy allow: 127.0.0.0/8 → 127.0.0.1/32 (only loopback peer is HAProxy).
- haproxy.cfg: rewrite v6only comment to describe what it actually does (suppresses v4-mapped accept, preventing conflict with the v4 bind), not the symptom.
- docker-compose.yml: trim the 8-line haproxy comment to 3 lines and defer the rationale to README. Add one-line note explaining why web uses host port 8080 (HAProxy owns :80).
- README: condense the "Why network_mode: host" subsection. Spell out trade-offs as a list: own-the-host-ports, Linux-only (Docker Desktop doesn't make this layout reachable), userns-remap incompatibility. Note that mtg-config.toml stays as-is because mtg/web remain on the compose bridge.


dolonet · 2026-05-20T13:17:15Z

Rebased on master — picks up #514 (image bump) and #525 (config rendering), both of which were friction sources during @bam80's test. Body updated, draft lifted. Ready for review.

dolonet changed the title ~~sni-router: switch HAProxy to host networking for real client IPs~~ sni-router: host-net HAProxy to preserve real client IPs May 18, 2026

dolonet mentioned this pull request May 18, 2026

contrib/sni-router: use host networking for HAProxy to preserve client IPs #520

Closed

bam80 reviewed May 18, 2026


bam80 reviewed May 19, 2026


dolonet added 3 commits May 20, 2026 13:15

dolonet force-pushed the sni-router-host-mode-real-ips branch from 2a63578 to a7febc2 Compare May 20, 2026 13:15

dolonet marked this pull request as ready for review May 20, 2026 13:17
