Date: Wed, 4 Jun 2025 22:58:05 +0300
From: Christos Chatzaras <chris@cretaforce.gr>
To: Michael Tuexen <michael.tuexen@lurchi.franken.de>
Cc: questions@freebsd.org, freebsd-net <freebsd-net@freebsd.org>
Subject: Re: Problem with net.inet.tcp.path_mtu_discovery=1
Message-ID: <13C8668F-0594-4D6D-AFE3-C9DC676570B9@cretaforce.gr>
In-Reply-To: <C36F3F3E-F6B2-47B7-BED7-CEE4DAF11354@lurchi.franken.de>
References: <9728060D-2C02-426B-BACE-F2D2F651A62F@cretaforce.gr> <bf557c42-625f-4b8a-b5df-7a45c84e40ee@app.fastmail.com> <D9DE01A2-96DF-4804-875C-2424BEF733F3@cretaforce.gr> <C36F3F3E-F6B2-47B7-BED7-CEE4DAF11354@lurchi.franken.de>

> On 4 Jun 2025, at 22:06, Michael Tuexen <michael.tuexen@lurchi.franken.de> wrote:
>
>> On 4. Jun 2025, at 19:29, Christos Chatzaras <chris@cretaforce.gr> wrote:
>>
>>> On 4 Jun 2025, at 19:36, Dave Cottlehuber <dch@skunkwerks.at> wrote:
>>>
>>> On Wed, 4 Jun 2025, at 16:36, Christos Chatzaras wrote:
>>>> Hello,
>>>>
>>>> I manage some servers hosting websites.
>>>
>>> What does tcpdump/wireshark show for traffic, particularly icmp? Wireshark is very helpful in explaining some issues.
>>>
>>> What is the actual MTU on the working net vs the failing one?
>>>
>>> Is there a local MTU where the failing websites start working again?
>>>
>>> see ping(8) and use -v -D -s …. together to find a working MTU and cross check with tcpdump to find where things seem to break.
>>>
>>> On a recent cloud environment I needed to add ‘set reassemble yes no-df’ to my pf.conf to address MTU issues between VNET jails and the internet.
>>>
>>> Happy hunting
>>> Dave
>>>
>>
>> First, I reverted the server settings to their defaults:
>> sysctl net.inet.tcp.path_mtu_discovery=1
>> sysctl net.inet.tcp.pmtud_blackhole_detection=0
>>
>> Next, I set the MTU on my local computer to 1460 and everything worked as expected:
>> tcpdump: listening on en0, link-type EN10MB (Ethernet), snapshot length 524288 bytes
>> 20:15:05.651375 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64)
>>     192.168.2.18.65322 > 94.130.217.87.443: Flags [S], cksum 0x293e (correct), seq 3503095669, win 65535, options [mss 1420,nop,wscale 6,nop,nop,TS val 639376397 ecr 0,sackOK,eol], length 0
>> 20:15:05.705913 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>>     94.130.217.87.443 > 192.168.2.18.65322: Flags [S.], cksum 0x9c22 (correct), seq 3647364942, ack 3503095670, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 1782053626 ecr 639376397], length 0
>>
>> However, when I set my local computer’s MTU back to 1500 (the default), the issue reappeared:
>> tcpdump: listening on en0, link-type EN10MB (Ethernet), snapshot length 524288 bytes
>> 20:17:45.662993 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64)
>>     192.168.2.18.65333 > 94.130.217.87.443: Flags [S], cksum 0x4a07 (correct), seq 3674289142, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 681359835 ecr 0,sackOK,eol], length 0
>> 20:17:45.726988 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>>     94.130.217.87.443 > 192.168.2.18.65333: Flags [S.], cksum 0x9b1d (correct), seq 1443843488, ack 3674289143, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2890559459 ecr 681359835], length 0
>>
>> So, with local computer MTU 1460, everything works, but with MTU 1500, the problem persists.
> The difference is that you announce a smaller MSS in SYN segment you
> sent. This means that the peer can only send you smaller TCP segments.
>
> So there seems to be a problem if the peer sends too large TCP segments.
> That means that the peer must do PMTUD or TCP blackhole detection, not
> the local node.
>
> Best regards
> Michael
>

Hello Michael,

sysctl net.inet.tcp.path_mtu_discovery=1
sysctl net.inet.tcp.pmtud_blackhole_detection=1

With these settings, is the connection supposed to work even if an intermediate router is dropping the ICMP messages required for Path MTU Discovery? I tried this configuration, but it didn’t resolve the issue.
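
For reference, the MSS arithmetic behind the two traces above can be sketched as follows. This is only an illustration under stated assumptions: IPv4 with a 20-byte IP header and a 20-byte TCP header (no IP options), and a peer that limits its segments to the smaller of its own MSS and the MSS announced to it. The function names are hypothetical and not part of any FreeBSD tool.

```python
# Minimal sketch of the MSS arithmetic, assuming IPv4 and no IP options.
IPV4_HEADER = 20  # bytes
TCP_HEADER = 20   # bytes, fixed part of the TCP header

def announced_mss(mtu: int) -> int:
    """MSS a host would announce in its SYN for a given interface MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER

def largest_packet_from_peer(local_mtu: int, peer_mss: int) -> int:
    """Largest IP packet the peer may send, given both MSS values."""
    effective_mss = min(announced_mss(local_mtu), peer_mss)
    return effective_mss + IPV4_HEADER + TCP_HEADER

# The server announced mss 1460 in both SYN-ACKs shown in the traces.
for local_mtu in (1460, 1500):
    print(local_mtu,
          announced_mss(local_mtu),
          largest_packet_from_peer(local_mtu, peer_mss=1460))
```

With a local MTU of 1460 the server is limited to 1460-byte packets, and with 1500 it may send 1500-byte packets. If the observations above hold, that would be consistent with a link on the server-to-client path whose MTU is below 1500 but at least 1460, although the traces alone do not prove it.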
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?13C8668F-0594-4D6D-AFE3-C9DC676570B9>