Date:      Wed, 4 Jun 2025 22:58:05 +0300
From:      Christos Chatzaras <chris@cretaforce.gr>
To:        Michael Tuexen <michael.tuexen@lurchi.franken.de>
Cc:        questions@freebsd.org, freebsd-net <freebsd-net@freebsd.org>
Subject:   Re: Problem with net.inet.tcp.path_mtu_discovery=1
Message-ID:  <13C8668F-0594-4D6D-AFE3-C9DC676570B9@cretaforce.gr>
In-Reply-To: <C36F3F3E-F6B2-47B7-BED7-CEE4DAF11354@lurchi.franken.de>
References:  <9728060D-2C02-426B-BACE-F2D2F651A62F@cretaforce.gr> <bf557c42-625f-4b8a-b5df-7a45c84e40ee@app.fastmail.com> <D9DE01A2-96DF-4804-875C-2424BEF733F3@cretaforce.gr> <C36F3F3E-F6B2-47B7-BED7-CEE4DAF11354@lurchi.franken.de>


> On 4 Jun 2025, at 22:06, Michael Tuexen <michael.tuexen@lurchi.franken.de> wrote:
>
>> On 4. Jun 2025, at 19:29, Christos Chatzaras <chris@cretaforce.gr> wrote:
>>
>>> On 4 Jun 2025, at 19:36, Dave Cottlehuber <dch@skunkwerks.at> wrote:
>>>
>>> On Wed, 4 Jun 2025, at 16:36, Christos Chatzaras wrote:
>>>> Hello,
>>>>
>>>> I manage some servers hosting websites.
>>>
>>> What does tcpdump/wireshark show for traffic, particularly icmp? Wireshark is very helpful in explaining some issues.
>>>
>>> What is the actual MTU on the working net vs the failing one?
>>>
>>> Is there a local MTU where the failing websites start working again?
>>>
>>> see ping(8) and use -v -D -s …. together to find a working MTU and cross check with tcpdump to find where things seem to break.
>>>
>>> On a recent cloud environment I needed to add ‘set reassemble yes no-df’ to my pf.conf to address MTU issues between VNET jails and the internet.
>>>
>>> Happy hunting
>>> Dave
>>>
>>
>> First, I reverted the server settings to their defaults:
>> sysctl net.inet.tcp.path_mtu_discovery=1
>> sysctl net.inet.tcp.pmtud_blackhole_detection=0
>>
>> Next, I set the MTU on my local computer to 1460 and everything worked as expected:
>> tcpdump: listening on en0, link-type EN10MB (Ethernet), snapshot length 524288 bytes
>> 20:15:05.651375 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64)
>>    192.168.2.18.65322 > 94.130.217.87.443: Flags [S], cksum 0x293e (correct), seq 3503095669, win 65535, options [mss 1420,nop,wscale 6,nop,nop,TS val 639376397 ecr 0,sackOK,eol], length 0
>> 20:15:05.705913 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>>    94.130.217.87.443 > 192.168.2.18.65322: Flags [S.], cksum 0x9c22 (correct), seq 3647364942, ack 3503095670, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 1782053626 ecr 639376397], length 0
>>
>> However, when I set my local computer’s MTU back to 1500 (the default), the issue reappeared:
>> tcpdump: listening on en0, link-type EN10MB (Ethernet), snapshot length 524288 bytes
>> 20:17:45.662993 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64)
>>    192.168.2.18.65333 > 94.130.217.87.443: Flags [S], cksum 0x4a07 (correct), seq 3674289142, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 681359835 ecr 0,sackOK,eol], length 0
>> 20:17:45.726988 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>>    94.130.217.87.443 > 192.168.2.18.65333: Flags [S.], cksum 0x9b1d (correct), seq 1443843488, ack 3674289143, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2890559459 ecr 681359835], length 0
>>
>> So, with local computer MTU 1460, everything works, but with MTU 1500, the problem persists.
> The difference is that you announce a smaller MSS in the SYN segment you
> sent. This means that the peer can only send you smaller TCP segments.
>
> So there seems to be a problem if the peer sends TCP segments that are too large.
> That means that the peer must do PMTUD or TCP blackhole detection, not
> the local node.
>
> Best regards
> Michael
>
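
For reference, the MSS values in the two captures follow from the interface MTU. The sketch below is only an illustration, assuming plain IPv4 with a 20-byte IP header and a 20-byte TCP header; the function names are made up for this example and do not come from any tool.

```sh
#!/bin/sh
# Sketch: relate interface MTU, announced MSS, and the largest IP packet
# the peer may send. Assumes IPv4 (20-byte header) and a 20-byte base TCP
# header; TCP options such as timestamps come out of the MSS budget.

IP_HDR=20
TCP_HDR=20

# MSS a host announces for a given interface MTU.
mss_for_mtu() {
    echo $(( $1 - IP_HDR - TCP_HDR ))
}

# Largest IP packet the peer can send, given the MSS we announced ($1)
# and the MSS the peer's own interface allows ($2).
max_peer_packet() {
    announced=$1
    peer_limit=$2
    if [ "$announced" -lt "$peer_limit" ]; then
        seg=$announced
    else
        seg=$peer_limit
    fi
    echo $(( seg + IP_HDR + TCP_HDR ))
}

mss_for_mtu 1460             # prints 1420, as in the first SYN
mss_for_mtu 1500             # prints 1460, as in the second SYN
max_peer_packet 1420 1460    # prints 1460
max_peer_packet 1460 1460    # prints 1500
```

Under these assumptions, the server never sends packets larger than 1460 bytes when the client MTU is 1460, but may send 1500-byte packets when the client MTU is 1500. That would be consistent with something on the path toward the client not passing full 1500-byte packets.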

Hello Michael,

sysctl net.inet.tcp.path_mtu_discovery=1
sysctl net.inet.tcp.pmtud_blackhole_detection=1

With these settings, should the connection work even if an intermediate router drops the ICMP messages required for Path MTU Discovery? I tried this configuration, but it didn’t resolve the issue.
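
Dave's ping(8) suggestion could be scripted roughly as below to narrow down the largest packet size that still passes between 1460 and 1500. This is only a sketch under stated assumptions: the helper names are hypothetical, the 28-byte overhead assumes IPv4 plus an 8-byte ICMP echo header, and -D (set DF), -c, -t and -s are used as on FreeBSD and macOS ping (other systems spell these options differently). It mainly exercises the client-to-server direction with DF set, which need not behave the same as the reverse path.

```sh
#!/bin/sh
# Sketch: find the largest IPv4 packet that gets an echo reply with DF set.
# probe_payload and find_pmtu are hypothetical helper names.

ICMP_OVERHEAD=28   # 20-byte IPv4 header + 8-byte ICMP echo header

# ICMP payload size that yields an IP packet of the given total size.
probe_payload() {
    echo $(( $1 - ICMP_OVERHEAD ))
}

# Binary search between a packet size known to work (lo) and one known
# to fail (hi); prints the largest size that still got a reply.
find_pmtu() {
    host=$1
    lo=$2
    hi=$3
    while [ $(( hi - lo )) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        # -D sets DF, -c 1 sends one probe, -t 3 gives up after 3 seconds.
        if ping -D -c 1 -t 3 -s "$(probe_payload "$mid")" "$host" >/dev/null 2>&1; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    echo "$lo"
}

# Example invocation (not run automatically):
#   find_pmtu 94.130.217.87 1460 1500
```

Running tcpdump alongside, as Dave suggested, would show whether any ICMP "fragmentation needed" replies come back for the probes that fail.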



