From: Christos Chatzaras <chris@cretaforce.gr>
To: Michael Tuexen
Cc: questions@freebsd.org, freebsd-net
Subject: Re: Problem with net.inet.tcp.path_mtu_discovery=1
Date: Wed, 4 Jun 2025 22:58:05 +0300
Message-Id: <13C8668F-0594-4D6D-AFE3-C9DC676570B9@cretaforce.gr>
References: <9728060D-2C02-426B-BACE-F2D2F651A62F@cretaforce.gr>
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

> On 4 Jun 2025, at 22:06, Michael Tuexen <michael.tuexen@lurchi.franken.de> wrote:
>
>> On 4. Jun 2025, at 19:29, Christos Chatzaras <chris@cretaforce.gr> wrote:
>>
>>> On 4 Jun 2025, at 19:36, Dave Cottlehuber <dch@skunkwerks.at> wrote:
>>>
>>> On Wed, 4 Jun 2025, at 16:36, Christos Chatzaras wrote:
>>>> Hello,
>>>>
>>>> I manage some servers hosting websites.
>>>
>>> What does tcpdump/wireshark show for traffic, particularly icmp? Wireshark is very helpful in explaining some issues.
>>>
>>> What is the actual MTU on the working net vs the failing one?
>>>
>>> Is there a local MTU where the failing websites start working again?
>>>
>>> see ping(8) and use -v -D -s …. together to find a working MTU and cross check with tcpdump to find where things seem to break.
>>>
>>> On a recent cloud environment I needed to add ‘set reassemble yes no-df’ to my pf.conf to address MTU issues between VNET jails and the internet.
>>>
>>> Happy hunting
>>> Dave
>>>
>>
>> First, I reverted the server settings to their defaults:
>> sysctl net.inet.tcp.path_mtu_discovery=1
>> sysctl net.inet.tcp.pmtud_blackhole_detection=0
>>
>> Next, I set the MTU on my local computer to 1460 and everything worked as expected:
>> tcpdump: listening on en0, link-type EN10MB (Ethernet), snapshot length 524288 bytes
>> 20:15:05.651375 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64)
>>     192.168.2.18.65322 > 94.130.217.87.443: Flags [S], cksum 0x293e (correct), seq 3503095669, win 65535, options [mss 1420,nop,wscale 6,nop,nop,TS val 639376397 ecr 0,sackOK,eol], length 0
>> 20:15:05.705913 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>>     94.130.217.87.443 > 192.168.2.18.65322: Flags [S.], cksum 0x9c22 (correct), seq 3647364942, ack 3503095670, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 1782053626 ecr 639376397], length 0
>>
>> However, when I set my local computer’s MTU back to 1500 (the default), the issue reappeared:
>> tcpdump: listening on en0, link-type EN10MB (Ethernet), snapshot length 524288 bytes
>> 20:17:45.662993 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64)
>>     192.168.2.18.65333 > 94.130.217.87.443: Flags [S], cksum 0x4a07 (correct), seq 3674289142, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 681359835 ecr 0,sackOK,eol], length 0
>> 20:17:45.726988 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>>     94.130.217.87.443 > 192.168.2.18.65333: Flags [S.], cksum 0x9b1d (correct), seq 1443843488, ack 3674289143, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2890559459 ecr 681359835], length 0
>>
>> So, with local computer MTU 1460, everything works, but with MTU 1500, the problem persists.
> The difference is that you announce a smaller MSS in SYN segment you
> sent. This means that the peer can only send you smaller TCP segments.
>
> So there seems to be a problem if the peer sends too large TCP segments.
> That means that the peer must do PMTUD or TCP blackhole detection, not
> the local node.
>
> Best regards
> Michael
>

Hello Michael,

sysctl net.inet.tcp.path_mtu_discovery=1
sysctl net.inet.tcp.pmtud_blackhole_detection=1

With these settings, is the connection supposed to work even if an intermediate router is dropping the ICMP messages required for Path MTU Discovery? I tried this configuration, but it didn’t resolve the issue.

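For reference, the MSS values in the two client SYN segments captured above match the local MTU minus the IPv4 and TCP base headers. The following is a minimal sketch of that arithmetic, assuming IPv4 without IP options and a 20-byte base TCP header; the function name `advertised_mss` is hypothetical and only illustrates the calculation.

```python
# Assumption: IPv4 header without options and the fixed TCP header,
# 20 bytes each. TCP options are not subtracted from the advertised MSS.
IPV4_HEADER_BYTES = 20
TCP_HEADER_BYTES = 20


def advertised_mss(mtu: int) -> int:
    """Return the MSS a host would typically announce for a given interface MTU."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


# The two local MTU settings tried in the thread.
for mtu in (1460, 1500):
    print(f"MTU {mtu} -> MSS {advertised_mss(mtu)}")
```

With the local MTU at 1460 the client announces an MSS of 1420, so the server sends IP packets of at most 1460 bytes. With the local MTU at 1500 the announced MSS is 1460, and the server may send full 1500-byte packets.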