From nobody Tue Jun  8 22:32:19 2021
Content-Type: text/plain;
	charset=us-ascii
List-Id: Discussions about the use of FreeBSD-current <freebsd-current.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-current
List-Help: <mailto:freebsd-current+help@freebsd.org>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Subscribe: <mailto:freebsd-current+subscribe@freebsd.org>
List-Unsubscribe: <mailto:freebsd-current+unsubscribe@freebsd.org>
Sender: owner-freebsd-current@freebsd.org
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.100.0.2.22\))
Subject: Re: ssh connections break with "Fssh_packet_write_wait" on 13
 [SOLVED]
From: tuexen@freebsd.org
In-Reply-To: <202106082220.158MKu4f010441@gndrsh.dnsmgr.net>
Date: Wed, 9 Jun 2021 00:32:19 +0200
Cc: Michael Gmelin <freebsd@grem.de>,
 "freebsd-current@freebsd.org" <freebsd-current@FreeBSD.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <D910AC3F-F367-4B8D-9247-2A3553771829@freebsd.org>
References: <202106082220.158MKu4f010441@gndrsh.dnsmgr.net>
To: "Rodney W. Grimes" <freebsd-rwg@gndrsh.dnsmgr.net>
X-Mailer: Apple Mail (2.3654.100.0.2.22)

> On 9. Jun 2021, at 00:20, Rodney W. Grimes <freebsd-rwg@gndrsh.dnsmgr.net> wrote:
>
>>
>> On Thu, 3 Jun 2021 15:09:06 +0200
>> Michael Gmelin <freebsd@grem.de> wrote:
>>
>>> On Tue, 1 Jun 2021 13:47:47 +0200
>>> Michael Gmelin <freebsd@grem.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> Since upgrading servers from 12.2 to 13.0, I get
>>>>
>>>> Fssh_packet_write_wait: Connection to 1.2.3.4 port 22: Broken pipe
>>>>
>>>> consistently, usually after about 11 idle minutes, that's with and
>>>> without pf enabled. Client (11.4 in a VM) wasn't altered.
>>>>
>>>> Verbose logging (client and server side) doesn't show anything
>>>> special when the connection breaks. In the past, QoS problems
>>>> caused these disconnects, but I didn't see anything apparent
>>>> changing between 12.2 and 13 in this respect.
>>>>
>>>> I did a test on a newly commissioned server to rule out other
>>>> factors (so, same client connections, same routes, same
>>>> everything). On 12.2 before the update: the connection stays open
>>>> for hours. After the update (same server): the connection breaks
>>>> consistently after < 15 minutes (this is with unaltered
>>>> configurations, no *AliveInterval configured on either side of the
>>>> connection).
>>>
>>> I did a little bit more testing and realized that the problem goes
>>> away when I disable "Proportional Rate Reduction per RFC 6937" on the
>>> server side:
>>>
>>> sysctl net.inet.tcp.do_prr=0
>>>
>>> Keeping it on and enabling net.inet.tcp.do_prr_conservative doesn't
>>> fix the problem.
>>>
>>> This seems to be specific to Parallels. After some more digging, I
>>> realized that Parallels Desktop's NAT daemon (prl_naptd) handles
>>> keep-alive between the VM and the external server on its own. There is
>>> no direct communication between the client and the server. This means:
>>>
>>> - The NAT daemon starts sending keep-alive packets right away (not
>>> after the VM's net.inet.tcp.keepidle), every 75 seconds.
>>> - Keep-alive packets originating in the VM never reach the server.
>>> - Keep-alive packets originating on the server never reach the VM.
>>> - Client and server basically do keep-alive with the NAT daemon, not
>>> with each other.
>>>
>>> It also seems like Parallels is filtering the tos field (so it's
>>> always 0x00), but that's unrelated.
>>>
>>> I configured a bhyve VM running FreeBSD 11.4 on a separate laptop on
>>> the same network for comparison and it has no such issues.
>>>
>>> Looking at tcpdump output on the server, this is what a keep-alive
>>> packet sent by Parallels looks like:
>>>
>>> 10:14:42.449681 IP (tos 0x0, ttl 64, id 15689, offset 0, flags
>>> [none], proto TCP (6), length 40)
>>>   192.168.1.1.58222 > 192.168.1.2.22: Flags [.], cksum x (correct),
>>>   seq 2534, ack 3851, win 4096, length 0
>>>
>>> While those originating from the bhyve VM (after lowering
>>> net.inet.tcp.keepidle) look like this:
>>>
>>> 12:18:43.105460 IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF],
>>>   proto TCP (6), length 52)
>>>   192.168.1.3.57555 > 192.168.1.2.22: Flags [.], cksum x
>>>   (correct), seq 1780337696, ack 45831723, win 1026, options
>>>   [nop,nop,TS val 3003646737 ecr 3331923346], length 0
>>>
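
The two traces above differ in a way that is easy to overlook: the
Parallels segment has an IP length of 40 and no options, while the
bhyve segment has a length of 52 and carries [nop,nop,TS ...]. A
minimal sketch of that arithmetic follows. The function name
tcp_option_bytes is made up for illustration, and it assumes IPv4
without IP options (20 bytes), a 20 byte base TCP header and no
payload, which matches the "length 0" segments shown.

```shell
# Hypothetical helper, not part of any tool: bytes of TCP options in a
# zero-payload IPv4 segment, given the IP total length tcpdump prints.
# Assumes 20 bytes of IPv4 header and 20 bytes of base TCP header.
tcp_option_bytes() {
    echo $(( $1 - 20 - 20 ))
}

tcp_option_bytes 40   # Parallels keep-alive: prints 0, no room for a TS option
tcp_option_bytes 52   # bhyve keep-alive: prints 12 (nop + nop + 10 byte TS)
```

So the keep-alives coming from the NAT daemon cannot contain a
timestamp option at all, whatever was negotiated in the handshake.
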
>>> As written above, once net.inet.tcp.do_prr is disabled, keepalive
>>> seems to be working just fine. Otherwise, Parallels' NAT daemon kills
>>> the connection, as its keep-alive requests are not answered (well,
>>> that's what I think is happening):
>>>
>>> 10:19:43.614803 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
>>>   proto TCP (6), length 40)
>>>   192.168.1.1.58222 > 192.168.1.2.22: Flags [R.], cksum x (correct),
>>>   seq 2535, ack 3851, win 4096, length 0
>>>
>>> The easiest way to work around the problem on the client side is to
>>> configure ServerAliveInterval in ~/.ssh/config in the client VM.
>>>
>>> I'm curious though if this is basically a Parallels problem that has
>>> only been exposed by PRR being more correct (which is what I suspect),
>>> or if this is actually a FreeBSD problem.
>>>
>>
>> So, PRR probably was a red herring and the real reason this is happening
>> is that FreeBSD (since version 13[0]) by default discards packets
>> without timestamps for connections that had formally negotiated to have
>> them. This new behavior seems to be in line with RFC 7323, section
>> 3.2[1]:
>>
>>   "Once TSopt has been successfully negotiated, that is both <SYN> and
>>   <SYN,ACK> contain TSopt, the TSopt MUST be sent in every non-<RST>
>>   segment for the duration of the connection, and SHOULD be sent in an
>>   <RST> segment (see Section 5.2 for details)."
>>
>> As it turns out, macOS does not comply with this - it sends keep-alive
>> packets without a timestamp on connections that negotiated them.
>>
>> Under normal circumstances - ssh from macOS to a server running FreeBSD
>> 13 - this won't be noticed, since macOS uses the same default settings
>> as FreeBSD (2 hours idle time, 75 seconds intervals), so the server
>> side initiated keep-alive will save the connection before it has a
>> chance to break due to eight consecutive unanswered keep-alives at the
>> client side.
>>
>> This is different for ssh connections originating from a VM inside
>> Parallels, as connections created by prl_naptd will start sending tcp
>> keep-alives shortly after the connection becomes idle. As a result,
>> idle connections break after about 11 minutes of idle time (60s
>> + 8*75s = 660s == 11m), unless countermeasures are taken.
>>
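
The 11 minutes follow directly from the keep-alive parameters. A small
sketch of the calculation; keepalive_deadline is a name invented here,
and the inputs of the first call are the values quoted above (60s idle,
75s interval, 8 probes). The second call assumes that the macOS sysctls
used in the reproduction further down are in milliseconds, so 5000
means 5 seconds, and that macOS also gives up after 8 probes.

```shell
# Hypothetical helper: seconds of idle time until a peer drops a
# connection whose keep-alive probes are never answered.
#   $1 = idle time before the first probe (seconds)
#   $2 = interval between probes (seconds)
#   $3 = number of unanswered probes tolerated
keepalive_deadline() {
    echo $(( $1 + $2 * $3 ))
}

keepalive_deadline 60 75 8   # Parallels NAT daemon: prints 660 (11 minutes)
keepalive_deadline 5 5 8     # macOS with keepidle/keepintvl at 5s: prints 45
```

The second result matches the lower end of the 45-60 seconds reported
for the macOS reproduction.
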
>> An easy way to demonstrate the problem is to change the keep-alive
>> defaults on *macOS* using sysctl and then ssh to a FreeBSD 13 server:
>>
>>   $ sudo sysctl net.inet.tcp.keepidle=5000
>>   $ sudo sysctl net.inet.tcp.keepintvl=5000
>>   $ ssh -oTCPKeepAlive=yes myserver
>>
>> This way, the problem described can be reproduced quite easily:
>> Disconnect due to broken pipe after 45-60 seconds of idle time, tcpdump
>> confirming that keep-alive packets don't have tcp timestamps, while
>> they were used when negotiating the connection.
>>
>> There are various ways to work around the issue.
>>
>> Client side workarounds:
>> - Use ServerAlive* settings in ~/.ssh/config (ssh only)
>> - Tune net.inet.tcp.keep* sysctls on macOS (for all services)
>>
>> Server side workarounds:
>> - Use ClientAlive* settings in /etc/ssh/sshd_config (ssh only)
>> - Tolerate missing timestamps in packets using sysctl, which makes
>> FreeBSD 13 behave like previous versions did:
>>
>>   sysctl net.inet.tcp.tolerate_missing_ts=1
>>
>> The last option is probably the most practical one.
>>
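
For reference, a sketch of the first client side workaround and the
last server side workaround as configuration fragments. The host alias
myserver and the values 30 and 3 are only illustrative, chosen well
below the 11 minutes after which the connection was seen to break.

```
# ~/.ssh/config in the client VM
Host myserver
    ServerAliveInterval 30
    ServerAliveCountMax 3

# /etc/sysctl.conf on the FreeBSD 13 server, to keep the setting
# across reboots
net.inet.tcp.tolerate_missing_ts=1
```

The ssh setting only helps ssh sessions, while the sysctl covers every
TCP service on the server.
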
>> rscheff@ and tuexen@ (thank you!) were able to reproduce the issue and
>> reached out to Apple to see if there is something they can do to fix
>> this at their end (macOS) in the future.
>
> Can we please have the default of tolerate_missing_ts in
> current, stable/13 and an errata issued to releng_13 changing
> this value to =1 and staying that way until the buggy tcp
> stacks are found and eliminated.
The interesting part of your statement is that we find these stacks
now by using the value of 0. But seriously, I thought that reasonably
modern stacks would follow the RFC, which turns out not to be the case.
I'll change the value in current and stable/13. But I have no idea
about an errata. Will bring it up at the next transport call.

Best regards
Michael
>
>>
>> Best
>> Michael
>>
>> [0]https://cgit.freebsd.org/src/commit/?id=283c76c7c3f2f634f19f303a771a3f81fe890cab
>> [1]https://datatracker.ietf.org/doc/html/rfc7323#section-3.2
>>
>> -- 
>> Michael Gmelin
>>
>>
>
> -- 
> Rod Grimes                                                 rgrimes@freebsd.org
>