List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Subject: Re: ssh connections break with "Fssh_packet_write_wait" on 13 [SOLVED]
From: tuexen@freebsd.org
In-Reply-To: <202106082220.158MKu4f010441@gndrsh.dnsmgr.net>
Date: Wed, 9 Jun 2021 00:32:19 +0200
Cc: Michael Gmelin, "freebsd-current@freebsd.org"
References: <202106082220.158MKu4f010441@gndrsh.dnsmgr.net>
To: "Rodney W.
Grimes"

> On 9. Jun 2021, at 00:20, Rodney W. Grimes wrote:
>
>>
>> On Thu, 3 Jun 2021 15:09:06 +0200
>> Michael Gmelin wrote:
>>
>>> On Tue, 1 Jun 2021 13:47:47 +0200
>>> Michael Gmelin wrote:
>>>
>>>> Hi,
>>>>
>>>> Since upgrading servers from 12.2 to 13.0, I get
>>>>
>>>>     Fssh_packet_write_wait: Connection to 1.2.3.4 port 22: Broken pipe
>>>>
>>>> consistently, usually after about 11 idle minutes, that's with and
>>>> without pf enabled. Client (11.4 in a VM) wasn't altered.
>>>>
>>>> Verbose logging (client and server side) doesn't show anything
>>>> special when the connection breaks. In the past, QoS problems
>>>> caused these disconnects, but I didn't see anything apparent
>>>> changing between 12.2 and 13 in this respect.
>>>>
>>>> I did a test on a newly commissioned server to rule out other
>>>> factors (so, same client connections, same routes, same
>>>> everything). On 12.2 before the update: Connection stays open for
>>>> hours. After the update (same server): connections break
>>>> consistently after < 15 minutes (this is with unaltered
>>>> configurations, no *AliveInterval configured on either side of the
>>>> connection).
>>>
>>> I did a little bit more testing and realized that the problem goes
>>> away when I disable "Proportional Rate Reduction per RFC 6937" on the
>>> server side:
>>>
>>>     sysctl net.inet.tcp.do_prr=0
>>>
>>> Keeping it on and enabling net.inet.tcp.do_prr_conservative doesn't
>>> fix the problem.
>>>
>>> This seems to be specific to Parallels.
>>> After some more digging, I realized that Parallels Desktop's NAT
>>> daemon (prl_naptd) handles keep-alive between the VM and the
>>> external server on its own. There is no direct communication
>>> between the client and the server. This means:
>>>
>>> - The NAT daemon starts sending keep-alive packets right away (not
>>>   after the VM's net.inet.tcp.keepidle), every 75 seconds.
>>> - Keep-alive packets originating in the VM never reach the server.
>>> - Keep-alive packets originating on the server never reach the VM.
>>> - Client and server basically do keep-alive with the NAT daemon, not
>>>   with each other.
>>>
>>> It also seems like Parallels is filtering the tos field (so it's
>>> always 0x00), but that's unrelated.
>>>
>>> I configured a bhyve VM running FreeBSD 11.4 on a separate laptop on
>>> the same network for comparison and it has no such issues.
>>>
>>> Looking at tcpdump output on the server, this is what a keep-alive
>>> packet sent by Parallels looks like:
>>>
>>>     10:14:42.449681 IP (tos 0x0, ttl 64, id 15689, offset 0, flags
>>>     [none], proto TCP (6), length 40)
>>>     192.168.1.1.58222 > 192.168.1.2.22: Flags [.], cksum x (correct),
>>>     seq 2534, ack 3851, win 4096, length 0
>>>
>>> While those originating from the bhyve VM (after lowering
>>> net.inet.tcp.keepidle) look like this:
>>>
>>>     12:18:43.105460 IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF],
>>>     proto TCP (6), length 52)
>>>     192.168.1.3.57555 > 192.168.1.2.22: Flags [.], cksum x
>>>     (correct), seq 1780337696, ack 45831723, win 1026, options
>>>     [nop,nop,TS val 3003646737 ecr 3331923346], length 0
>>>
>>> As written above, once net.inet.tcp.do_prr is disabled, keepalive
>>> seems to be working just fine.
>>> Otherwise, Parallels' NAT daemon kills
>>> the connection, as its keep-alive requests are not answered (well,
>>> that's what I think is happening):
>>>
>>>     10:19:43.614803 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
>>>     proto TCP (6), length 40)
>>>     192.168.1.1.58222 > 192.168.1.2.22: Flags [R.], cksum x (correct),
>>>     seq 2535, ack 3851, win 4096, length 0
>>>
>>> The easiest way to work around the problem on the client side is to
>>> configure ServerAliveInterval in ~/.ssh/config in the client VM.
>>>
>>> I'm curious though if this is basically a Parallels problem that has
>>> only been exposed by PRR being more correct (which is what I suspect),
>>> or if this is actually a FreeBSD problem.
>>>
>>
>> So, PRR probably was a red herring and the real reason this is happening
>> is that FreeBSD (since version 13[0]) by default discards packets
>> without timestamps for connections that formally had negotiated to have
>> them. This new behavior seems to be in line with RFC 7323, section
>> 3.2[1]:
>>
>> "Once TSopt has been successfully negotiated, that is both <SYN> and
>> <SYN,ACK> contain TSopt, the TSopt MUST be sent in every non-<RST>
>> segment for the duration of the connection, and SHOULD be sent in an
>> <RST> segment (see Section 5.2 for details)."
>>
>> As it turns out, macOS violates exactly this requirement: it sends
>> keep-alive packets without a timestamp on connections that were
>> negotiated to have them.
>>
>> Under normal circumstances - ssh from macOS to a server running FreeBSD
>> 13 - this won't be noticed, since macOS uses the same default settings
>> as FreeBSD (2 hours idle time, 75-second intervals), so the server
>> side initiated keep-alive will save the connection before it has a
>> chance to break due to eight consecutive unanswered keep-alives at the
>> client side.
>>
>> This is different for ssh connections originating from a VM inside
>> Parallels, as connections created by prl_naptd will start sending TCP
>> keep-alives shortly after the connection becomes idle. As a result,
>> idle connections break after about 11 minutes of idle time (60s
>> + 8*75s = 660s == 11m), unless countermeasures are taken.
>>
>> An easy way to demonstrate the problem is to change keep-alive defaults
>> on *macOS* using sysctl and then ssh to a FreeBSD 13 server:
>>
>>     $ sudo sysctl net.inet.tcp.keepidle=5000
>>     $ sudo sysctl net.inet.tcp.keepintvl=5000
>>     $ ssh -oTCPKeepAlive=yes myserver
>>
>> This way, the problem described can be reproduced quite easily:
>> Disconnect due to broken pipe after 45-60 seconds of idle time, tcpdump
>> confirming that keep-alive packets don't have TCP timestamps, while
>> they were used when negotiating the connection.
>>
>> There are various ways to work around the issue.
>>
>> Client side workarounds:
>> - Use ServerAlive* settings in ~/.ssh/config (ssh only)
>> - Tune net.inet.tcp.keep* sysctls on macOS (for all services)
>>
>> Server side workarounds:
>> - Use ClientAlive* settings in /etc/ssh/sshd_config (ssh only)
>> - Tolerate missing timestamps in packets using sysctl, which makes
>>   FreeBSD 13 behave like previous versions did:
>>
>>     sysctl net.inet.tcp.tolerate_missing_ts=1
>>
>> The last option is probably the most practical one.
>>
>> rscheff@ and tuexen@ (thank you!) were able to reproduce the issue and
>> reached out to Apple to see if there is something they can do to fix
>> this at their end (macOS) in the future.
>
> Can we please have the default of tolerate_missing_ts in
> current, stable/13 and an errata issued to releng_13 changing
> this value to =1 and staying that way until the buggy tcp
> stacks are found and eliminated.

The interesting part of your statement is that we find these stacks
now by using the value of 0.
But seriously, I thought that reasonably modern stacks would follow
the RFC, which is not the case.

I'll change the value in current and stable/13. But I have no idea
about an errata. Will bring it up at the next transport call.

Best regards
Michael
>
>>
>> Best
>> Michael
>>
>> [0] https://cgit.freebsd.org/src/commit/?id=283c76c7c3f2f634f19f303a771a3f81fe890cab
>> [1] https://datatracker.ietf.org/doc/html/rfc7323#section-3.2
>>
>> --
>> Michael Gmelin
>>
>>
>
> --
> Rod Grimes                                      rgrimes@freebsd.org
>
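
The timing figures quoted in this thread (60s + 8*75s = 660s for Parallels, and 45-60 seconds for the macOS reproduction) can be checked with a small calculation. This is only a sketch: the function name is made up for illustration, the probe count of 8 is taken from the "eight consecutive unanswered keep-alives" mentioned above, and reading the macOS sysctl values of 5000 as milliseconds is an assumption that happens to agree with the observed 45-60 seconds.

```python
def idle_seconds_until_drop(keepidle_s, keepintvl_s, keepcnt=8):
    """Estimate how long an idle connection survives when no keep-alive
    probe is ever answered (hypothetical helper, not a system API).

    keepidle_s  -- idle time before the first probe is sent, in seconds
    keepintvl_s -- interval between unanswered probes, in seconds
    keepcnt     -- probes sent before giving up (8 is assumed from the thread)
    """
    return keepidle_s + keepcnt * keepintvl_s


# Parallels NAT daemon case from the thread: 60 s idle, 75 s interval.
parallels = idle_seconds_until_drop(60, 75)

# macOS reproduction: keepidle=5000 and keepintvl=5000, assumed to be
# milliseconds, i.e. 5 s each.
macos_repro = idle_seconds_until_drop(5000 / 1000, 5000 / 1000)

print(f"Parallels: {parallels} s ({parallels / 60:g} min)")
print(f"macOS repro: {macos_repro:g} s")
```

The second figure is the lower bound of the reported 45-60 second range; the remainder would be whatever idle time passes before the last real data exchange is counted as idle.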