From: Joe Clarke
To: Eugene Grosbein
Cc: freebsd-stable@freebsd.org
Subject: Re: Traffic "corruption" in 12-stable
Date: Mon, 27 Jul 2020 08:36:25 -0400

> On Jul 27, 2020, at 01:00, Eugene Grosbein wrote:
>
> 27.07.2020 5:16, Joe Clarke wrote:
>
>> About two weeks ago, I upgraded from the latest 11-stable to the latest 12-stable. After that, I periodically see the network throughput come to a near standstill. This FreeBSD machine is an ESXi VM with two interfaces. It acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It runs ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a tap0 L2 VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN side) uses the default 1500.
>>
>> Besides seeing massive packet loss and huge latency (~200 ms for on-LAN ping times), I know the problem has occurred because my lldpd reports:
>>
>> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>>
>> And if I turn on ipfw verbose messages, I see tons of:
>>
>> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>>
>> This leads me to believe packets are being corrupted on ingress. I’ve applied all the recent iflib changes, but the problem persists. What causes it, I don’t know.
>>
>> The only thing that changed (and yes, it’s a big one) is I upgraded to 12-stable. Meaning, the rest of the network infra and topology has remained the same. This did not happen at all in 11-stable.
>>
>> I’m open to suggestions.
>
> First, try: ifconfig $ifname -rxcsum -txcsum

Thanks for the suggestion. I should have mentioned I’ve been initializing these two interfaces since 11-stable with:

ifconfig_vmx0="up mtu 9000 -tso -lro -vlanhwtso -rxcsum -txcsum -rxcsum6 -txcsum6 -tso4 -tso6 -vlanhwcsum"
ifconfig_vmx1="DHCP -tso -lro -vlanhwtso -rxcsum -txcsum -rxcsum6 -txcsum6 -tso4 -tso6 -vlanhwcsum"

And I’m running:

FreeBSD namale.marcuscom.com 12.1-STABLE FreeBSD 12.1-STABLE NAMALE amd64 1201520 1201520

I most recently built this yesterday, but the previous kernel that exhibited the problem was built about a week ago. It had the fragment fixes for iflib.c.

Joe

---
PGP Key : http://www.marcuscom.com/pgp.asc