Date: Mon, 27 Jul 2020 15:41:50 -0400
From: Joe Clarke <jclarke@marcuscom.com>
To: Mark Johnston <markj@freebsd.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: Traffic "corruption" in 12-stable
Message-ID: <2F974A4E-95B3-4C65-A5F8-6FBBB575B756@marcuscom.com>
In-Reply-To: <20200727190147.GC59953@raichu>
References: <9FAE54DE-F409-4A53-B91E-59AE52A86513@marcuscom.com> <20200727190147.GC59953@raichu>

> On Jul 27, 2020, at 15:01, Mark Johnston <markj@freebsd.org> wrote:
>
> On Sun, Jul 26, 2020 at 06:16:07PM -0400, Joe Clarke wrote:
>> About two weeks ago, I upgraded from the latest 11-stable to the latest 12-stable. After that, I periodically see the network throughput come to a near standstill. This FreeBSD machine is an ESXi VM with two interfaces. It acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It runs ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a tap0 L2 VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN side) uses the default 1500.
>>
>> Besides seeing massive packet loss and huge latency (~ 200 ms for on-LAN ping times), I know the problem has occurred because my lldpd reports:
>>
>> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>>
>> And if I turn on ipfw verbose messages, I see tons of:
>>
>> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>>
>> This leads me to believe packets are being corrupted on ingress. I’ve applied all the recent iflib changes, but the problem persists. What causes it, I don’t know.
>>
>> The only thing that changed (and yes, it’s a big one) is I upgraded to 12-stable. Meaning, the rest of the network infra and topology has remained the same. This did not happen at all in 11-stable.
>>
>> I’m open to suggestions.
>
> There are some fixes for vmx not present in stable/12 (yet). I did a
> merge of a number of outstanding revisions. Would you be able to test
> the patch? I haven't observed any problems with it on a host using igb,
> but I have no ability to test vmx at the moment.

I’m down to test anything. I did notice quite a few vmxnet3 changes around performance that appealed to me. I tried a few of them on my last kernel. That took much longer to exhibit the problem, but eventually did.

I can tell you I don’t have all of these patches in, though. I’ll build with this diff and start running it now. I’ll let you know how it goes.

Thanks!

Joe

---
PGP Key : http://www.marcuscom.com/pgp.asc