From: Rick Macklem
To: Christopher Forgeron
Cc: freebsd-net@freebsd.org, Jack Vogel, Markus Gebert
Subject: Re: 9.2 ixgbe tx queue hang
Date: Thu, 20 Mar 2014 22:25:51 -0400 (EDT)
Message-ID: <477992488.642193.1395368751685.JavaMail.root@uoguelph.ca>
List-Id: Networking and TCP/IP with FreeBSD

Christopher Forgeron wrote:
>
> On Thu, Mar 20, 2014 at 7:40 AM, Markus Gebert <
> markus.gebert@hostpoint.ch > wrote:
>
> Possible. We still see this on nfs clients only, but I'm not convinced
> that nfs is the only trigger.
>
Since Christopher is getting a bunch of the "before" printf()s from my
patch, it indicates that a packet/TSO segment that is > 65535 bytes in
length is showing up at ixgbe_xmit(). I've asked him to add a printf()
for the m_pkthdr.csum_flags field to see whether it really is a TSO
segment.

If it is a TSO segment, that indicates to me that the code in
tcp_output() that should generate a TSO segment no greater than 65535
bytes in length is busted. And this would imply that just about any app
doing large sosend()s could cause this, I think? (NFS read replies/write
requests of 64K would be one of them.)

rick

>
> Just to clarify, I'm experiencing this error with NFS, but also with
> iSCSI - I turned off my NFS server in rc.conf and rebooted, and I'm
> still able to create the error. This is not just an NFS issue on my
> machine.
>
> In our case, when it happens, the problem persists for quite some time
> (minutes or hours) if we don't interact (ifconfig or reboot).
>
> The first few times that I ran into it, I had similar issues, because
> I was keeping my system up and treating it like a temporary problem.
> The worst case scenario resulted in reboots to reset the NIC. Then
> again, I find the ix's to be cranky if you ifconfig them too much.
>
> Now, I'm trying to find a root cause, so as soon as I start seeing
> any errors, I abort and reboot the machine to test the next theory.
>
> Additionally, I'm often able to create the problem with just one VM
> running iometer on the SAN storage. When the problem occurs, that
> connection is broken temporarily, taking network load off the SAN -
> that may improve my chances of keeping this running.
>
> > I am able to reproduce it fairly reliably within 15 min of a reboot
> > by loading the server via NFS with iometer and some large NFS file
> > copies at the same time. I seem to need to sustain ~2 Gbps for a
> > few minutes.
>
> That's probably why we can't reproduce it reliably here. Although our
> blade servers have 10gig cards, the ones affected are connected to a
> 1gig switch.
>
> It seems that it needs a lot of traffic. I have a 10 gig backbone
> between my SANs and my ESXi machines, so I can saturate quite quickly
> (just now I hit a record: the error occurred within ~5 min of reboot
> and testing). In your case, I recommend firing up multiple VMs
> running iometer on different 1 gig connections and seeing if you can
> make it pop. I also often turn off ix1 to drive all traffic through
> ix0 - I've noticed it happens faster this way, but once again I'm not
> taking enough observations to make decent time predictions.
>
> Can you try this when the problem occurs?
>
> for CPU in {0..7}; do echo "CPU${CPU}"; cpuset -l ${CPU} ping -i 0.2
> -c 2 -W 1 10.0.0.1 | grep sendto; done
>
> It will tie ping to certain CPUs to test the different tx queues of
> your ix interface. If the pings reliably fail only on some queues,
> then your problem is more likely to be the same as ours.
>
> Also, if you have dtrace available:
>
> kldload dtraceall
> dtrace -n 'fbt:::return / arg1 == EFBIG && execname == "ping" / {
> stack(); }'
>
> while you run pings over the affected interface. This will give you
> hints about where the EFBIG error comes from.
>
> > […]
>
> Markus
>
> Will do. I'm not sure what shell the first script was written for;
> it's not working in csh. Here's a rewrite that does work in csh, in
> case others are using the default shell:
>
> #!/bin/csh
> foreach CPU (`seq 0 23`)
>   echo "CPU$CPU";
>   cpuset -l $CPU ping -i 0.2 -c 2 -W 1 10.0.0.1 | grep sendto;
> end
>
> Thanks for your input. I should have results to post to the list
> shortly.
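
A rough sketch of the diagnostic Rick describes above, i.e. a printf()
in ixgbe_xmit() that reports m_pkthdr.len and m_pkthdr.csum_flags for
oversized frames. This is illustrative only, not his actual patch; it
assumes the ixgbe_xmit() context in sys/dev/ixgbe/ixgbe.c, where m_head
is the mbuf chain being queued and sys/mbuf.h (which defines CSUM_TSO)
is already included:

	/*
	 * Illustration only -- not the actual patch.  Report any mbuf
	 * chain longer than 64K handed to the driver, along with its
	 * checksum flags, so it can be confirmed as a TSO segment.
	 */
	if (m_head->m_pkthdr.len > 65535)
		printf("ixgbe_xmit: oversized chain: len=%d csum_flags=0x%x tso=%d\n",
		    m_head->m_pkthdr.len, m_head->m_pkthdr.csum_flags,
		    (m_head->m_pkthdr.csum_flags & CSUM_TSO) != 0);

If tso prints as 1 for the oversized frames, that would confirm they are
TSO segments coming down from the stack rather than something the driver
built itself.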
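
The invariant Rick expects tcp_output() to maintain can be sketched as
follows, purely as an illustration of the idea and not the actual
FreeBSD code; len, hdrlen and sendalot are stand-ins for the payload
length of the current pass, the IP+TCP header length, and the "send
another segment" flag, and IP_MAXPACKET is the 65535-byte limit from
netinet/ip.h:

	/*
	 * Illustration of the 64K TSO invariant only -- not tcp_output().
	 * When TSO is in use, limit the payload of this pass so that
	 * payload + headers never exceeds IP_MAXPACKET (65535) bytes;
	 * anything left over goes out on a later pass.
	 */
	if (tso && len > IP_MAXPACKET - hdrlen) {
		len = IP_MAXPACKET - hdrlen;
		sendalot = 1;
	}

If segments larger than that are nevertheless reaching ixgbe_xmit(),
then either some path skips this clamp or the chain grows after it is
applied, which is what Rick means by the tcp_output() code being busted.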