Date:      Sat, 22 Mar 2014 08:55:59 -0300
From:      Christopher Forgeron <csforgeron@gmail.com>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, Garrett Wollman <wollman@freebsd.org>, Jack Vogel <jfvogel@gmail.com>, Markus Gebert <markus.gebert@hostpoint.ch>
Subject:   Re: 9.2 ixgbe tx queue hang
Message-ID:  <CAB2_NwDRAxmnszh7jKKPfvxBdgaA9Z0CcJ9c1wSNncKb55Td5w@mail.gmail.com>
In-Reply-To: <CAB2_NwADUfs%2BbKV9QE_C4B1vchnzGWr1TK4C7wP8Fh8m94_mHA@mail.gmail.com>
References:  <CAB2_NwAzRyaHpvS=JXuksfsGmuYXJgYvMdD1tubWLgrDX4gdLw@mail.gmail.com> <1613242078.1214156.1395455976156.JavaMail.root@uoguelph.ca> <CAB2_NwADUfs%2BbKV9QE_C4B1vchnzGWr1TK4C7wP8Fh8m94_mHA@mail.gmail.com>

Status Update: Hopeful, but not done.

So the 9.2-STABLE ixgbe with Rick's TSO patch has been running all night
while iometer hammered away at it. It's got over 8 hours of test time on
it.

It's still running, the CPU queues are not clogged, and everything is
functional.

However, my ping_logger.py did record 23 incidents of "sendto: File too
large" over the 8 hour run.

That's really nothing compared to what I usually run into: normally I'd
have 23 incidents within a 5-minute span.
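
To put those two numbers side by side, here is a minimal sketch of the
counting and the rate arithmetic. It is not ping_logger.py itself (that
script isn't shown here); the function names and the assumption that the
log is a plain list of ping output lines are hypothetical.

```python
# Hypothetical helper, not the actual ping_logger.py.
# The error text is the one quoted in this message.
SENDTO_EFBIG = "sendto: File too large"


def count_incidents(lines):
    """Count log lines that contain the sendto error message."""
    return sum(1 for line in lines if SENDTO_EFBIG in line)


def incidents_per_hour(count, minutes):
    """Normalize an incident count over a test window to a per-hour rate."""
    return count * 60.0 / minutes


if __name__ == "__main__":
    patched = incidents_per_hour(23, 8 * 60)  # 23 incidents over the 8-hour run
    usual = incidents_per_hour(23, 5)         # 23 incidents within 5 minutes
    print("patched: %.3f/hour, usual: %.1f/hour, ratio: %.0fx"
          % (patched, usual, usual / patched))
```

With the figures above, that works out to roughly 2.9 incidents per hour
with the patch against 276 per hour without it, about a 96x reduction.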

During those 23 incidents (ping_logger.py triggers a cpuset ping), I see
the same symptoms of clogging on a few CPU cores. That clogging does go
away on its own, which matches what Markus says he sometimes experiences.

So I would say the TSO patch makes things remarkably better, but something
else is still up. Unfortunately, with the TSO patch in place it's now
harder to trigger the error, so testing will be more difficult.

Could someone confirm for me where netstat gets its "jumbo clusters
denied" and "mbufs denied" counters? Would they come from an m_defrag call
that fails?

I feel the netstat -m stats on boot are part of this issue - I was able to
greatly reduce them during one of my test iterations. I'm going to see if I
can repeat that with the TSO patch.
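
For comparing those counters between test iterations, a minimal sketch
that pulls the "denied" lines out of saved netstat -m output. It assumes
the lines look like "N/N/N requests for mbufs denied (...)" and
"N/N/N requests for jumbo clusters denied (...)"; the function name is
hypothetical and the sample numbers are made up for illustration, not
taken from this system.

```python
import re

# Assumed line shape: three slash-separated counts, then the counter name.
DENIED_RE = re.compile(
    r"^\s*(\d+)/(\d+)/(\d+) requests for (mbufs|jumbo clusters) denied"
)


def parse_denied(text):
    """Return {counter name: (count, count, count)} for each 'denied' line."""
    counters = {}
    for line in text.splitlines():
        match = DENIED_RE.match(line)
        if match:
            counters[match.group(4)] = tuple(
                int(match.group(i)) for i in (1, 2, 3)
            )
    return counters


if __name__ == "__main__":
    # Illustrative sample only; these are not real measurements.
    sample = (
        "0/12/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)\n"
        "0/3/0 requests for jumbo clusters denied (4k/9k/16k)\n"
    )
    print(parse_denied(sample))
```

Saving the output right after boot and again after a test run, then
diffing the two dictionaries, would show whether a given change actually
moved the counters.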

Getting this working on the 10-STABLE ixgbe:

Mike has contributed some edits (in a slightly different thread) that I
want to try on that driver. At the same time, a diff of 9.2 <-> 10.0 may
give hints, as the 10.0 driver with the TSO patch runs into issues quickly
and frequently... it's doing something that aggravates this condition.


Thanks for all the help. Please keep the suggestions or tidbits of info
coming.


