Date: Fri, 21 Mar 2014 22:39:36 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Christopher Forgeron <csforgeron@gmail.com>
Cc: FreeBSD Net <freebsd-net@freebsd.org>, Garrett Wollman <wollman@freebsd.org>, Jack Vogel <jfvogel@gmail.com>, Markus Gebert <markus.gebert@hostpoint.ch>
Subject: Re: 9.2 ixgbe tx queue hang
Message-ID: <1613242078.1214156.1395455976156.JavaMail.root@uoguelph.ca>
In-Reply-To: <CAB2_NwAzRyaHpvS=JXuksfsGmuYXJgYvMdD1tubWLgrDX4gdLw@mail.gmail.com>
------=_Part_1214154_1974091844.1395455976155
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Christopher Forgeron wrote:
> It may be a little early, but I think that's it!
>
> It's been running without error for nearly an hour - It's very rare it
> would go this long under this much load.
>
> I'm going to let it run longer, then abort and install the kernel with the
> extra printfs so I can see what value ifp->if_hw_tsomax is before you set
> it.
>
I think you'll just find it set to 0. Code in if_attach_internal()
{ in sys/net/if.c } sets it to IP_MAXPACKET (which is 65535) if it is 0.
In other words, if the if_attach routine in the driver doesn't set it,
this code sets it to the maximum possible value.

Here's the snippet:
	/* Initialize to max value. */
	if (ifp->if_hw_tsomax == 0)
		ifp->if_hw_tsomax = IP_MAXPACKET;

Anyhow, this sounds like progress.

As far as NFS is concerned, I'd rather set it to a smaller value
(maybe 56K) so that m_defrag() doesn't need to be called, but I
suspect others wouldn't like this.

Hopefully Jack can decide if this patch is ok?

Thanks yet again for doing this testing, rick
ps: I've attached it again, so Jack (and anyone else who reads this)
    can look at it.
pps: Please report if it keeps working for you.

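[Editorial sketch, not part of the original message.] The defaulting behaviour described above can be illustrated in isolation. This is a minimal userland sketch, not the kernel code: `struct ifnet_sketch` and `tsomax_apply_default()` are hypothetical names invented for the illustration, and `IP_MAXPACKET` is defined locally with the value 65535 quoted in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Value quoted in the message; defined locally so the sketch stands alone. */
#define IP_MAXPACKET 65535u

/* Hypothetical stand-in for the single struct ifnet field discussed here. */
struct ifnet_sketch {
	unsigned int if_hw_tsomax;	/* 0 means the driver did not set a limit */
};

/* Mirrors the quoted check: a zero value is replaced by the maximum. */
static void
tsomax_apply_default(struct ifnet_sketch *ifp)
{
	assert(ifp != NULL);
	if (ifp->if_hw_tsomax == 0)
		ifp->if_hw_tsomax = IP_MAXPACKET;
}
```

A driver that leaves the field at 0 ends up with 65535, while a driver that sets a smaller limit before attach keeps its own value.
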
> It still had netstat -m denied entries on boot, but they are not climbing
> like they did before:
>
>
> $ uptime
> 9:32PM up 25 mins, 4 users, load averages: 2.43, 6.15, 4.65
> $ netstat -m
> 21556/7034/28590 mbufs in use (current/cache/total)
> 4080/3076/7156/6127254 mbuf clusters in use (current/cache/total/max)
> 4080/2281 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/53/53/3063627 4k (page size) jumbo clusters in use (current/cache/total/max)
> 16444/118/16562/907741 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/510604 16k jumbo clusters in use (current/cache/total/max)
> 161545K/9184K/170729K bytes allocated to network (current/cache/total)
> 17972/2230/4111 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 35/8909/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
>
> - Started off bad with the 9k denials, but it's not going up!
>
> uptime
> 10:20PM up 1:13, 6 users, load averages: 2.10, 3.15, 3.67
> root@SAN0:/usr/home/aatech # netstat -m
> 21569/7141/28710 mbufs in use (current/cache/total)
> 4080/3308/7388/6127254 mbuf clusters in use (current/cache/total/max)
> 4080/2281 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/53/53/3063627 4k (page size) jumbo clusters in use (current/cache/total/max)
> 16447/121/16568/907741 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/510604 16k jumbo clusters in use (current/cache/total/max)
> 161575K/9702K/171277K bytes allocated to network (current/cache/total)
> 17972/2261/4111 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 35/8913/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
>
> This is the 9.2 ixgbe that I'm patching into 10.0, I'll move into the base
> 10.0 code tomorrow.
>
>
> On Fri, Mar 21, 2014 at 8:44 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
> > Christopher Forgeron wrote:
> > >
> > > Hello all,
> > >
> > > I ran Jack's ixgbe MJUM9BYTES removal patch, and let iometer hammer
> > > away at the NFS store overnight - But the problem is still there.
> > >
> > > From what I read, I think the MJUM9BYTES removal is probably good
> > > cleanup (as long as it doesn't trade performance on a lightly memory
> > > loaded system for performance on a heavily memory loaded system). If
> > > I can stabilize my system, I may attempt those benchmarks.
> > >
> > > I think the fix will be obvious at boot for me - My 9.2 has a 'clean'
> > > netstat - Until I can boot and see a 'netstat -m' that looks similar
> > > to that, I'm going to have this problem.
> > >
> > > Markus: Do your systems show denied mbufs at boot like mine does?
> > >
> > > Turning off TSO works for me, but at a performance hit.
> > >
> > > I'll compile Rick's patch (and extra debugging) this morning and let
> > > you know soon.
> > >
> > > On Thu, Mar 20, 2014 at 11:47 PM, Christopher Forgeron <
> > > csforgeron@gmail.com > wrote:
> > >
> > > BTW - I think this will end up being a TSO issue, not the patch that
> > > Jack applied.
> > >
> > > When I boot Jack's patch (MJUM9BYTES removal) this is what netstat -m
> > > shows:
> > >
> > > 21489/2886/24375 mbufs in use (current/cache/total)
> > > 4080/626/4706/6127254 mbuf clusters in use (current/cache/total/max)
> > > 4080/587 mbuf+clusters out of packet secondary zone in use (current/cache)
> > > 16384/50/16434/3063627 4k (page size) jumbo clusters in use (current/cache/total/max)
> > > 0/0/0/907741 9k jumbo clusters in use (current/cache/total/max)
> > > 0/0/0/510604 16k jumbo clusters in use (current/cache/total/max)
> > > 79068K/2173K/81241K bytes allocated to network (current/cache/total)
> > > 18831/545/4542 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> > > 15626/0/0 requests for jumbo clusters denied (4k/9k/16k)
> > > 0 requests for sfbufs denied
> > > 0 requests for sfbufs delayed
> > > 0 requests for I/O initiated by sendfile
> > >
> > > Here is an un-patched boot:
> > >
> > > 21550/7400/28950 mbufs in use (current/cache/total)
> > > 4080/3760/7840/6127254 mbuf clusters in use (current/cache/total/max)
> > > 4080/2769 mbuf+clusters out of packet secondary zone in use (current/cache)
> > > 0/42/42/3063627 4k (page size) jumbo clusters in use (current/cache/total/max)
> > > 16439/129/16568/907741 9k jumbo clusters in use (current/cache/total/max)
> > > 0/0/0/510604 16k jumbo clusters in use (current/cache/total/max)
> > > 161498K/10699K/172197K bytes allocated to network (current/cache/total)
> > > 18345/155/4099 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> > > 3/3723/0 requests for jumbo clusters denied (4k/9k/16k)
> > > 0 requests for sfbufs denied
> > > 0 requests for sfbufs delayed
> > > 0 requests for I/O initiated by sendfile
> > >
> > > See how removing the MJUM9BYTES is just pushing the problem from the
> > > 9k jumbo cluster into the 4k jumbo cluster?
> > >
> > > Compare this to my FreeBSD 9.2 STABLE machine from ~ Dec 2013 : Exact
> > > same hardware, revisions, zpool size, etc. Just it's running an
> > > older FreeBSD.
> > >
> > > # uname -a
> > > FreeBSD SAN1.XXXXX 9.2-STABLE FreeBSD 9.2-STABLE #0: Wed Dec 25
> > > 15:12:14 AST 2013 aatech@FreeBSD-Update
> > > Server:/usr/obj/usr/src/sys/GENERIC amd64
> > >
> > > root@SAN1:/san1 # uptime
> > > 7:44AM up 58 days, 38 mins, 4 users, load averages: 0.42, 0.80, 0.91
> > >
> > > root@SAN1:/san1 # netstat -m
> > > 37930/15755/53685 mbufs in use (current/cache/total)
> > > 4080/10996/15076/524288 mbuf clusters in use (current/cache/total/max)
> > > 4080/5775 mbuf+clusters out of packet secondary zone in use (current/cache)
> > > 0/692/692/262144 4k (page size) jumbo clusters in use (current/cache/total/max)
> > > 32773/4257/37030/96000 9k jumbo clusters in use (current/cache/total/max)
> > > 0/0/0/508538 16k jumbo clusters in use (current/cache/total/max)
> > > 312599K/67011K/379611K bytes allocated to network (current/cache/total)
> > > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> > > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> > > 0/0/0 sfbufs in use (current/peak/max)
> > > 0 requests for sfbufs denied
> > > 0 requests for sfbufs delayed
> > > 0 requests for I/O initiated by sendfile
> > > 0 calls to protocol drain routines
> > >
> > > Lastly, please note this link:
> > >
> > > http://lists.freebsd.org/pipermail/freebsd-net/2012-October/033660.html
> > >
> > Hmm, this mentioned the ethernet header being in the TSO segment. I think
> > I already mentioned my TCP/IP is rusty and I know diddly about TSO.
> > However, at a glance it does appear the driver uses ether_output() for
> > TSO segments and, as such, I think an ethernet header is prepended to the
> > TSO segment. (This makes sense, since how else would the hardware know
> > what ethernet header to use for the TCP segments generated.)
> >
> > I think prepending the ethernet header could push the total length
> > over 64K, given a default if_hw_tsomax == IP_MAXPACKET. And over 64K
> > isn't going to fit in 32 * 2K (mclbytes) clusters, etc and so forth.
> >
> > Anyhow, I think the attached patch will reduce if_hw_tsomax, so that
> > the result should fit in 32 clusters and avoid EFBIG for this case,
> > so it might be worth a try?
> > (I still can't think of why the CSUM_TSO bit isn't set for the printf()
> > case, but it seems TSO segments could generate EFBIG errors.)
> >
> > Maybe worth a try, rick
> >
> > > It's so old that I assume the TSO leak that he speaks of has been
> > > patched, but perhaps not. More things to look into tomorrow.
> > >
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to
> > "freebsd-net-unsubscribe@freebsd.org"
> >
>

------=_Part_1214154_1974091844.1395455976155
Content-Type: text/x-patch; name=ixgbe.patch
Content-Disposition: attachment; filename=ixgbe.patch
Content-Transfer-Encoding: base64

LS0tIGRldi9peGdiZS9peGdiZS5jLnNhdgkyMDE0LTAzLTE5IDE3OjQ0OjM0LjAwMDAwMDAwMCAt
MDQwMAorKysgZGV2L2l4Z2JlL2l4Z2JlLmMJMjAxNC0wMy0yMSAxOToyNTo0Ni4wMDAwMDAwMDAg
LTA0MDAKQEAgLTI2MTQsNiArMjYxNCw5IEBAIGl4Z2JlX3NldHVwX2ludGVyZmFjZShkZXZpY2Vf
dCBkZXYsIHN0cnUKIAlpZnAtPmlmX3NuZC5pZnFfZHJ2X21heGxlbiA9IGFkYXB0ZXItPm51bV90
eF9kZXNjIC0gMjsKIAlJRlFfU0VUX1JFQURZKCZpZnAtPmlmX3NuZCk7CiAjZW5kaWYKKwlpZiAo
KGFkYXB0ZXItPm51bV9zZWdzICogTUNMQllURVMgLSBFVEhFUl9IRFJfTEVOKSA8IElQX01BWFBB
Q0tFVCkKKwkJaWZwLT5pZl9od190c29tYXggPSBhZGFwdGVyLT5udW1fc2VncyAqIE1DTEJZVEVT
IC0KKwkJICAgIEVUSEVSX0hEUl9MRU47CiAKIAlldGhlcl9pZmF0dGFjaChpZnAsIGFkYXB0ZXIt
Pmh3Lm1hYy5hZGRyKTsKIAo=
------=_Part_1214154_1974091844.1395455976155--
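
[Editorial sketch, not part of the original message.] The arithmetic behind the quoted reasoning ("over 64K isn't going to fit in 32 * 2K clusters") can be checked with a small userland program. The attached ixgbe.patch, once base64-decoded, appears to set if_hw_tsomax to `adapter->num_segs * MCLBYTES - ETHER_HDR_LEN` whenever that is below IP_MAXPACKET; the sketch mirrors that check. Assumptions: the helper names `clusters_needed()` and `tsomax_clamp()` are hypothetical, the constants are defined locally (2048-byte clusters, 14-byte Ethernet header), a segment count of 32 is taken from the "32 clusters" figure in the thread, and the mbuf chain is treated as perfectly packed, which a real chain need not be.

```c
#include <assert.h>

/* Local copies of the constants discussed in the thread (assumed values). */
#define IP_MAXPACKET	65535u
#define MCLBYTES	2048u	/* the "2K" cluster size */
#define ETHER_HDR_LEN	14u	/* standard Ethernet header length */

/* Number of MCLBYTES-sized clusters a perfectly packed frame of len bytes needs. */
static unsigned int
clusters_needed(unsigned int len)
{
	return (len + MCLBYTES - 1) / MCLBYTES;
}

/*
 * Mirrors the check in the attached patch: if num_segs clusters minus the
 * Ethernet header is smaller than IP_MAXPACKET, use that as the TSO limit.
 */
static unsigned int
tsomax_clamp(unsigned int tsomax, unsigned int num_segs)
{
	unsigned int limit;

	assert(num_segs > 0);
	limit = num_segs * MCLBYTES - ETHER_HDR_LEN;
	if (limit < IP_MAXPACKET)
		tsomax = limit;
	return tsomax;
}
```

With 32 segments, a 65535-byte TSO payload plus the 14-byte header is 65549 bytes and would need 33 clusters; the clamp lowers the limit to 65522, so payload plus header is exactly 65536 bytes and fits in 32.
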