From owner-freebsd-net@FreeBSD.ORG Wed Feb 9 18:45:57 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 110081065673 for ; Wed, 9 Feb 2011 18:45:57 +0000 (UTC) (envelope-from tuexen@freebsd.org) Received: from mail-n.franken.de (drew.ipv6.franken.de [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa]) by mx1.freebsd.org (Postfix) with ESMTP id 1D6188FC0A for ; Wed, 9 Feb 2011 18:45:56 +0000 (UTC) Received: from [192.168.1.113] (p508FA862.dip.t-dialin.net [80.143.168.98]) (Authenticated sender: macmic) by mail-n.franken.de (Postfix) with ESMTP id 884611C0C0BD8; Wed, 9 Feb 2011 19:45:53 +0100 (CET) Mime-Version: 1.0 (Apple Message framework v1082) Content-Type: text/plain; charset=us-ascii From: Michael Tuexen In-Reply-To: Date: Wed, 9 Feb 2011 19:45:52 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <12838373-FE96-443E-8979-AF5408705BF0@freebsd.org> References: To: Jack Vogel X-Mailer: Apple Mail (2.1082) Cc: Karim Fodil-Lemelin , Pyun YongHyeon , freebsd-net@freebsd.org, beezarliu Subject: Re: igb driver RX (was TX) hangs when out of mbuf clusters X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Feb 2011 18:45:57 -0000 On Feb 9, 2011, at 6:35 PM, Jack Vogel wrote: > OK, but the question is why does the ring get totally consumed this = way, the > ring has 1024 descriptors, it seems unintuitive that that whole = quantity can be > used without some being recharged. Do you see the system mbuf pool = being > depleted at the same time? That was the test case I created: I set up a server accepting = connections but not reading anything. So the driver passes the mbufs to the = transport stack and they are not consumed. Then the problem occurs. Then I kill = the server. Now there are mbufs available again, but the driver doesn't = know. I had the impression that these were the circumstances in which the = problem showed up (mbuf allocations failing). >=20 > Since you can reproduce it, do me a favor, in rxeof, change the = processed > value from 8 to 4 and then 1, effectively call refresh every = descriptor, see if > that eliminates the issue. I will do. Need to see if I can do it remotely, since I'm not in my lab right now. Can do it tomorrow for sure. But I do not think that this solves the problem, since I did the things very slowly and you call it at least when you are leaving rxeof. Best regards Michael >=20 > Thanks for your help, >=20 > Jack >=20 >=20 > On Wed, Feb 9, 2011 at 2:36 AM, Michael Tuexen = wrote: > Hi Jack, >=20 > I could recreate the problem. When the problem occurs, we see >=20 > rx_nxt_check =3D n > rx_nxt_refresh =3D n + 1 >=20 > (This was also reported in a mail from Karim) >=20 > This means that the *whole* receive ring has no buffers anymore. This = can > occur if, for some amount of time, no clusters are available. >=20 > Now outside of the driver, at some point of time, clusters are freed. > I don't think that igb_refresh_mbufs() gets called, since it only gets > called from igb_rxeof(), which gets called when a packet has been = received, > which can not happen since the receive ring is empty. So how can the = driver > know? I have no idea. Maybe we can periodically check for such an = event > and call igb_refresh_mbufs(). >=20 > Does this make sense to you? >=20 > Best regards > Michael >=20 >=20 > On Feb 9, 2011, at 8:32 AM, Jack Vogel wrote: >=20 > > Hmmm, well so much for that theory :) > > > > Jack > > > > > > On Tue, Feb 8, 2011 at 4:06 PM, Karim Fodil-Lemelin = wrote: > > > > > > 2011/2/8 Jack Vogel > > > > > > I have been following this, and thinking about it. I still am = working from a theoretical > > standpoint, but based on a patch I got quite a long time back and = never quite groked, > > I believe now that I might have a solution. > > > > The original PR and patch was kern/150516 from Beezar Liu, I was = never quite comfortable > > with the code changes, nor convinced that it was a real issue and = not a misunderstanding. > > However I think now that this very report might be behind what we = are seeing today. I have > > a slightly different approach to solving it, of course it remains to = be seen if it handles it > > properly. > > > > Please try the patch I've attached, I'm open to further correction = or polishing of the > > changes. And thanks to Beezar for his original report and changes, = this is not for em, > > but if this eliminates the problem its clearly needed in all = drivers. > > > > Jack > > > > > > Hi Jack, > > > > Thanks for your help. I tried your patch and it didn't work so I = added a couple of printf to see if the added code was getting hit: > > > > --- a/freebsd/sys/dev/e1000/if_igb.c > > --More--(byte 1253)+++ b/freebsd/sys/dev/e1000/if_igb.c > > @@ -612,7 +612,7 @@ igb_attach(device_t dev) > > device_get_nameunit(dev)); > > > > INIT_DEBUGOUT("igb_attach: end"); > > - > > + printf("this driver has a patch from Jack Vogel\n"); > > return (0); > > > > err_late: > > @@ -4131,6 +4131,7 @@ igb_rxeof(struct igb_queue *que, int count, = int *done) > > struct mbuf *sendmp, *mh, *mp; > > struct igb_rx_buf *rxbuf; > > u16 hlen, plen, hdr, vtag; > > + int commit; > > bool eop =3D FALSE; > > > > cur =3D &rxr->rx_base[i]; > > @@ -4255,10 +4256,23 @@ next_desc: > > bus_dmamap_sync(rxr->rxdma.dma_tag, = rxr->rxdma.dma_map, > > BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); > > > > + commit =3D i; /* capture the old index */ > > + > > /* Advance our pointers to the next descriptor. */ > > if (++i =3D=3D adapter->num_rx_desc) > > i =3D 0; > > /* > > + ** Sanity test for ring full, if this > > + ** happens we need to refresh immediately > > + ** or refresh may deadlock. > > + */ > > + if (i =3D=3D rxr->next_to_refresh) { > > + igb_refresh_mbufs(rxr, commit); > > + printf("igb_refresh_mbufs called with commit = %d\n", commit); > > + processed =3D 0; > > + } > > + > > + /* > > ** Send to the stack or LRO > > */ > > if (sendmp !=3D NULL) { > > > > Here is the results: > > > > # dmesg | grep Vogel > > this driver has a patch from Jack Vogel > > this driver has a patch from Jack Vogel > > > > # netstat -m > > 60453/52707/113160 mbufs in use (current/cache/total) > > 48416/51584/100000/100000 mbuf clusters in use = (current/cache/total/max) > > 2894/690 mbuf+clusters out of packet secondary zone in use = (current/cache) > > 11946/854/12800/12800 4k (page size) jumbo clusters in use = (current/cache/total/max) > > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) > > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) > > 164834K/119760K/284595K bytes allocated to network = (current/cache/total) > > 0/339/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > > 0/4/6656 sfbufs in use (current/peak/max) > > 0 requests for sfbufs denied > > 0 requests for sfbufs delayed > > 0 requests for I/O initiated by sendfile > > 0 calls to protocol drain routines > > # dmesg | grep commit > > > > At this point RX has hung. > > > > Somehow the check (i =3D=3D rxr->next_to_refresh) is never true in = this case. Also, I did read kern/150516 and couldn't wrap my head around = the patch for the em driver that Beezar Liu suggested. > > > > Regards, > > > > Karim. > > > > >=20 >=20