Date: Mon, 7 Feb 2011 22:29:53 -0500
From: Karim Fodil-Lemelin <fodillemlinkarim@gmail.com>
To: pyunyh@gmail.com
Cc: jfv@freebsd.org, freebsd-net@freebsd.org
Subject: Re: Fwd: igb driver RX (was TX) hangs when out of mbuf clusters
Message-ID: <AANLkTikrjkHDaBq%2Bx6MTZhzOeqWA=xtFpqQPsthFGmuf@mail.gmail.com>
2011/2/7 Pyun YongHyeon <pyunyh@gmail.com>

> On Mon, Feb 07, 2011 at 09:21:45PM -0500, Karim Fodil-Lemelin wrote:
> > 2011/2/7 Pyun YongHyeon <pyunyh@gmail.com>
> >
> > > On Mon, Feb 07, 2011 at 05:33:47PM -0500, Karim Fodil-Lemelin wrote:
> > > > Subject: Re: igb driver tx hangs when out of mbuf clusters
> > > >
> > > > > To: Lev Serebryakov <lev@serebryakov.spb.ru>
> > > > > Cc: freebsd-net@freebsd.org
> > > > >
> > > > >
> > > > > 2011/2/7 Lev Serebryakov <lev@serebryakov.spb.ru>
> > > > >
> > > > > Hello, Karim.
> > > > >> You wrote 7 February 2011, 19:58:04:
> > > > >>
> > > > >>
> > > > >> > The issue is with the igb driver from 7.4 RC3 r218406. If the
> > > > >> > driver runs out of mbuf clusters it simply stops receiving
> > > > >> > even after the clusters have been freed.
> > > > >> It looks like my problems with em0 (see thread "em0 hangs
> > > > >> without any messages like "Watchdog timeout", only down/up
> > > > >> reset it.")...
> > > > >> Codebase for em and igb is somewhat common...
> > > > >>
> > > > >> --
> > > > >> // Black Lion AKA Lev Serebryakov <lev@serebryakov.spb.ru>
> > > > >>
> > > > >> I agree.
> > > > >
> > > > > Do you get missed packets in mac_stats (sysctl dev.em | grep
> > > > > missed)?
> > > > >
> > > > > I might not have mentioned but I can also 'fix' the problem by
> > > > > doing ifconfig igb0 down/up.
> > > > >
> > > > > I will try using POLLING to 'automatize' the reset as you
> > > > > mentioned in your thread.
> > > > >
> > > > > Karim.
> > > > >
> > > > >
> > > > Follow up on tests with POLLING: The problem is still occurring
> > > > although it takes more time ... Outputs of sysctl dev.igb0 and
> > > > netstat -m will follow:
> > > >
> > > > 9219/99426/108645 mbufs in use (current/cache/total)
> > > > 9217/90783/100000/100000 mbuf clusters in use (current/cache/total/max)
> > >
> > > Do you see network processes are stuck in keglim state? If you see
> > > that I think that's not trivial to solve. You wouldn't even kill
> > > that process if it is under keglim state unless some more mbuf
> > > clusters are freed from other places.
> > >
> >
> > No keglim state, here is a snapshot of top -SH while the problem is
> > happening:
> >
> >    12 root     171 ki31     0K     8K CPU5   5  19:27 100.00% idle: cpu5
> >    10 root     171 ki31     0K     8K CPU7   7  19:26 100.00% idle: cpu7
> >    14 root     171 ki31     0K     8K CPU3   3  19:25 100.00% idle: cpu3
> >    11 root     171 ki31     0K     8K CPU6   6  19:25 100.00% idle: cpu6
> >    13 root     171 ki31     0K     8K CPU4   4  19:24 100.00% idle: cpu4
> >    15 root     171 ki31     0K     8K CPU2   2  19:22 100.00% idle: cpu2
> >    16 root     171 ki31     0K     8K CPU1   1  19:18 100.00% idle: cpu1
> >    17 root     171 ki31     0K     8K RUN    0  19:12 100.00% idle: cpu0
> >    18 root     -32    -     0K     8K WAIT   6   0:04   0.10% swi4: clock s
> >    20 root     -44    -     0K     8K WAIT   4   0:08   0.00% swi1: net
> >    29 root     -68    -     0K     8K -      0   0:02   0.00% igb0 que
> >    35 root     -68    -     0K     8K -      2   0:02   0.00% em1 taskq
> >    28 root     -68    -     0K     8K WAIT   5   0:01   0.00% irq256: igb0
> >
> > keep in mind that num_queues has been forced to 1.
> >
> > >
> > > I think both igb(4) and em(4) pass received frame to upper stack
> > > before allocating new RX buffer. If driver fails to allocate new RX
> > > buffer driver will try to refill RX buffers in next run. Under
> > > extreme resource shortage case, this situation can produce no more
> > > RX buffers in RX descriptor ring and this will take the box out of
> > > network. Other drivers avoid that situation by allocating new RX
> > > buffer before passing received frame to upper stack. If RX buffer
> > > allocation fails driver will just reuse old RX buffer without
> > > passing received frame to upper stack. That does not completely
> > > solve the keglim issue though. I think you should have enough mbuf
> > > clusters to avoid keglim.
> > >
> > > However the output above indicates you have enough free mbuf
> > > clusters. So I guess igb(4) encountered zero available RX buffer
> > > situation in past but failed to refill the RX buffer again. I guess
> > > driver may be able to periodically check available RX buffers.
> > > Jack may have better idea if this was the case.(CCed)
> > >
> >
> > That is exactly the pattern. The driver runs out of clusters but they
> > eventually get consumed and freed although the driver refuses to
> > process any new frames. It is, on the other hand, perfectly capable
> > of sending out packets.
> >
>
> Ok, this clearly indicates igb(4) failed to refill RX buffers since
> you can still send frames. I'm not sure whether igb(4) controllers
> could be configured to generate no RX buffer interrupts but that
> interrupt would be better suited to trigger RX refilling than timer
> based refilling. Since igb(4) keeps track of available RX buffers,
> igb(4) can selectively enable that interrupt once it see no RX
> buffers in the RX descriptor ring. However this does not work with
> polling.
>

I think that your evaluation of the problem is correct, although I do
not understand the selective interrupt mechanism you described.

To be precise, the exact same behavior (RX hang) happens if options
DEVICE_POLLING is _not_ used in the kernel configuration file. I tried
POLLING since someone mentioned that it helped in a case mentioned
earlier today. Unfortunately, igb shows the same RX ring filling
problem with or without polling.

By the way, I fixed the subject, where I erroneously said TX was
hanging, while in fact RX is hanging and TX is just fine.
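
For anyone following the thread, the two RX refill orderings Pyun describes above can be modelled in a few lines of C. This is only a toy sketch and not igb(4) or em(4) code: `struct pool`, `struct rx_ring`, `pool_alloc`, `rx_pass_then_alloc`, `rx_alloc_then_pass` and `ring_refill` are hypothetical names made up for illustration, and a buffer is reduced to a flag per descriptor slot.

```c
#include <assert.h>

#define RING_SIZE 4

/* Hypothetical stand-in for the mbuf cluster pool (not the FreeBSD mbuf API). */
struct pool {
    int free_bufs;
};

/* Hypothetical RX ring: has_buf[i] is 1 when slot i holds a receive buffer. */
struct rx_ring {
    int has_buf[RING_SIZE];
};

/* Take one buffer from the pool; returns 1 on success, 0 if the pool is empty. */
static int pool_alloc(struct pool *p)
{
    if (p->free_bufs == 0)
        return 0;
    p->free_bufs--;
    return 1;
}

/* Count the slots that can still receive a frame. */
static int ring_buffers(const struct rx_ring *r)
{
    int i, n = 0;

    for (i = 0; i < RING_SIZE; i++)
        n += r->has_buf[i];
    return n;
}

/*
 * Ordering A: hand the frame and its buffer to the stack first, then try to
 * allocate a replacement. If the allocation fails the slot stays empty.
 * Returns 1 if a frame was delivered.
 */
static int rx_pass_then_alloc(struct rx_ring *r, struct pool *p, int slot)
{
    assert(slot >= 0 && slot < RING_SIZE);
    if (!r->has_buf[slot])
        return 0;               /* nothing can be received in this slot */
    r->has_buf[slot] = 0;       /* buffer now belongs to the stack */
    if (pool_alloc(p))
        r->has_buf[slot] = 1;   /* replacement installed */
    return 1;
}

/*
 * Ordering B: allocate the replacement first. If that fails, drop the frame
 * and keep the old buffer in the ring. Returns 1 if a frame was delivered.
 */
static int rx_alloc_then_pass(struct rx_ring *r, struct pool *p, int slot)
{
    assert(slot >= 0 && slot < RING_SIZE);
    if (!r->has_buf[slot])
        return 0;
    if (!pool_alloc(p))
        return 0;               /* frame dropped, old buffer is reused */
    /* Old buffer goes up with the frame; the new one takes over the slot. */
    return 1;
}

/* Refill pass: top up every empty slot; returns how many were refilled. */
static int ring_refill(struct rx_ring *r, struct pool *p)
{
    int i, n = 0;

    for (i = 0; i < RING_SIZE; i++) {
        if (!r->has_buf[i] && pool_alloc(p)) {
            r->has_buf[i] = 1;
            n++;
        }
    }
    return n;
}
```

In this model, running every slot through `rx_pass_then_alloc` while the pool is empty leaves the ring with no buffers, and it stays that way after the pool recovers unless something like `ring_refill` runs. That resembles the hang reported here, assuming the diagnosis above is right. `rx_alloc_then_pass` keeps the ring full under the same conditions, at the cost of dropping the frames it could not replace.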