FreeBSD Mail Archives

Date:      Mon, 7 Feb 2011 22:29:53 -0500
From:      Karim Fodil-Lemelin <fodillemlinkarim@gmail.com>
To:        pyunyh@gmail.com
Cc:        jfv@freebsd.org, freebsd-net@freebsd.org
Subject:   Re: Fwd: igb driver RX (was TX) hangs when out of mbuf clusters
Message-ID:  <AANLkTikrjkHDaBq%2Bx6MTZhzOeqWA=xtFpqQPsthFGmuf@mail.gmail.com>


2011/2/7 Pyun YongHyeon <pyunyh@gmail.com>

> On Mon, Feb 07, 2011 at 09:21:45PM -0500, Karim Fodil-Lemelin wrote:
> > 2011/2/7 Pyun YongHyeon <pyunyh@gmail.com>
> >
> > > On Mon, Feb 07, 2011 at 05:33:47PM -0500, Karim Fodil-Lemelin wrote:
> > > > Subject: Re: igb driver tx hangs when out of mbuf clusters
> > > >
> > > > > To: Lev Serebryakov <lev@serebryakov.spb.ru>
> > > > > Cc: freebsd-net@freebsd.org
> > > > >
> > > > >
> > > > > 2011/2/7 Lev Serebryakov <lev@serebryakov.spb.ru>
> > > > >
> > > > > Hello, Karim.
> > > > > >> You wrote on February 7, 2011, 19:58:04:
> > > > >>
> > > > >>
> > > > > >> > The issue is with the igb driver from 7.4 RC3 r218406. If the
> > > > > >> > driver runs out of mbuf clusters it simply stops receiving even
> > > > > >> > after the clusters have been freed.
> > > > > >>   It looks like my problems with em0 (see thread "em0 hangs without
> > > > > >>  any messages like "Watchdog timeout", only down/up reset it.")...
> > > > > >>  Codebase for em and igb is somewhat common...
> > > > >>
> > > > >> --
> > > > >> // Black Lion AKA Lev Serebryakov <lev@serebryakov.spb.ru>
> > > > >>
> > > > >> I agree.
> > > > >
> > > > > > Do you get missed packets in mac_stats (sysctl dev.em | grep missed)?
> > > > > >
> > > > > > I might not have mentioned it, but I can also 'fix' the problem by
> > > > > > doing ifconfig igb0 down/up.
> > > > > >
> > > > > > I will try using POLLING to automate the reset as you mentioned in
> > > > > > your thread.
> > > > >
> > > > > Karim.
> > > > >
> > > > >
> > > > > Follow up on tests with POLLING: The problem is still occurring,
> > > > > although it takes more time ... Outputs of sysctl dev.igb0 and
> > > > > netstat -m will follow:
> > > > >
> > > > > 9219/99426/108645 mbufs in use (current/cache/total)
> > > > > 9217/90783/100000/100000 mbuf clusters in use (current/cache/total/max)
> > >
> > > > Do you see network processes stuck in keglim state? If you do,
> > > > I think that's not trivial to solve. You couldn't even kill
> > > > such a process while it is in keglim state unless some more mbuf
> > > > clusters are freed from other places.
> > >
> >
> > No keglim state, here is a snapshot of top -SH while the problem is
> > happening:
> >
> >    12 root          171 ki31     0K     8K CPU5   5  19:27 100.00% idle: cpu5
> >    10 root          171 ki31     0K     8K CPU7   7  19:26 100.00% idle: cpu7
> >    14 root          171 ki31     0K     8K CPU3   3  19:25 100.00% idle: cpu3
> >    11 root          171 ki31     0K     8K CPU6   6  19:25 100.00% idle: cpu6
> >    13 root          171 ki31     0K     8K CPU4   4  19:24 100.00% idle: cpu4
> >    15 root          171 ki31     0K     8K CPU2   2  19:22 100.00% idle: cpu2
> >    16 root          171 ki31     0K     8K CPU1   1  19:18 100.00% idle: cpu1
> >    17 root          171 ki31     0K     8K RUN    0  19:12 100.00% idle: cpu0
> >    18 root          -32    -     0K     8K WAIT   6   0:04  0.10% swi4: clock s
> >    20 root          -44    -     0K     8K WAIT   4   0:08  0.00% swi1: net
> >    29 root          -68    -     0K     8K -      0   0:02  0.00% igb0 que
> >    35 root          -68    -     0K     8K -      2   0:02  0.00% em1 taskq
> >    28 root          -68    -     0K     8K WAIT   5   0:01  0.00% irq256: igb0
> >
> > keep in mind that num_queues has been forced to 1.
> >
> >
> > >
> > > I think both igb(4) and em(4) pass received frame to upper stack
> > > before allocating new RX buffer. If driver fails to allocate new RX
> > > buffer driver will try to refill RX buffers in next run. Under
> > > extreme resource shortage case, this situation can produce no more
> > > RX buffers in RX descriptor ring and this will take the box out of
> > > network. Other drivers avoid that situation by allocating new RX
> > > buffer before passing received frame to upper stack. If RX buffer
> > > allocation fails driver will just reuse old RX buffer without
> > > passing received frame to upper stack. That does not completely
> > > solve the keglim issue though. I think you should have enough mbuf
> > > clusters to avoid keglim.
> > >
> > > However the output above indicates you have enough free mbuf
> > > clusters. So I guess igb(4) encountered zero available RX buffer
> > > situation in past but failed to refill the RX buffer again. I guess
> > > driver may be able to periodically check available RX buffers.
> > > Jack may have a better idea if this was the case. (CCed)
> > >
> >
> > That is exactly the pattern. The driver runs out of clusters but they
> > eventually get consumed and freed although the driver refuses to process
> > any new frames. It is, on the other hand, perfectly capable of sending out
> > packets.
> >
>
> Ok, this clearly indicates igb(4) failed to refill RX buffers since
> you can still send frames. I'm not sure whether igb(4) controllers
> could be configured to generate no RX buffer interrupts but that
> interrupt would be better suited to trigger RX refilling than timer
> based refilling. Since igb(4) keeps track of available RX buffers,
> igb(4) can selectively enable that interrupt once it sees no RX
> buffers in the RX descriptor ring. However this does not work with
> polling.
>

I think that your evaluation of the problem is correct although I do not
understand the selective interrupt mechanism you described.
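
One possible reading of that mechanism, as a user-space C sketch. Everything
here is an assumption for illustration: struct rx_state, rx_refill(),
rx_consume(), rx_nobuf_intr() and the nobuf_intr_on flag are hypothetical
names, not igb(4) code or register names, and whether the hardware can raise
such an interrupt at all is exactly the open question above. The idea
modelled is that the driver arms a "no RX buffer" interrupt only while the
ring holds zero buffers, and uses that interrupt to retry the refill.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4   /* tiny ring, for illustration only */

/* Hypothetical driver state; not taken from igb(4). */
struct rx_state {
    int  avail;          /* descriptors that currently hold a buffer */
    int  free_clusters;  /* clusters the allocator could hand out */
    bool nobuf_intr_on;  /* assumed "no RX buffer" interrupt enable bit */
};

/* Top the ring back up as far as the allocator allows.
 * Returns the number of buffers added. */
static int rx_refill(struct rx_state *s)
{
    int added = 0;

    while (s->avail < RING_SIZE && s->free_clusters > 0) {
        s->free_clusters--;
        s->avail++;
        added++;
    }
    assert(s->avail >= 0 && s->avail <= RING_SIZE);

    /* Selective enable: arm the interrupt only while the ring is empty. */
    s->nobuf_intr_on = (s->avail == 0);
    return added;
}

/* One received frame goes up the stack; its descriptor loses its buffer,
 * then the driver tries to refill. */
static void rx_consume(struct rx_state *s)
{
    if (s->avail > 0)
        s->avail--;
    rx_refill(s);
}

/* Handler for the assumed interrupt: a frame arrived and found no buffer.
 * Returns false if the interrupt was not armed (nothing to do). */
static bool rx_nobuf_intr(struct rx_state *s)
{
    if (!s->nobuf_intr_on)
        return false;
    rx_refill(s);
    return true;
}
```

In this model a drained ring stays armed until clusters come back, at which
point the next incoming frame triggers the refill instead of a timer. As Pyun
notes, nothing equivalent fires under polling, since interrupts are off.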

To be precise, the exact same behavior (RX hang) happens when options
DEVICE_POLLING is _not_ in the kernel configuration file. I tried POLLING
because someone mentioned that it helped in a case reported earlier today.
Unfortunately, igb shows the same RX ring filling problem with or without
polling.
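
To make the two RX orderings discussed earlier in the thread concrete, here
is a small user-space C model. It is only a sketch under stated assumptions:
struct rx_ring, cluster_alloc(), stack_input() and both rx_* functions are
made-up names, not igb(4) or em(4) code, and the pool counter merely stands
in for the mbuf cluster zone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 4   /* tiny ring, for illustration only */

/* Hypothetical RX ring; not taken from any real driver. */
struct rx_ring {
    void *buf[RING_SIZE];     /* buffer attached to each slot, or NULL */
    int pool;                 /* clusters the allocator can still hand out */
    unsigned long delivered;  /* frames passed to the upper stack */
    unsigned long dropped;    /* frames dropped to keep a buffer in the ring */
};

/* Stand-in for the cluster allocator: fails once the pool is empty. */
static void *cluster_alloc(struct rx_ring *r)
{
    if (r->pool <= 0)
        return NULL;
    r->pool--;
    return malloc(2048);
}

/* Stand-in for the upper stack taking ownership of a frame. */
static void stack_input(struct rx_ring *r, void *frame)
{
    r->delivered++;
    free(frame);
}

/* Start with every slot populated and `pool` clusters left over. */
static void ring_init(struct rx_ring *r, int pool)
{
    for (int i = 0; i < RING_SIZE; i++) {
        r->buf[i] = malloc(2048);
        assert(r->buf[i] != NULL);
    }
    r->pool = pool;
    r->delivered = 0;
    r->dropped = 0;
}

/* Count slots that still have a buffer attached. */
static int ring_avail(const struct rx_ring *r)
{
    int n = 0;
    for (int i = 0; i < RING_SIZE; i++)
        if (r->buf[i] != NULL)
            n++;
    return n;
}

/* Ordering A: pass the frame up first, then try to refill the slot.
 * If the allocation fails the slot stays empty. */
static void rx_pass_then_alloc(struct rx_ring *r, int i)
{
    void *frame = r->buf[i];

    if (frame == NULL)
        return;               /* no buffer: nothing can be received here */
    r->buf[i] = NULL;
    stack_input(r, frame);
    r->buf[i] = cluster_alloc(r);
}

/* Ordering B: allocate the replacement first. If that fails, keep the
 * old buffer in the ring and drop the frame instead of passing it up. */
static void rx_alloc_then_pass(struct rx_ring *r, int i)
{
    void *fresh;

    if (r->buf[i] == NULL)
        return;
    fresh = cluster_alloc(r);
    if (fresh == NULL) {
        r->dropped++;         /* buffer is reused for the next frame */
        return;
    }
    stack_input(r, r->buf[i]);
    r->buf[i] = fresh;
}
```

Starting both rings full with an empty pool and processing every slot once
leaves ordering A with zero buffers in the ring, while ordering B keeps a
full ring at the cost of dropped frames. In this model ordering A does not
recover when clusters return, because an empty slot never receives a frame
that would trigger the refill; something else has to repopulate the ring.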

By the way, I fixed the subject, where I erroneously said TX was hanging;
in fact RX is hanging and TX is just fine.
