FreeBSD Mail Archives

Date:      Mon, 7 Feb 2011 21:21:45 -0500
From:      Karim Fodil-Lemelin <fodillemlinkarim@gmail.com>
To:        pyunyh@gmail.com
Cc:        jfv@freebsd.org, freebsd-net@freebsd.org
Subject:   Re: Fwd: igb driver tx hangs when out of mbuf clusters
Message-ID:  <AANLkTi=5QtpX0LJDUYwRYxA403eK4MOZFRDxv2CS=bDJ@mail.gmail.com>
In-Reply-To: <20110207235811.GA1306@michelle.cdnetworks.com>
References:  <AANLkTim=OYB5cC1H86N_-tDW1w_ipR5-gZjZnT6k%2BMv5@mail.gmail.com> <10510673199.20110207203507@serebryakov.spb.ru> <AANLkTing_L5eLe09xzwgmtF4cp4qO8n8mdRsp4d4ZAxY@mail.gmail.com> <AANLkTikPEK_69if-gZ0nygdLBOTtrUZmPNTnyhHtJr6K@mail.gmail.com> <AANLkTinuLMVif-tNE3s%2B%2BtJccm%2B-_3qvM=Z4BtTF0%2B1q@mail.gmail.com> <AANLkTi=AT6tVHbG-jz01275-8pMzVFfHiU9Hz5BzF4yZ@mail.gmail.com> <20110207235811.GA1306@michelle.cdnetworks.com>

2011/2/7 Pyun YongHyeon <pyunyh@gmail.com>

> On Mon, Feb 07, 2011 at 05:33:47PM -0500, Karim Fodil-Lemelin wrote:
> > Subject: Re: igb driver tx hangs when out of mbuf clusters
> >
> > > To: Lev Serebryakov <lev@serebryakov.spb.ru>
> > > Cc: freebsd-net@freebsd.org
> > >
> > >
> > > 2011/2/7 Lev Serebryakov <lev@serebryakov.spb.ru>
> > >
> > > Hello, Karim.
> > >> You wrote 7 February 2011, 19:58:04:
> > >>
> > >>
> > >> > The issue is with the igb driver from 7.4 RC3 r218406. If the driver
> > >> > runs out of mbuf clusters it simply stops receiving even after the
> > >> > clusters have been freed.
> > >>   It looks like my problems with em0 (see thread "em0 hangs without
> > >>  any messages like "Watchdog timeout", only down/up reset it.")...
> > >>  Codebase for em and igb is somewhat common...
> > >>
> > >> --
> > >> // Black Lion AKA Lev Serebryakov <lev@serebryakov.spb.ru>
> > >>
> > >> I agree.
> > >
> > > Do you get missed packets in mac_stats (sysctl dev.em | grep missed)?
> > >
> > > I might not have mentioned but I can also 'fix' the problem by doing
> > > ifconfig igb0 down/up.
> > >
> > > I will try using POLLING to 'automatize' the reset as you mentioned in
> > > your thread.
> > >
> > > Karim.
> > >
> > >
> > Follow up on tests with POLLING: The problem is still occurring, although
> > it takes more time ... Outputs of sysctl dev.igb.0 and netstat -m will
> > follow:
> >
> > 9219/99426/108645 mbufs in use (current/cache/total)
> > 9217/90783/100000/100000 mbuf clusters in use (current/cache/total/max)
>
> Do you see network processes stuck in keglim state? If you do,
> I think that's not trivial to solve. You wouldn't even be able to kill
> that process while it is in keglim state unless some more mbuf
> clusters are freed from other places.
>

No keglim state. Here is a snapshot of top -SH while the problem is
happening:

   12 root          171 ki31     0K     8K CPU5   5  19:27 100.00% idle: cpu5
   10 root          171 ki31     0K     8K CPU7   7  19:26 100.00% idle: cpu7
   14 root          171 ki31     0K     8K CPU3   3  19:25 100.00% idle: cpu3
   11 root          171 ki31     0K     8K CPU6   6  19:25 100.00% idle: cpu6
   13 root          171 ki31     0K     8K CPU4   4  19:24 100.00% idle: cpu4
   15 root          171 ki31     0K     8K CPU2   2  19:22 100.00% idle: cpu2
   16 root          171 ki31     0K     8K CPU1   1  19:18 100.00% idle: cpu1
   17 root          171 ki31     0K     8K RUN    0  19:12 100.00% idle: cpu0
   18 root          -32    -     0K     8K WAIT   6   0:04  0.10% swi4: clock s
   20 root          -44    -     0K     8K WAIT   4   0:08  0.00% swi1: net
   29 root          -68    -     0K     8K -      0   0:02  0.00% igb0 que
   35 root          -68    -     0K     8K -      2   0:02  0.00% em1 taskq
   28 root          -68    -     0K     8K WAIT   5   0:01  0.00% irq256: igb0

Keep in mind that num_queues has been forced to 1.


>
> I think both igb(4) and em(4) pass received frame to upper stack
> before allocating new RX buffer. If driver fails to allocate new RX
> buffer driver will try to refill RX buffers in next run. Under
> extreme resource shortage case, this situation can produce no more
> RX buffers in RX descriptor ring and this will take the box out of
> network. Other drivers avoid that situation by allocating new RX
> buffer before passing received frame to upper stack. If RX buffer
> allocation fails driver will just reuse old RX buffer without
> passing received frame to upper stack. That does not completely
> solve the keglim issue though. I think you should have enough mbuf
> clusters to avoid keglim.
>
> However the output above indicates you have enough free mbuf
> clusters. So I guess igb(4) encountered zero available RX buffer
> situation in past but failed to refill the RX buffer again. I guess
> driver may be able to periodically check available RX buffers.
> Jack may have a better idea if this was the case. (CCed)
>

That is exactly the pattern. The driver runs out of clusters, and although
they eventually get consumed and freed, the driver refuses to process any
new frames. It is, on the other hand, perfectly capable of sending out
packets.
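
To make the two refill orders described in the quoted text concrete, here is
a toy simulation in Python. It is only a sketch of the hypothesis, not igb(4)
or em(4) code: ClusterPool, RxRing, run and the ring and pool sizes are all
made up for illustration, and the assumption that a refill is attempted only
from the receive path (so it never runs once the ring is empty) is exactly
the guess being discussed, not something confirmed from the driver source.

```python
class ClusterPool:
    """Toy stand-in for the mbuf cluster zone: a bounded buffer count."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0

    def alloc(self):
        if self.in_use >= self.limit:
            return False          # allocation denied, zone is at its limit
        self.in_use += 1
        return True

    def free(self, count=1):
        self.in_use -= count


class RxRing:
    """Hypothetical RX ring; 'posted' counts buffers the NIC can fill."""

    def __init__(self, pool, size, alloc_first):
        self.pool = pool
        self.size = size
        self.alloc_first = alloc_first
        self.posted = 0
        self.dropped = 0
        self.refill()

    def refill(self):
        # Top the ring back up as far as the pool allows.
        while self.posted < self.size and self.pool.alloc():
            self.posted += 1

    def receive(self):
        """Handle one incoming frame; True if it reached the stack."""
        if self.posted == 0:
            self.dropped += 1     # empty ring: nothing runs, no refill either
            return False
        if self.alloc_first:
            if not self.pool.alloc():
                self.dropped += 1  # reuse the old buffer, discard the frame
                return False
            return True           # old buffer goes up, new one takes its slot
        self.posted -= 1          # pass the frame up first ...
        self.refill()             # ... then try to replace the buffer
        return True


def run(alloc_first, periodic_refill=False):
    pool = ClusterPool(limit=8)
    ring = RxRing(pool, size=4, alloc_first=alloc_first)
    held = 0
    for _ in range(10):           # burst while the stack frees nothing
        if ring.receive():
            held += 1
    pool.free(held)               # the stack finally releases its clusters
    if periodic_refill:
        ring.refill()             # e.g. a timer, or ifconfig down/up
    recovered = 0
    for _ in range(5):            # later traffic, freed promptly
        if ring.receive():
            recovered += 1
            pool.free(1)
    return ring.posted, recovered


print(run(alloc_first=False))                        # (0, 0)
print(run(alloc_first=True))                         # (4, 5)
print(run(alloc_first=False, periodic_refill=True))  # (4, 5)
```

In this model the pass-first order ends with an empty ring that stays empty
even after every cluster is free again, while the allocate-first order only
drops frames during the shortage. The third call shows that an explicit
refill is enough to recover, which would be consistent with ifconfig igb0
down/up clearing the problem.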


> > 0/640 mbuf+clusters out of packet secondary zone in use (current/cache)
> > 0/12800/12800/12800 4k (page size) jumbo clusters in use
> > (current/cache/total/max)
> > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> > 20738K/257622K/278361K bytes allocated to network (current/cache/total)
> > 0/291/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> > 0/5/6656 sfbufs in use (current/peak/max)
> > 0 requests for sfbufs denied
> > 0 requests for sfbufs delayed
> > 0 requests for I/O initiated by sendfile
> > 0 calls to protocol drain routines
> >
> > dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.0.7
> > dev.igb.0.%driver: igb
> > dev.igb.0.%location: slot=0 function=0
> > dev.igb.0.%pnpinfo: vendor=0x8086 device=0x10a7 subvendor=0x8086
> > subdevice=0x0000 class=0x020000
> > dev.igb.0.%parent: pci7
> > dev.igb.0.nvm: -1
> > dev.igb.0.flow_control: 3
> > dev.igb.0.enable_aim: 1
> > dev.igb.0.rx_processing_limit: 100
> > dev.igb.0.link_irq: 4
> > dev.igb.0.dropped: 0
> > dev.igb.0.tx_dma_fail: 0
> > dev.igb.0.rx_overruns: 464
> > dev.igb.0.watchdog_timeouts: 0
> > dev.igb.0.device_control: 1490027073
> > dev.igb.0.rx_control: 67141658
> > dev.igb.0.interrupt_mask: 0
> > dev.igb.0.extended_int_mask: 0
> > dev.igb.0.tx_buf_alloc: 14
> > dev.igb.0.rx_buf_alloc: 34
> > dev.igb.0.fc_high_water: 29488
> > dev.igb.0.fc_low_water: 29480
> > dev.igb.0.queue0.interrupt_rate: 111111
> > dev.igb.0.queue0.txd_head: 877
> > dev.igb.0.queue0.txd_tail: 877
> > dev.igb.0.queue0.no_desc_avail: 0
> > dev.igb.0.queue0.tx_packets: 92013
> > dev.igb.0.queue0.rxd_head: 570
> > dev.igb.0.queue0.rxd_tail: 570
> > dev.igb.0.queue0.rx_packets: 163386
> > dev.igb.0.queue0.rx_bytes: 240260310
> > dev.igb.0.queue0.lro_queued: 0
> > dev.igb.0.queue0.lro_flushed: 0
> > dev.igb.0.mac_stats.excess_coll: 0
> > dev.igb.0.mac_stats.single_coll: 0
> > dev.igb.0.mac_stats.multiple_coll: 0
> > dev.igb.0.mac_stats.late_coll: 0
> > dev.igb.0.mac_stats.collision_count: 0
> > dev.igb.0.mac_stats.symbol_errors: 0
> > dev.igb.0.mac_stats.sequence_errors: 0
> > dev.igb.0.mac_stats.defer_count: 0
> > dev.igb.0.mac_stats.missed_packets: 3104
> > dev.igb.0.mac_stats.recv_no_buff: 4016
> > dev.igb.0.mac_stats.recv_undersize: 0
> > dev.igb.0.mac_stats.recv_fragmented: 0
> > dev.igb.0.mac_stats.recv_oversize: 0
> > dev.igb.0.mac_stats.recv_jabber: 0
> > dev.igb.0.mac_stats.recv_errs: 0
> > dev.igb.0.mac_stats.crc_errs: 0
> > dev.igb.0.mac_stats.alignment_errs: 0
> > dev.igb.0.mac_stats.coll_ext_errs: 0
> > dev.igb.0.mac_stats.xon_recvd: 0
> > dev.igb.0.mac_stats.xon_txd: 346
> > dev.igb.0.mac_stats.xoff_recvd: 0
> > dev.igb.0.mac_stats.xoff_txd: 3450
> > dev.igb.0.mac_stats.total_pkts_recvd: 166515
> > dev.igb.0.mac_stats.good_pkts_recvd: 163411
> > dev.igb.0.mac_stats.bcast_pkts_recvd: 0
> > dev.igb.0.mac_stats.mcast_pkts_recvd: 51
> > dev.igb.0.mac_stats.rx_frames_64: 10
> > dev.igb.0.mac_stats.rx_frames_65_127: 1601
> > dev.igb.0.mac_stats.rx_frames_128_255: 53
> > dev.igb.0.mac_stats.rx_frames_256_511: 42
> > dev.igb.0.mac_stats.rx_frames_512_1023: 18
> > dev.igb.0.mac_stats.rx_frames_1024_1522: 161687
> > dev.igb.0.mac_stats.good_octets_recvd: 240948229
> > dev.igb.0.mac_stats.good_octets_txd: 5947150
> > dev.igb.0.mac_stats.total_pkts_txd: 95809
> > dev.igb.0.mac_stats.good_pkts_txd: 92013
> > dev.igb.0.mac_stats.bcast_pkts_txd: 1516
> > dev.igb.0.mac_stats.mcast_pkts_txd: 1817
> > dev.igb.0.mac_stats.tx_frames_64: 90302
> > dev.igb.0.mac_stats.tx_frames_65_127: 1711
> > dev.igb.0.mac_stats.tx_frames_128_255: 0
> > dev.igb.0.mac_stats.tx_frames_256_511: 0
> > dev.igb.0.mac_stats.tx_frames_512_1023: 0
> > dev.igb.0.mac_stats.tx_frames_1024_1522: 0
> > dev.igb.0.mac_stats.tso_txd: 0
> > dev.igb.0.mac_stats.tso_ctx_fail: 0
> > dev.igb.0.interrupts.asserts: 5584
> > dev.igb.0.interrupts.rx_pkt_timer: 163411
> > dev.igb.0.interrupts.rx_abs_timer: 163386
> > dev.igb.0.interrupts.tx_pkt_timer: 92013
> > dev.igb.0.interrupts.tx_abs_timer: 0
> > dev.igb.0.interrupts.tx_queue_empty: 92013
> > dev.igb.0.interrupts.tx_queue_min_thresh: 0
> > dev.igb.0.interrupts.rx_desc_min_thresh: 19
> > dev.igb.0.interrupts.rx_overrun: 0
> > dev.igb.0.host.breaker_tx_pkt: 0
> > dev.igb.0.host.host_tx_pkt_discard: 0
> > dev.igb.0.host.rx_pkt: 0
> > dev.igb.0.host.breaker_rx_pkts: 0
> > dev.igb.0.host.breaker_rx_pkt_drop: 0
> > dev.igb.0.host.tx_good_pkt: 0
> > dev.igb.0.host.breaker_tx_pkt_drop: 0
> > dev.igb.0.host.rx_good_bytes: 240948229
> > dev.igb.0.host.tx_good_bytes: 5947150
> > dev.igb.0.host.length_errors: 0
> > dev.igb.0.host.serdes_violation_pkt: 0
> > dev.igb.0.host.header_redir_missed: 0
>
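
A few of the counters quoted above can be cross-checked mechanically. The
Python sketch below copies eight lines verbatim from the netstat -m and
sysctl output in this message and compares them; the helper names
(parse_sysctl, parse_slashed, summarize) are hypothetical. Reading
rxd_head == rxd_tail as "no RX buffers available to the hardware" is an
assumption about the descriptor ring convention, not something the output
itself states.

```python
import re

SAMPLE = """\
9217/90783/100000/100000 mbuf clusters in use (current/cache/total/max)
0/291/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
dev.igb.0.queue0.rxd_head: 570
dev.igb.0.queue0.rxd_tail: 570
dev.igb.0.mac_stats.missed_packets: 3104
dev.igb.0.mac_stats.recv_no_buff: 4016
dev.igb.0.mac_stats.total_pkts_recvd: 166515
dev.igb.0.mac_stats.good_pkts_recvd: 163411
"""


def parse_sysctl(text):
    """Return {name: int} for every 'dev.<...>: <integer>' line."""
    values = {}
    for line in text.splitlines():
        match = re.fullmatch(r"(dev\.\S+): (-?\d+)", line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def parse_slashed(text, label):
    """Return the a/b/c integers leading the netstat -m line with label."""
    for line in text.splitlines():
        if label in line:
            return [int(part) for part in line.split()[0].split("/")]
    raise ValueError("no line containing %r" % label)


def summarize(text):
    ctl = parse_sysctl(text)
    current, cache, total, maximum = parse_slashed(text, "mbuf clusters in use")
    denied = parse_slashed(text, "requests for mbufs denied")
    mac = "dev.igb.0.mac_stats."
    queue = "dev.igb.0.queue0."
    return {
        "clusters_current": current,
        "clusters_cached": cache,
        "zone_at_max": total == maximum,
        "cluster_requests_denied": denied[1],
        "total_minus_good": ctl[mac + "total_pkts_recvd"]
        - ctl[mac + "good_pkts_recvd"],
        "missed_packets": ctl[mac + "missed_packets"],
        "head_equals_tail": ctl[queue + "rxd_head"] == ctl[queue + "rxd_tail"],
    }


for key, value in summarize(SAMPLE).items():
    print(key, value)
```

On these numbers the zone has reached its maximum of 100000 clusters, but
90783 of them sit in the cache and only 9217 are in use; 291 cluster
requests were denied at some point; and total_pkts_recvd minus
good_pkts_recvd comes to 3104, the same value as missed_packets.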
