Date:      Wed, 7 Mar 2012 21:26:41 +0000
From:      "Pieper, Jeffrey E" <jeffrey.e.pieper@intel.com>
To:        Jason Wolfe <nitroboost@gmail.com>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   RE: Intel 82574L interface wedging - em7.3.2/8.2-STABLE
Message-ID:  <2A35EA60C3C77D438915767F458D65683CED5D11@ORSMSX101.amr.corp.intel.com>
In-Reply-To: <CAAAm0r281Bs-yKbf6ZjCTGPLf0voh=T10GpBc3e-e%2B=10AvU9g@mail.gmail.com>
References:  <CAAAm0r3Qj%2B2rf8cx54bcyAXGQezcE8J=xXYPq4W-jDy75r8qew@mail.gmail.com> <CAAAm0r281Bs-yKbf6ZjCTGPLf0voh=T10GpBc3e-e%2B=10AvU9g@mail.gmail.com>

I noticed that your FC counters aren't incrementing either...which is odd
if you're missing packets and the rx buffers are full.
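
A rough way to check for this automatically, as a sketch only: the Python
below parses saved `sysctl dev.em` output and flags a unit whose drop
counters are nonzero while all four flow control counters are still zero.
The function names are made up for illustration, and it assumes that a
nonzero dev.em.N.fc means flow control is enabled.

```python
FC_COUNTERS = ("xon_recvd", "xon_txd", "xoff_recvd", "xoff_txd")


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict, keeping integer values only."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        fields = value.split()
        if not fields:
            continue
        try:
            # Only the first token is used, so trailing markers such as
            # "<-----" after the number are ignored.
            stats[name.strip()] = int(fields[0])
        except ValueError:
            continue  # non-numeric values such as %desc strings
    return stats


def fc_silent_despite_drops(stats, unit):
    """True if unit N dropped packets but never sent or received XON/XOFF."""
    prefix = "dev.em.%d." % unit
    mac = prefix + "mac_stats."
    # Assumption: any nonzero fc setting means flow control is enabled.
    fc_enabled = stats.get(prefix + "fc", 0) != 0
    drops = (stats.get(mac + "missed_packets", 0)
             + stats.get(mac + "recv_no_buff", 0))
    fc_events = sum(stats.get(mac + name, 0) for name in FC_COUNTERS)
    return fc_enabled and drops > 0 and fc_events == 0
```

Feeding it the em1 values quoted below (fc 3, missed_packets 123241,
recv_no_buff 29951, all xon/xoff counters 0) flags the unit; em0, with no
drops, is not flagged.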

Jeff


-----Original Message-----
From: owner-freebsd-net@freebsd.org [mailto:owner-freebsd-net@freebsd.org] On Behalf Of Jason Wolfe
Sent: Wednesday, March 07, 2012 12:58 PM
To: freebsd-net@freebsd.org
Subject: Re: Intel 82574L interface wedging - em7.3.2/8.2-STABLE

On Thu, Mar 1, 2012 at 12:31 PM, Jason Wolfe <nitroboost@gmail.com> wrote:
> So since the 7.3.0/7.3.2 code released out of the "Intel 82574L
> interface wedging on em 7.1.9/7.2.3 when MSIX enabled" thread I've
> been having some good results in 8.2-STABLE, and 'wedges' are much
> less common.  I am however still seeing them rarely, using some fuzzy
> math based on uptime on the new code and number of boxes, about once
> every 250 days.  MUCH better than prior, but wondering if there is
> something else still lingering?  It appears to have the same symptoms
> as before with a full buffer, where dropped packets start climbing and
> packets out stall.  These servers have MSI-X enabled.
>
> ...
>
> Jason

I'm sure it's getting old with all of the recent work put into the
e1000 driver, but this is still ongoing with MSI-X enabled.  Most
machines are running an 8.2-STABLE from early Feb, and it appears
there have been no relevant changes in RELENG_8 since then.  I've
also disabled all possible em options on the devices to rule those
out, and am still seeing the issue.  I guess reverting to MSI-X
disabled is the next step if nothing is spotted.  This box had been
doing a steady 1 to 1.5Gb/s for the 26 days before the network
hang.

These are probably the points of interest?

dev.em.1.mac_stats.missed_packets: 123241
dev.em.1.mac_stats.recv_no_buff: 29951
dev.em.1.interrupts.rx_overrun: 14

I bounced em1 because dropped packets incremented from 653814 to 653916
while the interface was not incrementing packets out.
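
For anyone wanting to automate that check, here is a minimal sketch of the
rule I applied by hand: compare two samples of an interface's counters and
call it wedged when drops went up while packets out did not move.  The
Sample type and looks_wedged name are made up for illustration, and
collecting the samples (e.g. from netstat) is left out.

```python
from collections import namedtuple

# One reading of an interface's output-packet and drop counters.
Sample = namedtuple("Sample", ["opkts", "drops"])


def looks_wedged(before, after):
    """True when drops climbed while the packets-out counter stood still."""
    return after.drops > before.drops and after.opkts == before.opkts


# Drop values are the ones seen on em1; the opkts value is a placeholder,
# not a measured number.
before = Sample(opkts=1000, drops=653814)
after = Sample(opkts=1000, drops=653916)
print(looks_wedged(before, after))
```

A real monitor would probably want the condition to hold over several
consecutive samples before bouncing the interface, since a single quiet
interval on an idle box would also show no packets out.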

12:00PM  up 26 days, 19:52, 0 users, load averages: 1.31, 1.43, 1.55

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=88<VLAN_MTU,VLAN_HWCSUM>
	ether 00:25:90:2c:c3:a5
	inet6 X%em0 prefixlen 64 scopeid 0x1
	nd6 options=1<PERFORMNUD>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=88<VLAN_MTU,VLAN_HWCSUM>
	ether 00:25:90:2c:c3:a5
	inet6 X%em1 prefixlen 64 scopeid 0x2
	nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=3<RXCSUM,TXCSUM>
	inet 127.0.0.1 netmask 0xff000000
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
	inet X.X.X.X netmask 0xffffffff
	inet X.X.X.X netmask 0xffffffff
	inet X.X.X.X netmask 0xffffffff
	nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=88<VLAN_MTU,VLAN_HWCSUM>
	ether 00:25:90:2c:c3:a5
	inet X.X.X.X netmask 0xffffff80 broadcast X.X.X.X
	inet6 fe80::225:90ff:fe2c:c3a5%lagg0 prefixlen 64 scopeid 0x5
	inet6 2607:f4e8:320:14:225:90ff:fe2c:c3a5 prefixlen 64 autoconf
	nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
	media: Ethernet autoselect
	status: active
	laggproto loadbalance
	laggport: em0 flags=4<ACTIVE>
	laggport: em1 flags=4<ACTIVE>

interrupt                          total       rate
irq3: uart1                        13987          0
cpu0: timer                   4635919565       2000
irq256: em0:rx 0               145695212         62
irq257: em0:tx 0             12311594799       5311
irq258: em0:link                       4          0
irq259: em1:rx 0             14488041284       6250
irq260: em1:tx 0             12316161706       5313
irq261: em1:link                   28369          0
irq262: mps0                  1806741441        779
cpu2: timer                   4635903792       2000
cpu3: timer                   4635903694       2000
cpu1: timer                   4635903765       2000
Total                        59611907618      25718

25737/12138/37875 mbufs in use (current/cache/total)
4947/4053/9000/5956826 mbuf clusters in use (current/cache/total/max)
4947/887 mbuf+clusters out of packet secondary zone in use (current/cache)
14038/914/14952/2978413 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
72480K/14796K/87276K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
223077 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Name    Mtu Network       Address                 Ipkts  Ierrs Idrop        Opkts   Oerrs  Coll    Drop
em0    1500 <Link#1>      00:25:90:2c:c3:a5   144343665      0     0  77957281762       0     0 1163360
em0    1500 fe80::225:90f fe80::225:90ff:fe           0      -     -            5       -     -       -
em1    1500 <Link#2>      00:25:90:2c:c3:a5 72087463889 123241     0  77376157120       0     0  654252
em1    1500 fe80::225:90f fe80::225:90ff:fe           0      -     -            1       -     -       -
lagg0  1500 <Link#5>      00:25:90:2c:c3:a5 72231348146      0     0 155301336362 1819909     0       0
lagg0  1500 X.X.X.X       X.X.X.X           68503722172      -     - 155283480109       -     -       -
lagg0  1500 fe80::225:90f fe80::225:90ff:fe        5059      -     -         5150       -     -       -
lagg0  1500 2607:f4e8:320 2607:f4e8:320:14:    44545526      -     -     46097056       -     -       -

kern.msgbuf:  <6>arp: X.X.X.X moved from 00:25:90:0e:ab:c7 to 00:25:90:0e:ab:c6 on lagg0
<6>arp: X.X.X.X moved from 00:25:90:0e:ab:c6 to 00:25:90:0e:ab:c7 on lagg0
<6>arp: X.X.X.X moved from 00:25:90:0e:ae:50 to 00:25:90:0e:ae:51 on lagg0
<6>arp: X.X.X.X moved from 00:25:90:0e:ad:7c to 00:25:90:0e:ad:7d on lagg0
Interface is RUNNING and ACTIVE
em0: hw tdh = 1385, hw tdt = 1385
em0: hw rdh = 452, hw rdt = 451
em0: Tx Queue Status = 0
em0: TX descriptors avail = 2048
em0: Tx Descriptors avail failure = 0
em0: RX discarded packets = 0
em0: RX Next to Check = 452
em0: RX Next to Refresh = 451
Interface is RUNNING and ACTIVE
em1: hw tdh = 221, hw tdt = 342
em1: hw rdh = 335, hw rdt = 233
em1: Tx Queue Status = 0
em1: TX descriptors avail = 2048
em1: Tx Descriptors avail failure = 0
em1: RX discarded packets = 0
em1: RX Next to Check = 832
em1: RX Next to Refresh = 838

Mar  7 12:00:07 cds447 kernel: Interface is RUNNING and ACTIVE
Mar  7 12:00:07 cds447 kernel: em0: hw tdh = 80, hw tdt = 83
Mar  7 12:00:07 cds447 kernel: em0: hw rdh = 1871, hw rdt = 1870
Mar  7 12:00:07 cds447 kernel: em0: Tx Queue Status = 1
Mar  7 12:00:07 cds447 kernel: em0: TX descriptors avail = 2039
Mar  7 12:00:07 cds447 kernel: em0: Tx Descriptors avail failure = 0
Mar  7 12:00:07 cds447 kernel: em0: RX discarded packets = 0
Mar  7 12:00:07 cds447 kernel: em0: RX Next to Check = 1872
Mar  7 12:00:07 cds447 kernel: em0: RX Next to Refresh = 1871
Mar  7 12:00:07 cds447 kernel: Interface is RUNNING and ACTIVE
Mar  7 12:00:07 cds447 kernel: em1: hw tdh = 1897, hw tdt = 1897
Mar  7 12:00:07 cds447 kernel: em1: hw rdh = 627, hw rdt = 625
Mar  7 12:00:07 cds447 kernel: em1: Tx Queue Status = 0
Mar  7 12:00:07 cds447 kernel: em1: TX descriptors avail = 2048
Mar  7 12:00:07 cds447 kernel: em1: Tx Descriptors avail failure = 0
Mar  7 12:00:07 cds447 kernel: em1: RX discarded packets = 0
Mar  7 12:00:07 cds447 kernel: em1: RX Next to Check = 740
Mar  7 12:00:07 cds447 kernel: em1: RX Next to Refresh = 758
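
To make the head/tail pairs in the two debug dumps above easier to compare,
here is a small sketch that turns each pair into a descriptor count.  It
assumes the usual ring convention, where the descriptors from head up to
(but not including) tail are the ones currently handed to the hardware, and
a ring size of 2048 to match the hw.em.txd and hw.em.rxd settings on this
box.  The function name is made up for illustration.

```python
def descriptors_available(head, tail, ring_size=2048):
    """Descriptors from head up to (not including) tail, with wraparound.

    Assumption: these are the descriptors currently owned by the hardware.
    """
    return (tail - head) % ring_size


# Values taken from the first dump (kern.msgbuf) above.
print(descriptors_available(335, 233))    # em1 rdh/rdt -> 1946
print(descriptors_available(452, 451))    # em0 rdh/rdt -> 2047
print(descriptors_available(221, 342))    # em1 tdh/tdt -> 121
print(descriptors_available(1385, 1385))  # em0 tdh/tdt -> 0
```

Under that assumption, an RX result near ring_size - 1 means the ring is
almost fully stocked with free buffers, and a TX result of 0 means nothing
is waiting to be sent.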

net.inet.ip.intr_queue_maxlen: 512
net.inet.ip.intr_queue_drops: 0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.3.2
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9 subdevice=0x10d3 class=0x020000
dev.em.0.%parent: pci1
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.fc: 3
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.eee_control: 0
dev.em.0.link_irq: 4
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1049160
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 18432
dev.em.0.fc_low_water: 16932
dev.em.0.queue0.txd_head: 1678
dev.em.0.queue0.txd_tail: 1678
dev.em.0.queue0.tx_irq: 12311601270
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 1911
dev.em.0.queue0.rxd_tail: 1910
dev.em.0.queue0.rx_irq: 145694826
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 0
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 0
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 0
dev.em.0.mac_stats.xoff_txd: 0
dev.em.0.mac_stats.total_pkts_recvd: 144436808
dev.em.0.mac_stats.good_pkts_recvd: 144436808
dev.em.0.mac_stats.bcast_pkts_recvd: 143716731
dev.em.0.mac_stats.mcast_pkts_recvd: 80138
dev.em.0.mac_stats.rx_frames_64: 143870227
dev.em.0.mac_stats.rx_frames_65_127: 359642
dev.em.0.mac_stats.rx_frames_128_255: 78152
dev.em.0.mac_stats.rx_frames_256_511: 10711
dev.em.0.mac_stats.rx_frames_512_1023: 7660
dev.em.0.mac_stats.rx_frames_1024_1522: 110416
dev.em.0.mac_stats.good_octets_recvd: 9421229514
dev.em.0.mac_stats.good_octets_txd: 104885010774599
dev.em.0.mac_stats.total_pkts_txd: 77957411708
dev.em.0.mac_stats.good_pkts_txd: 77957411708
dev.em.0.mac_stats.bcast_pkts_txd: 36
dev.em.0.mac_stats.mcast_pkts_txd: 15464
dev.em.0.mac_stats.tx_frames_64: 296356748
dev.em.0.mac_stats.tx_frames_65_127: 7232947896
dev.em.0.mac_stats.tx_frames_128_255: 58936912
dev.em.0.mac_stats.tx_frames_256_511: 118196124
dev.em.0.mac_stats.tx_frames_512_1023: 763686823
dev.em.0.mac_stats.tx_frames_1024_1522: 69487287205
dev.em.0.mac_stats.tso_txd: 0
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.interrupts.asserts: 6
dev.em.0.interrupts.rx_pkt_timer: 0
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 0
dev.em.0.interrupts.tx_abs_timer: 0
dev.em.0.interrupts.tx_queue_empty: 0
dev.em.0.interrupts.tx_queue_min_thresh: 0
dev.em.0.interrupts.rx_desc_min_thresh: 0
dev.em.0.interrupts.rx_overrun: 0
dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.3.2
dev.em.1.%driver: em
dev.em.1.%location: slot=0 function=0
dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9 subdevice=0x10d3 class=0x020000
dev.em.1.%parent: pci2
dev.em.1.nvm: -1
dev.em.1.debug: -1
dev.em.1.fc: 3
dev.em.1.rx_int_delay: 0
dev.em.1.tx_int_delay: 66
dev.em.1.rx_abs_int_delay: 66
dev.em.1.tx_abs_int_delay: 66
dev.em.1.rx_processing_limit: 100
dev.em.1.eee_control: 0
dev.em.1.link_irq: 26412
dev.em.1.mbuf_alloc_fail: 0
dev.em.1.cluster_alloc_fail: 0
dev.em.1.dropped: 0
dev.em.1.tx_dma_fail: 0
dev.em.1.rx_overruns: 0
dev.em.1.watchdog_timeouts: 0
dev.em.1.device_control: 1049160
dev.em.1.rx_control: 67141634
dev.em.1.fc_high_water: 18432
dev.em.1.fc_low_water: 16932
dev.em.1.queue0.txd_head: 1897
dev.em.1.queue0.txd_tail: 1897
dev.em.1.queue0.tx_irq: 12316157844
dev.em.1.queue0.no_desc_avail: 0
dev.em.1.queue0.rxd_head: 1065
dev.em.1.queue0.rxd_tail: 1064
dev.em.1.queue0.rx_irq: 14386949031
dev.em.1.mac_stats.excess_coll: 0
dev.em.1.mac_stats.single_coll: 0
dev.em.1.mac_stats.multiple_coll: 0
dev.em.1.mac_stats.late_coll: 0
dev.em.1.mac_stats.collision_count: 0
dev.em.1.mac_stats.symbol_errors: 0
dev.em.1.mac_stats.sequence_errors: 0
dev.em.1.mac_stats.defer_count: 0
dev.em.1.mac_stats.missed_packets: 123241   <---------
dev.em.1.mac_stats.recv_no_buff: 29951  <----------
dev.em.1.mac_stats.recv_undersize: 0
dev.em.1.mac_stats.recv_fragmented: 0
dev.em.1.mac_stats.recv_oversize: 0
dev.em.1.mac_stats.recv_jabber: 0
dev.em.1.mac_stats.recv_errs: 0
dev.em.1.mac_stats.crc_errs: 0
dev.em.1.mac_stats.alignment_errs: 0
dev.em.1.mac_stats.coll_ext_errs: 0
dev.em.1.mac_stats.xon_recvd: 0
dev.em.1.mac_stats.xon_txd: 0
dev.em.1.mac_stats.xoff_recvd: 0
dev.em.1.mac_stats.xoff_txd: 0
dev.em.1.mac_stats.total_pkts_recvd: 72087571174
dev.em.1.mac_stats.good_pkts_recvd: 72087447931
dev.em.1.mac_stats.bcast_pkts_recvd: 143701030
dev.em.1.mac_stats.mcast_pkts_recvd: 80133
dev.em.1.mac_stats.rx_frames_64: 22231874204
dev.em.1.mac_stats.rx_frames_65_127: 36577432236
dev.em.1.mac_stats.rx_frames_128_255: 63194749
dev.em.1.mac_stats.rx_frames_256_511: 181871202
dev.em.1.mac_stats.rx_frames_512_1023: 249092520
dev.em.1.mac_stats.rx_frames_1024_1522: 12783983019
dev.em.1.mac_stats.good_octets_recvd: 23699741522520
dev.em.1.mac_stats.good_octets_txd: 104452730965535
dev.em.1.mac_stats.total_pkts_txd: 77376109147
dev.em.1.mac_stats.good_pkts_txd: 77376109142
dev.em.1.mac_stats.bcast_pkts_txd: 15732
dev.em.1.mac_stats.mcast_pkts_txd: 12
dev.em.1.mac_stats.tx_frames_64: 261580498
dev.em.1.mac_stats.tx_frames_65_127: 6970447634
dev.em.1.mac_stats.tx_frames_128_255: 57377612
dev.em.1.mac_stats.tx_frames_256_511: 113729872
dev.em.1.mac_stats.tx_frames_512_1023: 753508328
dev.em.1.mac_stats.tx_frames_1024_1522: 69219465205
dev.em.1.mac_stats.tso_txd: 0
dev.em.1.mac_stats.tso_ctx_fail: 0
dev.em.1.interrupts.asserts: 21011
dev.em.1.interrupts.rx_pkt_timer: 1
dev.em.1.interrupts.rx_abs_timer: 0
dev.em.1.interrupts.tx_pkt_timer: 0
dev.em.1.interrupts.tx_abs_timer: 1
dev.em.1.interrupts.tx_queue_empty: 0
dev.em.1.interrupts.tx_queue_min_thresh: 0
dev.em.1.interrupts.rx_desc_min_thresh: 0
dev.em.1.interrupts.rx_overrun: 14   <------
hw.em.eee_setting: 0
hw.em.rx_process_limit: 100
hw.em.enable_msix: 1
hw.em.sbp: 0
hw.em.smart_pwr_down: 0
hw.em.txd: 2048
hw.em.rxd: 2048
hw.em.rx_abs_int_delay: 66
hw.em.tx_abs_int_delay: 66
hw.em.rx_int_delay: 0
hw.em.tx_int_delay: 66

Jason
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"


