FreeBSD Mail Archives

Date:      Fri, 12 Sep 2014 12:06:13 -0400
From:      Mike Tancsa <mike@sentex.net>
To:        Jack Vogel <jfvogel@gmail.com>
Cc:        "stable@freebsd.org" <stable@freebsd.org>
Subject:   Re: svn commit: r267935 - head/sys/dev/e1000 (with work around?)
Message-ID:  <541319F5.1020502@sentex.net>
In-Reply-To: <5412FEAB.1050707@sentex.net>
References:  <201406262133.s5QLXXP8029811@svn.freebsd.org> <b0afc76e77f28b14683094e1b59a4ccf@eumx.net> <CALCpEUFL26Pg%2BpYoP4KKEAzBsFF8fDunMewy%2BgqwU7o4ob8Zeg@mail.gmail.com> <CAFOYbcnLBW-AUHQx7KMQsxRE_Xy-1_ia2dCY4MeGV_8LWgrHDw@mail.gmail.com> <CAPyFy2AvMf42QGsYDrEb5E6%2Bse8scF9BXcwUugjCtx4t2D8sJA@mail.gmail.com> <CAFOYbc=i%2B=Gv6=_WPcXSo=Ds1Y3mw6mtevPGCxQ5HJPtu55mOw@mail.gmail.com> <20140804212220.GC48614@rancor.immure.com> <CAFOYbc=5wyo%2BbKwxdhsORH6WRRRDZReitL5wrCnp9dgT7qAVrQ@mail.gmail.com> <20140805130144.GF40246@rancor.immure.com> <CAFOYbcmuA1aiDCkHuJK%2B0EfO%2BnfGr4UrnJ7zB1rnPaSJn4AyQA@mail.gmail.com> <53E51D62.9000507@sentex.net> <53E52762.7040300@sentex.net> <53E536AC.9060304@sentex.net> <53E572B6.1090908@sentex.net> <5412FEAB.1050707@sentex.net>



On 9/12/2014 10:09 AM, Mike Tancsa wrote:
>
> FYI, I just ran into this bug on another box, with an onboard em NIC, so
> I don't think it's a one-off hardware issue. AMD64, FreeBSD 10.0-STABLE
> #4 r270560:
> This is on an Intel MB S1200BTL (S1200BT.86B.02.00.0035.030220120927)
>
> Unfortunately, this is also a production box so it's difficult to test. I
> am going to see if I can find a similar MB to test against.


I found another board I can test with. It takes a bit of random traffic
to wedge, but I can lock up the NIC to the point where I have to bring
the interface down and back up to recover.

When the NIC is wedged, running sysctl -w dev.em.1.debug=1 shows


Sep 12 11:05:05 backup3 kernel: Interface is RUNNING and ACTIVE
Sep 12 11:05:05 backup3 kernel: em1: hw tdh = 414, hw tdt = 980
Sep 12 11:05:05 backup3 kernel: em1: hw rdh = 768, hw rdt = 767
Sep 12 11:05:05 backup3 kernel: em1: Tx Queue Status = 1
Sep 12 11:05:05 backup3 kernel: em1: TX descriptors avail = 449
Sep 12 11:05:05 backup3 kernel: em1: Tx Descriptors avail failure = 3
Sep 12 11:05:05 backup3 kernel: em1: RX discarded packets = 0
Sep 12 11:05:05 backup3 kernel: em1: RX Next to Check = 768
Sep 12 11:05:05 backup3 kernel: em1: RX Next to Refresh = 767
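
For anyone comparing several of these dumps, here is a minimal Python
sketch that pulls the hardware head/tail indices out of the debug lines
and estimates how many TX descriptors are still queued to the hardware
(the span from tdh up to tdt, with wrap-around). The helper names and the
ring size of 1024 are assumptions for illustration; the ring size is not
stated in this message, so adjust it to match the actual configuration.

```python
import re

# Debug lines copied from the dump above.
SAMPLE = """\
Sep 12 11:05:05 backup3 kernel: em1: hw tdh = 414, hw tdt = 980
Sep 12 11:05:05 backup3 kernel: em1: hw rdh = 768, hw rdt = 767
Sep 12 11:05:05 backup3 kernel: em1: TX descriptors avail = 449
"""

RING_SIZE = 1024  # assumption: the ring size is not given in the message


def parse_ring_pointers(log_text):
    """Return the hardware head/tail indices found in the debug output."""
    values = {}
    for name, number in re.findall(r"hw (tdh|tdt|rdh|rdt) = (\d+)", log_text):
        values[name] = int(number)
    return values


def outstanding(head, tail, ring_size=RING_SIZE):
    """Number of descriptors from head up to tail, allowing for wrap-around."""
    return (tail - head) % ring_size


ptrs = parse_ring_pointers(SAMPLE)
tx_pending = outstanding(ptrs["tdh"], ptrs["tdt"])
print(tx_pending)  # 566 with the values above
```

If tdh stays at the same value across repeated dumps while tdt is ahead
of it, that would be consistent with the transmit side being stalled.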

em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
         ether 00:15:17:ed:68:a4
         inet 1.1.1.2 netmask 0xffffff00 broadcast 1.1.1.255
         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
         media: Ethernet autoselect (1000baseT <full-duplex>)
         status: active

The traffic is mostly heavy NFS transfers. I found that if I disable
TSO on the NIC, it seems to fix the problem, or at least makes it hard
to reproduce. With TSO enabled, it took perhaps 30-120 seconds for the
problem to manifest. With TSO disabled, I have not run into the problem
in the past 45 minutes on either my test or my production box.
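
To confirm the workaround is actually in effect on a given box, a small
Python sketch like the one below can list the TSO-related capabilities
still present in the ifconfig options line. The message does not say how
TSO was turned off; "ifconfig em1 -tso" is the usual ifconfig(8) way and
is assumed here. The function names are hypothetical, and the sample
text is the pre-workaround output quoted above.

```python
import re

# ifconfig lines copied from the output above (TSO still enabled).
SAMPLE = """\
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
\toptions=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
\tnd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
"""


def capabilities(ifconfig_text):
    """Return the interface capability names from the 'options=' line."""
    # Anchor at line start so the 'nd6 options=' line is not matched.
    match = re.search(r"^\s*options=[0-9a-f]+<([^>]*)>", ifconfig_text,
                      re.MULTILINE)
    return set(match.group(1).split(",")) if match else set()


def tso_flags(ifconfig_text):
    """Return the capability names that mention TSO, sorted."""
    return sorted(c for c in capabilities(ifconfig_text) if "TSO" in c)


print(tso_flags(SAMPLE))  # ['TSO4', 'VLAN_HWTSO'] for the output above
```

Running it against fresh ifconfig output after the change should show
whether TSO4 is still listed.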

	---Mike




-- 
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


