From owner-freebsd-hardware@FreeBSD.ORG Tue Feb 1 20:15:22 2011
Message-ID: <4D4869D1.1010709@sentex.net>
Date: Tue, 01 Feb 2011 15:15:13 -0500
From: Mike Tancsa
Organization: Sentex Communications
To: Jack Vogel
Cc: "freebsd-net@freebsd.org", Ivan Voras, Sean Bruno, Jan Koum, "freebsd-hardware@freebsd.org"
Subject: Re: em driver, 82574L chip, and possibly ASPM
References: <1290533941.3173.50.camel@home-yahoo> <4CEC0548.1080801@sentex.net> <4D2C636B.5040003@sentex.net> <4D3C4795.40205@sentex.net> <4D42EA74.4090807@sentex.net> <1296590190.2326.6.camel@hitfishpass-lx.corp.yahoo.com>

On 2/1/2011 3:05 PM, Jack Vogel wrote:
> At this point I'm open to any ideas, this
> sounds like a good one Sean,
> thanks.
> Mike, you want to test this ?

Sure, I am feeling lucky ;-) If someone generates the appropriate em
diffs for me, I will apply them on the box that sees this issue the most.

	---Mike

>
> Jack
>
>
> On Tue, Feb 1, 2011 at 11:56 AM, Sean Bruno wrote:
>
>> On Fri, 2011-01-28 at 08:10 -0800, Mike Tancsa wrote:
>>> On 1/23/2011 10:21 AM, Mike Tancsa wrote:
>>>> On 1/21/2011 4:21 AM, Jan Koum wrote:
>>>> One other thing I noticed is that when the nic is in its hung state, the
>>>> WOL option is gone ?
>>>>
>>>> e.g
>>>>
>>>> em1: flags=8843 metric 0 mtu 1500
>>>>         options=19b
>>>>         ether 00:15:17:ed:68:a4
>>>>
>>>> vs
>>>>
>>>>
>>>> em1: flags=8843 metric 0 mtu 1500
>>>>         options=219b
>>>>         ether 00:15:17:ed:68:a4
>>>
>>>
>>> Another hang last night :(
>>>
>>> What's really strange is that the WOL_MAGIC and TSO4 got turned back on
>>> somehow ? I had explicitly turned it off, but when the NIC was in its
>>> bad state
>>>
>>> em1: flags=8843 metric 0 mtu 1500
>>>         options=2198
>>>
>>> ... it's back on along with TSO? Not sure if it's coincidence or a side
>>> effect or what. For now, I have had to re-purpose this nic to something
>>> else.
>>>
>>> debug info shows
>>>
>>> Jan 28 00:25:10 backup3 kernel: Interface is RUNNING and INACTIVE
>>> Jan 28 00:25:10 backup3 kernel: em1: hw tdh = 625, hw tdt = 625
>>> Jan 28 00:25:10 backup3 kernel: em1: hw rdh = 903, hw rdt = 903
>>> Jan 28 00:25:10 backup3 kernel: em1: Tx Queue Status = 0
>>> Jan 28 00:25:10 backup3 kernel: em1: TX descriptors avail = 1024
>>> Jan 28 00:25:10 backup3 kernel: em1: Tx Descriptors avail failure = 0
>>> Jan 28 00:25:10 backup3 kernel: em1: RX discarded packets = 0
>>> Jan 28 00:25:10 backup3 kernel: em1: RX Next to Check = 903
>>> Jan 28 00:25:10 backup3 kernel: em1: RX Next to Refresh = 904
>>> Jan 28 00:25:27 backup3 kernel: em1: link state changed to DOWN
>>> Jan 28 00:25:30 backup3 kernel: em1: link state changed to UP
>>>
>>>
>>>         ---Mike
>>
>>
>> I'm trying to get some more testing done regarding my suggestions around
>> the OACTIVE assertions in the driver. More or less, it looks like
>> intense periods of activity can push the driver into the OACTIVE hold
>> off state and the logic isn't quite right in igb(4) or em(4) to handle
>> it.
>>
>> I suspect that something like this modification to igb(4) may be
>> required for em(4).
>>
>> Comments?
>>
>> Sean
>>
>

--
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
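
A note on the options= values quoted in this thread (19b, 219b, 2198): the
flag names are missing from the ifconfig output as archived, but the hex
value is a bitmask of interface capabilities and can be decoded by hand.
Below is a minimal Python sketch of that decoding. The bit values are the
IFCAP_* constants as I understand them from FreeBSD's net/if.h of that era,
so treat the table as an assumption and check it against your own source
tree. The helper names decode_options and diff_options are made up for
illustration and are not part of any FreeBSD tool.

```python
# Capability bits, assumed to match IFCAP_* in FreeBSD's sys/net/if.h.
# Only the low 14 bits are listed; verify against your own tree.
IFCAP_BITS = [
    (0x0001, "RXCSUM"),
    (0x0002, "TXCSUM"),
    (0x0004, "NETCONS"),
    (0x0008, "VLAN_MTU"),
    (0x0010, "VLAN_HWTAGGING"),
    (0x0020, "JUMBO_MTU"),
    (0x0040, "POLLING"),
    (0x0080, "VLAN_HWCSUM"),
    (0x0100, "TSO4"),
    (0x0200, "TSO6"),
    (0x0400, "LRO"),
    (0x0800, "WOL_UCAST"),
    (0x1000, "WOL_MCAST"),
    (0x2000, "WOL_MAGIC"),
]


def decode_options(value):
    """Return the capability names set in an ifconfig options= bitmask."""
    return [name for bit, name in IFCAP_BITS if value & bit]


def diff_options(before, after):
    """Return (added, removed) capability names between two bitmasks."""
    b = set(decode_options(before))
    a = set(decode_options(after))
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # The three values quoted in the thread.
    for raw in ("19b", "219b", "2198"):
        print(raw, ",".join(decode_options(int(raw, 16))))
    print(diff_options(0x19B, 0x219B))
```

Under those assumed bit values, 19b and 219b differ only in WOL_MAGIC, and
2198 has TSO4 and WOL_MAGIC set with RXCSUM and TXCSUM clear, which is
consistent with what Mike describes above.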