From owner-freebsd-stable@FreeBSD.ORG Sat Apr 24 04:59:19 2010
References: <20100418081400.GA40496@mx.techwires.net>
Date: Fri, 23 Apr 2010 21:59:17 -0700
From: Garrett Cooper
To: Brandon Gooch
Cc: Olivier Cochard-Labbé, Bernhard Schmidt, freebsd-stable@freebsd.org
Subject: Re: iwn firmware instability
 with an up-to-date stable kernel
List-Id: Production branch of FreeBSD source code

On Fri, Apr 23, 2010 at 9:42 PM, Garrett Cooper wrote:
> On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch wrote:
>> 2010/4/23 Garrett Cooper:
>>> 2010/4/23 Garrett Cooper:
>>>> 2010/4/18 Olivier Cochard-Labbé:
>>>>> 2010/4/18 Bernhard Schmidt:
>>>>>> Are you able to reproduce this on demand? As in type a few commands and
>>>>>> the firmware error occurs?
>>>>>>
>>>>>
>>>>> No, I'm not able to reproduce on demand this problem.
>>>>
>>>> I'm seeing similar issues on occasion with my Lenovo as well:
>>>>
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: error type      = "NMI_INTERRUPT_WDG" (0x00000004)
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x0000046C
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: source line     = 0x000000D0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: error data      = 0x0000000207030000
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: branch link     = 0x00008370000004C2
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link  = 0x000006DA000018B8
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: time            = 4287402440
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  2: qid=2  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  5: qid=5  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  6: qid=6  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  7: qid=7  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  8: qid=8  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  9: qid=9  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8
>>>>
>>>> This may be because the system was under load (I was installing a port
>>>> shortly before the connection dropped). I'll try poking at this
>>>> further because it's going to be an annoying productivity loss :/.
>>>
>>>    Sorry... should have included more helpful details.
>>> Thanks,
>>> -Garrett
>>>
>>> dmesg:
>>>
>>> iwn0: mem 0xdf2fe000-0xdf2fffff irq 17 at device 0.0 on pci3
>>> iwn0: MIMO 2T3R, MoW1, address 00:1d:e0:7d:9f:c7
>>> iwn0: [ITHREAD]
>>> iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
>>> iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
>>> iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
>>>
>>> pciconf -lv snippet:
>>>
>>> iwn0@pci0:3:0:0:        class=0x028000 card=0x11108086 chip=0x42308086 rev=0x61 hdr=0x00
>>>    vendor     = 'Intel Corporation'
>>>    device     = 'Intel Wireless WiFi Link 4965AGN (Intel 4965AGN)'
>>>    class      = network
>>> cbb0@pci0:21:0:0:       class=0x060700 card=0x20c617aa chip=0x04761180 rev=0xba hdr=0x02
>>>
>>> uname -a:
>>>
>>> $ uname -a
>>> FreeBSD garrcoop-fbsd.cisco.com 8.0-STABLE FreeBSD 8.0-STABLE #0
>>> r207006: Wed Apr 21 13:18:44 PDT 2010
>>> root@garrcoop-fbsd.cisco.com:/usr/obj/usr/src/sys/LAPPY_X86  i386
>>
>> I'm actually looking at this right now. For me, it's actually
>> happening when my machine stays on overnight (or for long periods of
>> time, idle).
>>
>> Also, it seems to be causing the kernel to panic, although I'm now
>> wondering if the Machine Check Architecture is somehow catching this
>> device error and causing an exception (hw.mca.enabled=1)(?) -- not
>> possible, right ???
>>
>> Whatever the case, I can't seem to get the firmware error to occur
>> with iwn(4) debugging or wlandebug options enabled, so who knows
>> exactly what leads to this.
>>
>> I know Bernhard has worked hard on this driver, it's a shame that this
>> freaky bug has bit us all now, without leaving many clues :(
>>
>> I've attached a textdump for posterity if nothing else :)
>
>    Connectivity appears to be shoddy in my neck of the woods (kind of
> ironic... but meh).
> Just running buildworld, buildkernel, then doing a
> tcpdump in parallel causes the pseudo device to go up and down a lot.
> I assume this isn't standard behavior?
>    Just for reference buildworld was started shortly after 19:39:05,
> and it finished at 21:29. The interface has also gone up and down once
> since then while the system's been basically idle.

Hmmm... I seem to be in an excellent position to reproduce this
issue. I've reproduced it twice by merely bringing the interface up
and down several times using:

ifconfig_wlan0="WPA DHCP"

instead of my usual:

ifconfig_wlan0="WPA ssid DHCP"

Maybe others who are experiencing the issue should try that? I'll do
more testing when I get home...

Thanks,
-Garrett