Date: Fri, 23 Apr 2010 21:59:17 -0700
From: Garrett Cooper <yanefbsd@gmail.com>
To: Brandon Gooch <jamesbrandongooch@gmail.com>
Cc: Olivier Cochard-Labbé <olivier@cochard.me>, Bernhard Schmidt <bschmidt@techwires.net>, freebsd-stable@freebsd.org
Subject: Re: iwn firmware instability with an up-to-date stable kernel
Message-ID: <y2p7d6fde3d1004232159hcc0c02bcpb6ee0910d76672a7@mail.gmail.com>
In-Reply-To: <u2v7d6fde3d1004232142yb2037851ne906f38ccc8039e9@mail.gmail.com>
References: <w2s3131aa531004171849i12348bdbt12dfbb18c1f71bc2@mail.gmail.com> <20100418081400.GA40496@mx.techwires.net> <r2q3131aa531004180456w49eea301t526d305c8e7a980a@mail.gmail.com> <v2w7d6fde3d1004231929yb5b54ac6rc3a90276014176b0@mail.gmail.com> <r2u7d6fde3d1004231932rea2235a1r2dcd8f1973fcb1b4@mail.gmail.com> <s2m179b97fb1004232005geee6fbb6q9776a119f5b477db@mail.gmail.com> <u2v7d6fde3d1004232142yb2037851ne906f38ccc8039e9@mail.gmail.com>

On Fri, Apr 23, 2010 at 9:42 PM, Garrett Cooper <yanefbsd@gmail.com> wrote:
> On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch
> <jamesbrandongooch@gmail.com> wrote:
>> 2010/4/23 Garrett Cooper <yanefbsd@gmail.com>:
>>> 2010/4/23 Garrett Cooper <yanefbsd@gmail.com>:
>>>> 2010/4/18 Olivier Cochard-Labbé <olivier@cochard.me>:
>>>>> 2010/4/18 Bernhard Schmidt <bschmidt@techwires.net>:
>>>>>> Are you able to reproduce this on demand? As in type a few commands and
>>>>>> the firmware error occurs?
>>>>>>
>>>>>
>>>>> No, I'm not able to reproduce on demand this problem.
>>>>
>>>> I'm seeing similar issues on occasion with my Lenovo as well:
>>>>
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: error type      = "NMI_INTERRUPT_WDG" (0x00000004)
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x0000046C
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: source line     = 0x000000D0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: error data      = 0x0000000207030000
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: branch link     = 0x00008370000004C2
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link  = 0x000006DA000018B8
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: time            = 4287402440
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  0: qid=0  cur=1   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  1: qid=1  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  2: qid=2  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  3: qid=3  cur=36  queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  4: qid=4  cur=123 queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  5: qid=5  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  6: qid=6  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  7: qid=7  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  8: qid=8  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring  9: qid=9  cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0   queued=0
>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8
>>>>
>>>> This may be because the system was under load (I was installing a port
>>>> shortly before the connection dropped). I'll try poking at this
>>>> further because it's going to be an annoying productivity loss :/.
>>>
>>>    Sorry... should have included more helpful details.
>>> Thanks,
>>> -Garrett
>>>
>>> dmesg:
>>>
>>> iwn0: <Intel(R) PRO/Wireless 4965BGN> mem 0xdf2fe000-0xdf2fffff irq 17 at device 0.0 on pci3
>>> iwn0: MIMO 2T3R, MoW1, address 00:1d:e0:7d:9f:c7
>>> iwn0: [ITHREAD]
>>> iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
>>> iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
>>> iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
>>>
>>> pciconf -lv snippet:
>>>
>>> iwn0@pci0:3:0:0:        class=0x028000 card=0x11108086 chip=0x42308086 rev=0x61 hdr=0x00
>>>    vendor     = 'Intel Corporation'
>>>    device     = 'Intel Wireless WiFi Link 4965AGN (Intel 4965AGN)'
>>>    class      = network
>>> cbb0@pci0:21:0:0:       class=0x060700 card=0x20c617aa chip=0x04761180 rev=0xba hdr=0x02
>>>
>>> uname -a:
>>>
>>> $ uname -a
>>> FreeBSD garrcoop-fbsd.cisco.com 8.0-STABLE FreeBSD 8.0-STABLE #0
>>> r207006: Wed Apr 21 13:18:44 PDT 2010
>>> root@garrcoop-fbsd.cisco.com:/usr/obj/usr/src/sys/LAPPY_X86  i386
>>
>> I'm actually looking at this right now. For me, it's actually
>> happening when my machine stays on overnight (or for long periods of
>> time, idle).
>>
>> Also, it seems to be causing the kernel to panic, although I'm now
>> wondering if the Machine Check Architecture is somehow catching this
>> device error and causing an exception (hw.mca.enabled=1)(?) -- not
>> possible, right ???
>>
>> Whatever the case, I can't seem to get the firmware error to occur
>> with iwn(4) debugging or wlandebug options enabled, so who knows
>> exactly what leads to this.
>>
>> I know Bernhard has worked hard on this driver, it's a shame that this
>> freaky bug has bit us all now, without leaving many clues :(
>>
>> I've attached a textdump for posterity if nothing else :)
>
>    Connectivity appears to be shoddy in my neck of the woods (kind of
> ironic... but meh).
> Just running buildworld, buildkernel, then doing a
> tcpdump in parallel causes the pseudo device to go up and down a lot.
> I assume this isn't standard behavior?
>    Just for reference buildworld was started shortly after 19:39:05,
> and it finished at 21:29. The interface has also gone up and down once
> since then while the system's been basically idle.

Hmmm... I seem to be in an excellent position to reproduce this issue.
I've reproduced it twice by merely bringing the interface up and down
several times using:

ifconfig_wlan0="WPA DHCP"

instead of my usual:

ifconfig_wlan0="WPA ssid <base-station-id1> DHCP"

Maybe others who are experiencing the issue should try that? I'll do
more testing when I get home...
Thanks,
-Garrett
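
The reproduction described above (bouncing the interface several times with ifconfig_wlan0="WPA DHCP" set in rc.conf) could be scripted roughly as follows. This is only a sketch under stated assumptions: the message does not say which command was used to bring the interface up and down, so `/etc/rc.d/netif restart` is a guess, and the variable names IFACE, COUNT, DELAY and DRY_RUN are made up for illustration. The string it greps for is the "firmware error log:" line from the kernel log quoted earlier in the thread.

```shell
#!/bin/sh
# Hypothetical reproduction loop, not taken from the thread.
# IFACE, COUNT, DELAY and DRY_RUN are illustrative names.
IFACE="${IFACE:-wlan0}"
COUNT="${COUNT:-5}"      # how many times to bounce the interface
DELAY="${DELAY:-30}"     # seconds to wait after each restart
DRY_RUN="${DRY_RUN:-1}"  # default: only print what would be run

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

i=1
while [ "$i" -le "$COUNT" ]; do
    # Assumption: restarting via the rc script is how the interface is bounced.
    run /etc/rc.d/netif restart "$IFACE"
    run sleep "$DELAY"
    # Only inspect the kernel message buffer on a real run.
    if [ "$DRY_RUN" != "1" ] && dmesg | grep -q "firmware error log"; then
        echo "firmware error seen after $i restart(s)"
        break
    fi
    i=$((i + 1))
done
```

With the defaults the script only prints the commands. Setting DRY_RUN=0 and running it as root would perform the restarts. Note that dmesg keeps older messages, so a firmware error logged before the run started would also match.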