Date: Sat, 24 Apr 2010 09:34:30 +0200
From: Bernhard Schmidt <bschmidt@techwires.net>
To: Garrett Cooper <yanefbsd@gmail.com>
Cc: Brandon Gooch <jamesbrandongooch@gmail.com>, Olivier Cochard-Labbé <olivier@cochard.me>, freebsd-stable@freebsd.org
Subject: Re: iwn firmware instability with an up-to-date stable kernel
Message-ID: <20100424073430.GB62910@mx.techwires.net>
In-Reply-To: <l2z7d6fde3d1004232327heb8a744alfa02b81f199876ff@mail.gmail.com>
References: <w2s3131aa531004171849i12348bdbt12dfbb18c1f71bc2@mail.gmail.com> <20100418081400.GA40496@mx.techwires.net> <r2q3131aa531004180456w49eea301t526d305c8e7a980a@mail.gmail.com> <v2w7d6fde3d1004231929yb5b54ac6rc3a90276014176b0@mail.gmail.com> <r2u7d6fde3d1004231932rea2235a1r2dcd8f1973fcb1b4@mail.gmail.com> <s2m179b97fb1004232005geee6fbb6q9776a119f5b477db@mail.gmail.com> <u2v7d6fde3d1004232142yb2037851ne906f38ccc8039e9@mail.gmail.com> <y2p7d6fde3d1004232159hcc0c02bcpb6ee0910d76672a7@mail.gmail.com> <i2h179b97fb1004232208za4b1dbbekef363ec0fa522e0@mail.gmail.com> <l2z7d6fde3d1004232327heb8a744alfa02b81f199876ff@mail.gmail.com>

On Fri, Apr 23, 2010 at 11:27:32PM -0700, Garrett Cooper wrote:
> On Fri, Apr 23, 2010 at 10:08 PM, Brandon Gooch
> <jamesbrandongooch@gmail.com> wrote:
> > On Sat, Apr 24, 2010 at 4:59 AM, Garrett Cooper <yanefbsd@gmail.com> wrote:
> >> On Fri, Apr 23, 2010 at 9:42 PM, Garrett Cooper <yanefbsd@gmail.com> wrote:
> >>> On Fri, Apr 23, 2010 at 8:05 PM, Brandon Gooch
> >>> <jamesbrandongooch@gmail.com> wrote:
> >>>> 2010/4/23 Garrett Cooper <yanefbsd@gmail.com>:
> >>>>> 2010/4/23 Garrett Cooper <yanefbsd@gmail.com>:
> >>>>>> 2010/4/18 Olivier Cochard-Labbé <olivier@cochard.me>:
> >>>>>>> 2010/4/18 Bernhard Schmidt <bschmidt@techwires.net>:
> >>>>>>>> Are you able to reproduce this on demand? As in type a few commands and
> >>>>>>>> the firmware error occurs?
> >>>>>>>>
> >>>>>>>
> >>>>>>> No, I'm not able to reproduce on demand this problem.
> >>>>>>
> >>>>>> I'm seeing similar issues on occasion with my Lenovo as well:
> >>>>>>
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: firmware error log:
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: error type = "NMI_INTERRUPT_WDG" (0x00000004)
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: program counter = 0x0000046C
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: source line = 0x000000D0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: error data = 0x0000000207030000
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: branch link = 0x00008370000004C2
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: interrupt link = 0x000006DA000018B8
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: time = 4287402440
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: driver status:
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 0: qid=0 cur=1 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 1: qid=1 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 2: qid=2 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 3: qid=3 cur=36 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 4: qid=4 cur=123 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 5: qid=5 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 6: qid=6 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 7: qid=7 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 8: qid=8 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 9: qid=9 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 10: qid=10 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 11: qid=11 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 12: qid=12 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 13: qid=13 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 14: qid=14 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: tx ring 15: qid=15 cur=0 queued=0
> >>>>>> Apr 23 19:25:24 garrcoop-fbsd kernel: rx ring: cur=8
> >>>>>>
> >>>>>> This may be because the system was under load (I was installing a port
> >>>>>> shortly before the connection dropped). I'll try poking at this
> >>>>>> further because it's going to be an annoying productivity loss :/.
> >>>>>
> >>>>> Sorry... should have included more helpful details.
> >>>>> Thanks,
> >>>>> -Garrett
> >>>>>
> >>>>> dmesg:
> >>>>>
> >>>>> iwn0: <Intel(R) PRO/Wireless 4965BGN> mem 0xdf2fe000-0xdf2fffff irq 17 at device 0.0 on pci3
> >>>>> iwn0: MIMO 2T3R, MoW1, address 00:1d:e0:7d:9f:c7
> >>>>> iwn0: [ITHREAD]
> >>>>> iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
> >>>>> iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
> >>>>> iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
> >>>>>
> >>>>> pciconf -lv snippet:
> >>>>>
> >>>>> iwn0@pci0:3:0:0: class=0x028000 card=0x11108086 chip=0x42308086 rev=0x61 hdr=0x00
> >>>>>     vendor = 'Intel Corporation'
> >>>>>     device = 'Intel Wireless WiFi Link 4965AGN (Intel 4965AGN)'
> >>>>>     class = network
> >>>>> cbb0@pci0:21:0:0: class=0x060700 card=0x20c617aa chip=0x04761180 rev=0xba hdr=0x02
> >>>>>
> >>>>> uname -a:
> >>>>>
> >>>>> $ uname -a
> >>>>> FreeBSD garrcoop-fbsd.cisco.com 8.0-STABLE FreeBSD 8.0-STABLE #0 r207006: Wed Apr 21 13:18:44 PDT 2010 root@garrcoop-fbsd.cisco.com:/usr/obj/usr/src/sys/LAPPY_X86 i386
> >>>>
> >>>> I'm actually looking at this right now. For me, it's actually
> >>>> happening when my machine stays on overnight (or for long periods of
> >>>> time, idle).
> >>>>
> >>>> Also, it seems to be causing the kernel to panic, although I'm now
> >>>> wondering if the Machine Check Architecture is somehow catching this
> >>>> device error and causing an exception (hw.mca.enabled=1)(?) -- not
> >>>> possible, right ???
> >>>>
> >>>> Whatever the case, I can't seem to get the firmware error to occur
> >>>> with iwn(4) debugging or wlandebug options enabled, so who knows
> >>>> exactly what leads to this.
> >>>>
> >>>> I know Bernhard has worked hard on this driver, it's a shame that this
> >>>> freaky bug has bit us all now, without leaving many clues :(
> >>>>
> >>>> I've attached a textdump for posterity if nothing else :)
> >>>
> >>> Connectivity appears to be shoddy in my neck of the woods (kind of
> >>> ironic... but meh). Just running buildworld, buildkernel, then doing a
> >>> tcpdump in parallel causes the pseudo device to go up and down a lot.
> >>> I assume this isn't standard behavior?
> >>> Just for reference buildworld was started shortly after 19:39:05,
> >>> and it finished at 21:29. The interface has also gone up and down once
> >>> since then while the system's been basically idle.
> >>
> >> Hmmm... I seem to be in an excellent position to reproduce this
> >> issue. I've reproduced it twice by merely bringing the interface up
> >> and down several times using:
> >>
> >> ifconfig_wlan0="WPA DHCP"
> >>
> >> instead of my usual:
> >>
> >> ifconfig_wlan0="WPA ssid <base-station-id1> DHCP"
> >>
> >> Maybe others who are experiencing the issue should try that? I'll
> >> do more testing when I get home...

How did you do that? Reloading the module, or with ifconfig?

> >
> > My rc.conf is:
> >
> > ifconfig_wlan0="WPA DHCP"
> >
> > ...as well, although I haven't tried manually taking the interface
> > down and bringing it back up.
>
> Hmmm... that is interesting. I wish I could do that, but it seems to
> be eluding my grasp right now. The driver just kind of freaks out
> with a bunch of SSIDs, one being my target SSID, a bunch of NUL string
> ones, and then finally it just croaks. I need to figure out whether or
> not the SSIDs are valid when I boot it up at my desk again.
>
> > Are you waiting for the device to associate and begin passing traffic
> > before each up/down cycle?
>
> I was, but I'm not sure whether or not the Ajax pieces in GMail were.
> I'll try some more rudimentary tests when I get back to work on Monday
> in that environment, but I need to try out other things at home as
> well in the meantime.
>
> Thanks,
> -Garrett

-- 
Bernhard
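
For anyone comparing dumps across reports in this thread, the firmware error log quoted above is regular enough to parse mechanically. What follows is only a minimal sketch in Python, assuming the syslog layout shown in Garrett's report; the function name `parse_iwn_dump` and both regular expressions are illustrative and are not part of iwn(4) or any FreeBSD tool.

```python
import re

# Patterns modelled on the lines quoted in this thread, for example:
#   kernel: error type = "NMI_INTERRUPT_WDG" (0x00000004)
#   kernel: tx ring 3: qid=3 cur=36 queued=0
# Both are assumptions about the log layout, not a documented format.
ERROR_TYPE = re.compile(r'error type\s*=\s*"(\w+)"\s*\((0x[0-9A-Fa-f]+)\)')
TX_RING = re.compile(r"tx ring\s+(\d+): qid=(\d+) cur=(\d+) queued=(\d+)")


def parse_iwn_dump(text):
    """Return (error_name, error_code, rings) from an iwn firmware error dump.

    rings maps the ring index to a dict with the qid, cur and queued counters.
    error_name and error_code are None when no "error type" line is present.
    """
    # Mail quoting wraps long lines and prefixes them with '>' markers,
    # so collapse quote markers and whitespace into single spaces first.
    flat = re.sub(r"[>\s]+", " ", text)

    match = ERROR_TYPE.search(flat)
    if match:
        name, code = match.group(1), int(match.group(2), 16)
    else:
        name, code = None, None

    rings = {
        int(ring): {"qid": int(qid), "cur": int(cur), "queued": int(queued)}
        for ring, qid, cur, queued in TX_RING.findall(flat)
    }
    return name, code, rings


if __name__ == "__main__":
    sample = """\
kernel: firmware error log:
kernel: error type = "NMI_INTERRUPT_WDG" (0x00000004)
kernel: tx ring 3: qid=3 cur=36 queued=0
kernel: tx ring 4: qid=4 cur=123 queued=0
"""
    name, code, rings = parse_iwn_dump(sample)
    # List the rings whose cur index has moved away from zero.
    active = sorted(idx for idx, ring in rings.items() if ring["cur"] > 0)
    print(name, hex(code), active)
```

Stripping the `>` markers before matching means a dump pasted straight from a quoted mail parses the same way as one copied from /var/log/messages. The sketch only extracts the counters; it makes no claim about what a given value means to the firmware.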