Date:      Sat, 19 Jul 2008 12:45:33 -0700
From:      "Garrett Cooper" <yanefbsd@gmail.com>
To:        "Edward Ruggeri" <smallhand@crawblog.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: 7.0 CURRENT kernel's ath driver causes page fault, kernel panic (debugging kernel)
Message-ID:  <7d6fde3d0807191245g748c6823udf6c187cdfad1b9a@mail.gmail.com>
In-Reply-To: <919383240807181729n210402a5r5095f8b1554e9891@mail.gmail.com>
References:  <919383240807172100j35e1c796q513fa34d83f8e8e0@mail.gmail.com> <7d6fde3d0807180336h61f13a73pcc433be16a732c7e@mail.gmail.com> <919383240807181729n210402a5r5095f8b1554e9891@mail.gmail.com>

On Fri, Jul 18, 2008 at 5:29 PM, Edward Ruggeri <smallhand@crawblog.com> wrote:
> On Fri, Jul 18, 2008 at 6:36 AM, Garrett Cooper <yanefbsd@gmail.com> wrote:
>> Some notes:
>>
>> 1. *blinks*... I hope you mean 8-CURRENT, not 7-CURRENT. 7 hasn't been
>> CURRENT for some months now (~6 months IIRC).
>
> Oh my, I am an idiot.  I'm using 7-STABLE, making this the wrong list
> to ask; sorry.  I guess I could repost to freebsd-stable in addition
> to filing a PR.  Would that be wise?
>
>> 2. pciconf -lv might help with the PCI ID info. Then someone might be
>> able to tie your card back to the appropriate chipset.
>
> This gives me:
> ath0@pci0:3:0:0:        class=0x020000 card=0x058a1014 chip=0x1014168c
> rev=0x01 hdr=0x00
>    vendor     = 'Atheros Communications Inc.'
>    device     = 'AR5212 Atheros AR5212 802.11abg wireless'
>    class      = network
>    subclass   = ethernet
>   class      = base peripheral
>
> I get 167 pages on google that contain ar5212 and 0x1014168c and
> 0x058a1014.  I only get one with ar5006ex instead of ar5212.  I'm
> inclined to believe my Lenovo representative was wrong; she's just a
> sales rep and asked around about the part...
>
>> 3. KDB, DDB, WITNESS and INVARIANTS support compiled into the kernel
>> would be extremely helpful, if not required to debug your issue.
>
> I'm currently recompiling the kernel with these debug options:
>
> makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols
> options         KDB
> options         DDB
> options         INVARIANTS
> options         WITNESS
>
> As soon as it's done compiling, I'll try reproducing the error.  I've
> added "set dumpdev="/var/crash" in /etc/rc.conf.
>
>> As for the actual debug process, there's a spot in the dev handbook
>> about it (http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html),
>> but when I tried debugging my issue with NTFS and SMB I didn't really
>> find it helpful to be honest...
>
> Once I have a core dump, how should I proceed?  Use kdb, and execute
> "list *[instruction pointer]" to find out what (NULL) pointer is being
> dereferenced?  Run backtrace?  If I post a PR, is it likely that
> someone can guide me through this?  I'm fairly familiar with C, but my
> experience using debuggers is very limited...
>
>> You may also have to compile without SMP and with the 4BSD scheduler
>> just to see whether or not it's an issue reproducible with the ULE
>> scheduler, the driver, or something else...
>
> After I get the dump with the current options (+ debug options), I'll
> try w/o SMP and ULE...
>
>> Hopefully this gets you started on the right path...
>> -Garrett
>
> Thanks so much, Garrett!

No problem :).

The steps I just had you go through are meant to rule out the kernel
scheduler and multiprocessing -- two factors that make issues like
these difficult to debug. With all of the locking changes in
7-RELEASE, driver bugs are somewhat harder to track down.

The error itself seems straightforward, though (even if it may not
sound like it): it's an issue with how your card is interfacing with
the driver.
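
As a side note on tying the card back to a chipset: the chip= and
card= values in the pciconf output quoted above can be split into
their 16-bit halves. This is only a rough sketch, assuming the usual
pciconf layout (device ID in the upper 16 bits, vendor ID in the lower
16 bits, and likewise subdevice/subvendor for card=). The helper name
split_pci_id is made up for illustration and is not part of any
FreeBSD tool.

```python
def split_pci_id(value):
    """Split a 32-bit pciconf ID into (upper 16 bits, lower 16 bits)."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


# Values copied from the pciconf -lv output quoted in this thread.
# Assumption: chip= is (device << 16) | vendor, card= is
# (subdevice << 16) | subvendor.
device_id, vendor_id = split_pci_id(0x1014168C)
subdevice_id, subvendor_id = split_pci_id(0x058A1014)

print(f"vendor=0x{vendor_id:04x} device=0x{device_id:04x}")
print(f"subvendor=0x{subvendor_id:04x} subdevice=0x{subdevice_id:04x}")
```

Under that assumption the vendor half comes out as 0x168c, which is
the Atheros PCI vendor ID, and the subvendor half as 0x1014, which is
IBM's. Searching on the vendor/device pair rather than the full 32-bit
string may turn up more useful hits.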

-Garrett


