Date:      Mon, 1 Apr 2013 10:25:01 -0700
From:      Jeremy Chadwick <jdc@koitsu.org>
To:        d@delphij.net
Cc:        Ryan McIntosh <rmcintosh@nitemare.net>, freebsd-stable@freebsd.org
Subject:   Re: 9.1-REL Supermicro H8DCL-iF kernel panic
Message-ID:  <20130401172501.GA12934@icarus.home.lan>
In-Reply-To: <5159B5FA.1080005@delphij.net>
References:  <CAEoCk-Pjfd-kkBnmbXnxbbsXxiu1PSrBzqW22y14utGBLaux0g@mail.gmail.com> <515937BF.9010805@delphij.net> <CAEoCk-N__WL%2BrLaYBFjHOuzaRg32h%2BxRYDB=KcqsoccH%2BHgSEA@mail.gmail.com> <51593BB8.4020403@delphij.net> <CAEoCk-MmOhs2Wor_ts3w0_C%2BKiu_TeB=Hv_knkqknr9-SxuLuA@mail.gmail.com> <20130401122550.GA7367@icarus.home.lan> <5159B5FA.1080005@delphij.net>

On Mon, Apr 01, 2013 at 09:29:46AM -0700, Xin Li wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
> 
> On 4/1/13 5:25 AM, Jeremy Chadwick wrote:
> > On Mon, Apr 01, 2013 at 05:45:48AM -0400, Ryan McIntosh wrote:
> >> I can confirm that works as intended. I appreciate the prompt
> >> response and it looks like there's a real fix.
> >> 
> >> For google reference for anyone else searching..
> >> 
> >> Motherboard: Supermicro H8DCL-iF
> >> OS: FreeBSD 9.1-RELEASE
> >> 
> >> Boot message:
> >> panic: m_getzone: m_getjcl: invalid cluster type
> >> cpuid = 0
> >> KDB: stack backtrace:
> >> #0 0xffffffff809208a6 at kdb_backtrace+0x66
> >> #1 0xffffffff808ea8be at panic+0x1ce
> >> #2 0xffffffff804ad5a7 at em_refresh_mbufs+0x207
> >> #3 0xffffffff804adb7f at em_rxeof+0x47f
> >> #4 0xffffffff804adca4 at em_msix_rx+0x24
> >> #5 0xffffffff808be8d4 at intr_event_execute_handlers+0x104
> >> #6 0xffffffff808c0076 at ithread_loop+0xa6
> >> #7 0xffffffff808bb9ef at fork_exit+0x11f
> >> #8 0xffffffff80bc368e at fork_trampoline+0xe
> >> 
> >> Panic image from H8DCL-iF: 
> >> http://nitemail.net/img/crash91-h8dcl-if.png
> >> 
> >> Original image from X8DTU-6+: 
> >> http://www.grosbein.net/img/crash-91rc.png
> >> 
> >> As per Xin Li, which seems to work: 
> >> http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_em.c?r1=238214&r2=239304&view=patch
> >> 
> >> References:
> >> http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063958.html
> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/172113
> >> 
> >> Thanks again,
> >> 
> >> Ryan McIntosh e: rmcintosh@nitemare.net
> >> 
> >> 
> >> On Mon, Apr 1, 2013 at 3:48 AM, Xin Li <delphij@delphij.net>
> >> wrote:
> >> 
> >>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
> >>> 
> >>> On 4/1/13 12:34 AM, Ryan McIntosh wrote:
> >>>> I could try that patch, however that was intended for
> >>>> if_igb.c which for my system (and the panic's are almost
> >>>> identical except if_em for me) I'd have to apply that fix to
> >>>> if_em.c and I haven't looked at the source just yet. If you
> >>>> can give me a patch I'll do apply and test it shortly
> >>>> though.
> >>> 
> >>> Try this:
> >>> 
> >>> http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_em.c?r1=238214&r2=239304&view=patch
> >>> 
> > 
> > Jack Vogel has stated it's not a "real fix" (your words) but rather
> > a "bandaid", for both igb(4) and em(4).  The commit messages (for
> > r238214 and r239304) contain details:
> > 
> > http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_em.c#rev238214
> > http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_em.c#rev239304
> 
> Hm, why is 238214 related, or did you mean the change between 238214
> and 239304?

Correct (the latter).  :-)  The "bandaid" in r239304 **wasn't** a fix for
a bug introduced in r238214; it was a general "bandaid".

I've gotten in the habit of always examining two commits (fix + previous
commit) to see what got introduced where.

> Yes, this is a bandaid, and the right fix should be to refactor the code
> a little bit to make sure that no interrupt handler is installed before
> the driver has done its other initialization, but I don't have hardware
> handy that can reproduce this issue to validate changes like that.

Yes, exactly.  I just want to make sure Ryan understands that this is
simply a workaround for said spurious interrupt scenario, while the
actual root cause needs to be dealt with as you describe.
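
To illustrate the difference between the workaround and the root cause,
here is a minimal toy model in C.  It is not the em(4) code: every name
in it (toy_adapter, toy_rx_intr_guarded, and so on) is hypothetical, and
the assumption that the early interrupt finds an unset receive buffer
size is my reading of the backtrace above (em_msix_rx -> em_rxeof ->
em_refresh_mbufs -> panic), not something confirmed from the driver
source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names come from em(4). */
struct toy_adapter {
    bool running;     /* true only after toy_init() has completed */
    int  rx_buf_size; /* 0 until toy_init() chooses a buffer size */
};

/* Stand-in for a buffer allocator that rejects sizes it does not know. */
static int toy_alloc_rx_buffer(int size)
{
    return (size == 2048 || size == 4096) ? 0 : -1;
}

/* Unguarded handler: trusts that initialization has already happened. */
static int toy_rx_intr_unguarded(struct toy_adapter *a)
{
    return toy_alloc_rx_buffer(a->rx_buf_size);
}

/* Guarded handler (the workaround): ignore interrupts until running. */
static int toy_rx_intr_guarded(struct toy_adapter *a)
{
    if (!a->running)
        return 0; /* spurious or early interrupt: do nothing */
    return toy_alloc_rx_buffer(a->rx_buf_size);
}

/* Initialization: pick a valid size first, then mark the device running. */
static void toy_init(struct toy_adapter *a)
{
    a->rx_buf_size = 2048;
    assert(toy_alloc_rx_buffer(a->rx_buf_size) == 0);
    a->running = true;
}
```

On a zero-initialized toy_adapter, toy_rx_intr_unguarded() returns -1
(the stand-in for the panic), while toy_rx_intr_guarded() returns 0 and
simply drops the event.  The guard hides the symptom; the refactor Xin
describes would instead guarantee that no handler can run before
toy_init() has finished.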

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


