Date:      Wed, 13 Nov 2002 12:02:46 -0500 (EST)
From:      John Baldwin <jhb@FreeBSD.org>
To:        Dave Cornejo <dave@dogwood.com>
Cc:        freebsd-current@freebsd.org
Subject:   RE: current SMP kernel crashes (different?)
Message-ID:  <XFMail.20021113120246.jhb@FreeBSD.org>
In-Reply-To: <200211130637.gAD6btTC017406@white.dogwood.com>


On 13-Nov-2002 Dave Cornejo wrote:
> I've had a problem with a SuperMicro 2010H server crashing when
> attempting to run an SMP kernel.  I've noticed a lot of this lately,
> but this seems to be crashing in the clock code.  Below is the console
> output from power-up to crash.  If I use an UP kernel of the same
> vintage there is no problem.  If I'm reading the code correctly this
> seems to be a problem in APIC mode 8254 detection.
> 
> Does anyone have any idea why this is happening?  Any magical hints I
> could use to get past this?  I've tried disabling ACPI to no avail.

I think the ACPI PCI LNK code is messing up because with SMP we don't use
LNKs.  So you probably want to disable ACPI for now.  Is the panic the
same without ACPI?
 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; lapic.id = 00000000
> fault virtual address   = 0x6dbc00
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc02d7383
> stack pointer           = 0x10:0xc06decf8
> frame pointer           = 0x10:0xc06ded18
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 0 (swapper)
> kernel: type 12 trap, code=0
> Stopped at      _mtx_lock_flags+0x43:   cmpl    $0xc05216c0,0(%ebx)
> db> trace
> _mtx_lock_flags(6dbc00,0,c04c3c00,138,c02d767d) at _mtx_lock_flags+0x43
> ithread_remove_handler(c06ded80,c06ded78,c046c427,c06ded80,0) at ithread_remove_handler+0x53
> inthand_remove(c06ded80,0,c04e8e36,445,a0) at inthand_remove+0x11
> cpu_initclocks(c06ded98,c02bcf75,0,6db000,6dbc00) at cpu_initclocks+0x327
> initclocks(0,6db000,6dbc00,6db000,0) at initclocks+0x1c
> mi_startup() at mi_startup+0xb5
> begin() at begin+0x2c
> db>
> 
> _mtx_lock_flags+0x43 -> sys/kern/kern_mutex.c:324
> ithread_remove_handler+0x53 -> sys/kern/kern_intr.c:314
> inthand_remove+0x11 -> sys/i386/isa/intr_machdep.c:705
> cpu_initclocks+0x327 -> sys/i386/isa/clock.c:1096
> initclocks+0x1c -> sys/kern/kern_clock.c:153
> mi_startup+0xb5 -> sys/kern/init_main.c:217

        KASSERT(m->mtx_object.lo_class == &lock_class_mtx_sleep,
            ("mtx_lock() of spin mutex %s @ %s:%d", m->mtx_object.lo_name,
            file, line));

It's blowing up doing that == compare.  The faulting instruction reads
through %ebx at offset 0 and the fault address is 0x6dbc00, so it seems
that the mutex pointer (m) (%ebx) is 0x6dbc00.  (Doing a p $ebx might
confirm this.)  That means that either the ithread or the handler itself
might be messed up.  If you do a hexdump of the first argument to
ithread_remove_handler(), that should give you a dump of the struct
intrhand, and you can then see if it looks valid, especially the ithread
pointer.  If the ithread pointer is valid, you can then hexdump the ithread
structure and see if that looks valid too.
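
To make the reasoning above concrete, here is a minimal C sketch.  It is not
the kernel's code: the struct layouts are simplified stand-ins that assume
lo_class sits at offset 0 of struct mtx (which is what the 0(%ebx) operand
suggests), ASSUMED_KERNBASE is my assumption about where kernel virtual
addresses start on i386, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical layouts: only the field order matters here. */
struct lock_class;

struct lock_object {
	struct lock_class *lo_class;	/* assumed first field, offset 0 */
	const char *lo_name;
};

struct mtx {
	struct lock_object mtx_object;	/* assumed first field, offset 0 */
	uintptr_t mtx_lock;
};

/* Assumption: kernel virtual addresses start here on i386. */
#define ASSUMED_KERNBASE 0xc0000000u

/* Address the CPU reads when evaluating m->mtx_object.lo_class. */
static uint32_t
lo_class_read_addr(uint32_t m)
{
	return (m + (uint32_t)(offsetof(struct mtx, mtx_object) +
	    offsetof(struct lock_object, lo_class)));
}

/* Rough plausibility check for a pointer value seen in a hexdump. */
static int
looks_like_kernel_pointer(uint32_t addr)
{
	return (addr >= ASSUMED_KERNBASE);
}

/* Replays the values from the trap message and the ddb trace. */
void
check_reported_trap(void)
{
	/* m == 0x6dbc00 reproduces the reported fault virtual address. */
	assert(lo_class_read_addr(0x6dbc00u) == 0x6dbc00u);
	/* That value does not look like a kernel pointer ... */
	assert(!looks_like_kernel_pointer(0x6dbc00u));
	/* ... while the intrhand argument from the trace does. */
	assert(looks_like_kernel_pointer(0xc06ded80u));
}
```

The same range check is a quick way to eyeball the ithread pointer once you
have the hexdump of the struct intrhand: a value well below the assumed
kernel base, like 0x6dbc00, is a strong hint that the structure is bogus.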

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message



