From owner-freebsd-current  Thu Nov 14  7:46:34 2002
From: Dave Cornejo <dave@dogwood.com>
Message-Id: <200211141546.gAEFkS2h000892@white.dogwood.com>
Subject: Re: current SMP kernel crashes (different?)
In-Reply-To: <XFMail.20021113120246.jhb@FreeBSD.org>
To: John Baldwin <jhb@FreeBSD.org>
Date: Thu, 14 Nov 2002 07:46:28 -0800 (PST)
Cc: Dave Cornejo <dave@dogwood.com>, freebsd-current@FreeBSD.org

you wrote:
> I think the ACPI PCI LNK code is messing up b/c with SMP we don't use
> LNK's.  So you probably want to disable ACPI for now.  Is the panic the
> same w/o ACPI?

Without ACPI, the kernel hangs after the "Waiting 2 seconds for SCSI
devices to settle" message.

> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; lapic.id = 00000000
> > fault virtual address   = 0x6dbc00
> > fault code              = supervisor read, page not present
> > instruction pointer     = 0x8:0xc02d7383
> > stack pointer           = 0x10:0xc06decf8
> > frame pointer           = 0x10:0xc06ded18
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                         = DPL 0, pres 1, def32 1, gran 1
> > processor eflags        = interrupt enabled, resume, IOPL = 0
> > current process         = 0 (swapper)
> > kernel: type 12 trap, code=0
> > Stopped at      _mtx_lock_flags+0x43:   cmpl    $0xc05216c0,0(%ebx)
> > db> trace
> > _mtx_lock_flags(6dbc00,0,c04c3c00,138,c02d767d) at _mtx_lock_flags+0x43
> > ithread_remove_handler(c06ded80,c06ded78,c046c427,c06ded80,0) at ithread_remove_handler+0x53
> > inthand_remove(c06ded80,0,c04e8e36,445,a0) at inthand_remove+0x11
> > cpu_initclocks(c06ded98,c02bcf75,0,6db000,6dbc00) at cpu_initclocks+0x327
> > initclocks(0,6db000,6dbc00,6db000,0) at initclocks+0x1c
> > mi_startup() at mi_startup+0xb5
> > begin() at begin+0x2c
> > db>
> > 
> > _mtx_lock_flags+0x43 -> sys/kern/kern_mutex.c:324
> > ithread_remove_handler+0x53 -> sys/kern/kern_intr.c:314
> > inthand_remove+0x11 -> sys/i386/isa/intr_machdep.c:705
> > cpu_initclocks+0x327 -> sys/i386/isa/clock.c:1096
> > initclocks+0x1c -> sys/kern/kern_clock.c:153
> > mi_startup+0xb5 -> sys/kern/init_main.c:217
> 
>         KASSERT(m->mtx_object.lo_class == &lock_class_mtx_sleep,
>             ("mtx_lock() of spin mutex %s @ %s:%d", m->mtx_object.lo_name,
>             file, line));
> 
> It's blowing up doing that == compare, so it seems that the mutex pointer
> (m) (%ebx) is 0x6dbc00.  (Doing a p $ebx might confirm this.)  That means
> that the ithread might be messed up.  Either that or the handler itself
> might be messed up.  If you do a hexdump of the first argument to
> ithread_remove_handler() that should give you a dump of the struct intrhand,
> and you can then see if that looks valid, esp. the ithread pointer.  If the
> ithread pointer is valid then you can start looking at the ithread structure
> via hexdump and see if it looks valid.

Thanks for these hints; I'll try them ASAP.

dave c

-- 
Dave Cornejo @ Dogwood Media, Fremont, California (also dcornejo@ieee.org)
  "There aren't any monkeys chasing us..." - Xochi
