From: Dave Cornejo
Date: Thu, 14 Nov 2002 07:46:28 -0800 (PST)
To: John Baldwin
Cc: Dave Cornejo, freebsd-current@FreeBSD.org
Subject: Re: current SMP kernel crashes (different?)
Message-Id: <200211141546.gAEFkS2h000892@white.dogwood.com>

you wrote:
> I think the ACPI PCI LNK code is messing up b/c with SMP we don't use
> LNK's.  So you probably want to disable ACPI for now.  Is the panic the
> same w/o ACPI?

without ACPI the kernel hangs after the "Waiting 2 seconds for SCSI
devices to settle" message.

> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; lapic.id = 00000000
> > fault virtual address   = 0x6dbc00
> > fault code              = supervisor read, page not present
> > instruction pointer     = 0x8:0xc02d7383
> > stack pointer           = 0x10:0xc06decf8
> > frame pointer           = 0x10:0xc06ded18
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                         = DPL 0, pres 1, def32 1, gran 1
> > processor eflags        = interrupt enabled, resume, IOPL = 0
> > current process         = 0 (swapper)
> > kernel: type 12 trap, code=0
> > Stopped at      _mtx_lock_flags+0x43:   cmpl    $0xc05216c0,0(%ebx)
> > db> trace
> > _mtx_lock_flags(6dbc00,0,c04c3c00,138,c02d767d) at _mtx_lock_flags+0x43
> > ithread_remove_handler(c06ded80,c06ded78,c046c427,c06ded80,0) at ithread_remove_handler+0x53
> > inthand_remove(c06ded80,0,c04e8e36,445,a0) at inthand_remove+0x11
> > cpu_initclocks(c06ded98,c02bcf75,0,6db000,6dbc00) at cpu_initclocks+0x327
> > initclocks(0,6db000,6dbc00,6db000,0) at initclocks+0x1c
> > mi_startup() at mi_startup+0xb5
> > begin() at begin+0x2c
> > db>
> >
> > _mtx_lock_flags+0x43 -> sys/kern/kern_mutex.c:324
> > ithread_remove_handler+0x53 -> sys/kern/kern_intr.c:314
> > inthand_remove+0x11 -> sys/i386/isa/intr_machdep.c:705
> > cpu_initclocks+0x327 -> sys/i386/isa/clock.c:1096
> > initclocks+0x1c -> sys/kern/kern_clock.c:153
> > mi_startup+0xb5 -> sys/kern/init_main.c:217
>
> KASSERT(m->mtx_object.lo_class == &lock_class_mtx_sleep,
>     ("mtx_lock() of spin mutex %s @ %s:%d", m->mtx_object.lo_name,
>     file, line));
>
> It's blowing up doing that == compare, so it seems that the mutex pointer
> (m) (%ebx) is 0x6dbc00.  (Doing a p $ebx might confirm this.)  That means
> that the ithread might be messed up.  Either that or the handler itself
> might be messed up.  If you do a hexdump of the first argument to
> ithread_remove_handler() that should give you a dump of the struct intrhand,
> and you can then see if that looks valid, esp. the ithread pointer.
> If the ithread pointer is valid then you can start looking at the ithread
> structure via hexdump and see if it looks valid.

Thanks for these hints, I'll try them ASAP,

dave c

-- 
Dave Cornejo @ Dogwood Media, Fremont, California (also dcornejo@ieee.org)
  "There aren't any monkeys chasing us..." - Xochi
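
The reasoning in the quoted reply (the == compare in the KASSERT has to read through m, and m appears to be 0x6dbc00) can be modelled in user-space C. This is only a sketch under stated assumptions, not the kernel's code: the struct layouts are simplified stand-ins with lo_class placed first, which is at least consistent with the faulting `cmpl $0xc05216c0,0(%ebx)` reading at offset 0; ASSUMED_KERNBASE (0xc0000000) is an assumption about where i386 kernel virtual addresses start, suggested by the other addresses in the trace; and the helper names looks_like_kernel_pointer() and kassert_read_address() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: kernel virtual addresses on this i386 box start here. */
#define ASSUMED_KERNBASE 0xc0000000u

/* Simplified stand-ins; the real kernel structures carry more fields. */
struct lock_class {
	const char *lc_name;
};

struct lock_object {
	struct lock_class *lo_class;	/* assumed to sit at offset 0 */
	const char *lo_name;
};

struct mtx {
	struct lock_object mtx_object;	/* assumed to be the first member */
};

static struct lock_class lock_class_mtx_sleep = { "sleep mutex" };

/* Hypothetical sanity check: could this value be a kernel pointer at all? */
static bool
looks_like_kernel_pointer(uint32_t addr)
{
	return (addr >= ASSUMED_KERNBASE);
}

/*
 * Address that the compare in the KASSERT has to load from for a mutex
 * located at m. With lo_class at offset 0 this is m itself, which is why
 * the fault address and the bad mutex pointer would be the same value.
 */
static uint32_t
kassert_read_address(uint32_t m)
{
	return (m + (uint32_t)offsetof(struct mtx, mtx_object.lo_class));
}

/* The check itself: it dereferences m before it can compare anything. */
static bool
is_sleep_mutex(const struct mtx *m)
{
	return (m->mtx_object.lo_class == &lock_class_mtx_sleep);
}
```

Under these assumptions, looks_like_kernel_pointer(0x6dbc00) is false while the first argument to ithread_remove_handler(), 0xc06ded80, passes the check, and kassert_read_address(0x6dbc00) equals the reported fault virtual address. That fits the suggestion to inspect the struct intrhand and its ithread pointer rather than the mutex code itself.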