From: John Baldwin
To: Dave Cornejo
Cc: freebsd-current@freebsd.org
Subject: RE: current SMP kernel crashes (different?)
Date: Wed, 13 Nov 2002 12:02:46 -0500 (EST)
In-Reply-To: <200211130637.gAD6btTC017406@white.dogwood.com>

On 13-Nov-2002 Dave Cornejo wrote:
> I've had a problem with a SuperMicro 2010H server crashing when
> attempting to run an SMP kernel.  I've noticed a lot of this lately,
> but this seems to be crashing in the clock code.  Below is the console
> output from power-up to crash.  If I use an UP kernel of the same
> vintage there is no problem.  If I'm reading the code correctly this
> seems to be a problem in APIC mode 8254 detection.
>
> Does anyone have any idea why this is happening?
> Any magical hints I could use to get past this?  I've tried
> disabling ACPI to no avail.

I think the ACPI PCI LNK code is messing up because with SMP we don't use
LNKs.  So you probably want to disable ACPI for now.  Is the panic the
same without ACPI?

> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; lapic.id = 00000000
> fault virtual address   = 0x6dbc00
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc02d7383
> stack pointer           = 0x10:0xc06decf8
> frame pointer           = 0x10:0xc06ded18
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 0 (swapper)
> kernel: type 12 trap, code=0
> Stopped at      _mtx_lock_flags+0x43:   cmpl    $0xc05216c0,0(%ebx)
> db> trace
> _mtx_lock_flags(6dbc00,0,c04c3c00,138,c02d767d) at _mtx_lock_flags+0x43
> ithread_remove_handler(c06ded80,c06ded78,c046c427,c06ded80,0) at ithread_remove_handler+0x53
> inthand_remove(c06ded80,0,c04e8e36,445,a0) at inthand_remove+0x11
> cpu_initclocks(c06ded98,c02bcf75,0,6db000,6dbc00) at cpu_initclocks+0x327
> initclocks(0,6db000,6dbc00,6db000,0) at initclocks+0x1c
> mi_startup() at mi_startup+0xb5
> begin() at begin+0x2c
> db>
>
> _mtx_lock_flags+0x43        -> sys/kern/kern_mutex.c:324
> ithread_remove_handler+0x53 -> sys/kern/kern_intr.c:314
> inthand_remove+0x11         -> sys/i386/isa/intr_machdep.c:705
> cpu_initclocks+0x327        -> sys/i386/isa/clock.c:1096
> initclocks+0x1c             -> sys/kern/kern_clock.c:153
> mi_startup+0xb5             -> sys/kern/init_main.c:217

        KASSERT(m->mtx_object.lo_class == &lock_class_mtx_sleep,
            ("mtx_lock() of spin mutex %s @ %s:%d", m->mtx_object.lo_name,
            file, line));

It's blowing up doing that == compare, so it seems that the mutex
pointer (m), held in %ebx, is 0x6dbc00.  (Doing a p $ebx might confirm
this.)  That means that the ithread might be messed up.  Either that or
the handler itself might be messed up.

If you do a hexdump of the first argument to ithread_remove_handler()
that should give you a dump of the struct intrhand, and you can then see
if that looks valid, esp. the ithread pointer.  If the ithread pointer
is valid then you can start looking at the ithread structure via hexdump
and see if it looks valid.

-- 
John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/