From owner-freebsd-questions Wed Jul 29 01:29:49 1998
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id BAA10821
          for freebsd-questions-outgoing; Wed, 29 Jul 1998 01:29:49 -0700 (PDT)
          (envelope-from owner-freebsd-questions@FreeBSD.ORG)
Received: from allegro.lemis.com (allegro.lemis.com [192.109.197.134])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id BAA10810
          for <freebsd-questions@FreeBSD.ORG>; Wed, 29 Jul 1998 01:29:45 -0700 (PDT)
          (envelope-from grog@freebie.lemis.com)
Received: from freebie.lemis.com (freebie.lemis.com [192.109.197.137])
          by allegro.lemis.com (8.9.1/8.9.0) with ESMTP id RAA22369;
          Wed, 29 Jul 1998 17:58:58 +0930 (CST)
Received: (from grog@localhost)
          by freebie.lemis.com (8.9.1/8.9.0) id RAA28899;
          Wed, 29 Jul 1998 17:58:57 +0930 (CST)
Message-ID: <19980729175856.Q716@freebie.lemis.com>
Date: Wed, 29 Jul 1998 17:58:56 +0930
From: Greg Lehey <grog@lemis.com>
To: Les LaCroix <Les.LaCroix@Carleton.edu>, freebsd-questions@FreeBSD.ORG
Subject: Re: (long) page fault in kernel mode: suggestions?
References: <4027246050.901675410@miranda.INFOZOO.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.91.1i
In-Reply-To: <4027246050.901675410@miranda.INFOZOO.com>; from Les LaCroix on Wed, Jul 29, 1998 at 01:23:30AM -0500
WWW-Home-Page: http://www.lemis.com/~grog
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-41-739-7062
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Wednesday, 29 July 1998 at 1:23:30 -0500, Les LaCroix wrote:
> I've been fighting a "fatal trap 12: page fault while in kernel mode"
> problem. Clues are appreciated. I'm running out of ideas.
>
> New machine (configuration below). Crashes in a similar (if not exactly
> the same) way with GENERIC kernel and a custom kernel with virtually
> everything removed, in both 2.2.6 and 2.2.7.
> I've not changed anything in the kernel source.
>
> I don't have the panic screen from other days, but tonight it crashed 3
> times in 5 hours like this:
>
> Fatal trap 12: page fault while in kernel mode
> ...

You don't need this information if you have a dump.

> Each crash was the same: same instruction, stack and frame pointers, same
> everything. gdb -k on the dumps all look like:
>
> (kgdb) symbol-file /kernel
> Reading symbols from /kernel...done.
> (kgdb) exec-file /var/crash/kernel.2
> (kgdb) core-file /var/crash/vmcore.2
> IdlePTD 1c1000
> current pcb at 1a8bb0
> panic: page fault
> #0 boot (howto=256) at ../../kern/kern_shutdown.c:266
> 266 dumppcb.pcb_cr3 = rcr3();
> (kgdb) where
> #0 boot (howto=256) at ../../kern/kern_shutdown.c:266
> #1 0xf010eb12 in panic (fmt=0xf017693f "page fault")
>     at ../../kern/kern_shutdown.c:400
> #2 0xf017751e in trap_fatal (frame=0xf019cf64) at
>     ./../i386/i386/trap.c:772
> #3 0xf0176fe0 in trap_pfault (frame=0xf019cf64, usermode=0)
>     at ../../i386/i386/trap.c:681
> #4 0xf0176c77 in trap (frame={tf_es = 16, tf_ds = 16, tf_edi = -1073741824,
>     tf_esi = -535754628, tf_ebp = -266743880, tf_isp = -266743924,
>     tf_ebx = -260199936, tf_edx = -226815792, tf_ecx = 1073741823,
>     tf_eax = -2147483648, tf_trapno = 12, tf_err = 0, tf_eip = -535754628,
>     tf_cs = 8, tf_eflags = 66118, tf_esp = -267363380, tf_ss =
>     -260199936})
>     at ../../i386/i386/trap.c:324
> #5 0xe011087c in ?? ()
>
> I'm not familiar enough (yet) with gdb and kernel debugging to try to figure
> out what's going on. My current hunch is that something is corrupting the
> stack, changing the return address, and causing the page fault when
> something does a return.

Yes, it looks like that. Not an easy dump to crack.

> The machine:
>
> Epox 100MHz 51MVP3E-M ATX board with 1MB cache:
>     bus clock = 100 MHz
>     multiplier = 3x
>     SDRAM clock = CPU bus clock
> AMD K6 300 MMX CPU

Hmmm. We haven't seen many of these yet.
> 128MB PC100 SDRAM/ECC 8ns 168-pin DIMM w/ EPROM, 100MHz Mbrds
> Seagate 6.4GB 7200 RPM IDE drive (ST36530A)
> Adaptec ISA 1520 SCSI-2 Controller (for an external ZIP, but nothing
> attached yet)

I would look carefully at this. Not many people use them, and so
they're more likely than most to cause problems. Try removing the
board for a while and see if the crashes continue.

> Intel EtherExpress Pro/100B
> 8MB Millennium II PCI (but not running X or doing anything but dumb console
> work yet)
> Teac 24x, IDE (ATAPI)
>
> There's nothing interesting running, usually. I killed sendmail and cron
> (although I left inetd, syslogd, portmap and a couple getty's running).

I'd be more likely to suspect the hardware configuration. Can you
change the Ethernet board for some other model? If the crashes
continue after removing the SCSI board, that would be the next thing
I'd look at. After that, if it still continues, consider temporarily
replacing the CPU with some other model.

Greg
--
See complete headers for address and phone numbers
finger grog@lemis.com for PGP public key

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message