From owner-freebsd-hackers Mon Sep 21 06:49:27 1998
Return-Path:
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id GAA11319
          for freebsd-hackers-outgoing; Mon, 21 Sep 1998 06:49:27 -0700 (PDT)
          (envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from lariat.lariat.org (lariat.lariat.org [206.100.185.2])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id GAA11308
          for ; Mon, 21 Sep 1998 06:49:21 -0700 (PDT)
          (envelope-from brett@lariat.org)
Received: (from brett@localhost)
          by lariat.lariat.org (8.8.8/8.8.6) id HAA01993;
          Mon, 21 Sep 1998 07:48:42 -0600 (MDT)
Message-Id: <4.1.0.63.19980921072400.04165170@mail.lariat.org>
X-Sender: brett@mail.lariat.org
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.1.0.63 (Beta)
Date: Mon, 21 Sep 1998 07:31:01 -0600
To: Mike Smith
From: Brett Glass
Subject: Re: Remember those spontaneous crashes I was getting?
Cc: hackers@FreeBSD.ORG
In-Reply-To: <199809210754.AAA21394@word.smith.net.au>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

At 12:54 AM 9/21/98 -0700, Mike Smith wrote:

>> Fatal trap 9: general protection fault while in kernel mode
>>
>> Instruction pointer = 0x8:0xf0176fb5
>> Stack pointer = 0x10:0xf0199000
>
>Are you 100% sure about these numbers? The kernel stack pointer
>shouldn't be higher than the instruction pointer. This looks like
>either corrupt code eating %esp or a CPU fault.

I checked my transcript twice.

>There's nothing illegal about this at all; this really looks like a
>memory read error (bad memory, CPU, cache or motherboard). You might
>have received the GPF because the stack pointer is pointing into the
>kernel text segment (which it probably can't write to).
>
>Corrupting the stack pointer (as opposed to corrupting the contents of
>the stack) is pretty difficult. It's also very difficult to track
>down. 8(

>> As I began to play with the debugger (I really didn't know the commands), I
>> saw:
>> wd0: interrupt timeout
>> wd0: status 50 error 0
>>
>> ...which may not have meant anything, but then again....

This is a hint. Could the corruption have occurred in the wd0 driver?
Could it have come out because an IRQ occurred at the wrong moment
(e.g. while an ATAPI command was executing and the kernel was
busy-waiting)?

--Brett Glass

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message