Date: Wed, 16 Jul 2003 11:26:58 +0200 (CEST)
From: Lukas Ertl
To: Harald Schmalzbauer
Cc: freebsd-current@freebsd.org
Subject: Re: escalation stage 2 [was:RE: Big and ugly bug in 5.1-release]
Message-ID: <20030716112218.N719@leelou.in.tern>

On Wed, 16 Jul 2003, Harald Schmalzbauer wrote:

> Now after resetting the machine which was hung by "sysinstall" it claims
> that ad4 (one of two mirrored 30GB 2.5" disks) was absent (see dmesg below)
> Now the controller warns me that one drive is bad (which in fact is
> definitely not) and allows me to select "continue boot"

Did you try to replace that defective drive?

> That's what I do and after kernel probing the machine reboots with the
> following error (well, this takes some time to typewrite it from my monochrome
> screen):
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x10
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc014a0a6
> stack pointer           = 0x10:0xcce65bd8
> frame pointer           = 0x10:0xcce65c58
> code segment            = base 0x0, limit 0xfffff type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL=0
> current process         = 4(g_down)
> trap number             = 12
> panic: page fault
>
> Then it reboots!

Can you get a coredump and a backtrace? That would be very helpful in
debugging.

> Now please give me a hint what to do. This is my brand new fileserver which
> collected all important data from the last decade and since it's brand new I
> didn't manage any backup.

Funny, there's always an excuse why there are no backups.

> When testing the hardware (unplugging one drive while the machine was
> running) I had the same error but I thought that would never happen under
> normal circumstances.

Well, if you did run tests and saw the errors, why did you think it
wouldn't happen "under normal circumstances"?

IMHO it would be better if you start over with a clean machine and two new
disks. Sounds very much like you damaged the drive.

regards,
le

-- 
Lukas Ertl                             eMail: l.ertl@univie.ac.at
UNIX-Systemadministrator               Tel.:  (+43 1) 4277-14073
Zentraler Informatikdienst (ZID)       Fax.:  (+43 1) 4277-9140
der Universität Wien                   http://mailbox.univie.ac.at/~le/