From: John Baldwin
To: freebsd-hackers@freebsd.org
Cc: Charles Sprickman, perryh@pluto.rain.com
Date: Wed, 25 Oct 2006 10:55:14 -0400
Message-Id: <200610251055.15445.jhb@freebsd.org>
Subject: Re: Panic caused by bad memory?

On Wednesday 25 October 2006 02:28, Charles Sprickman wrote:
> On Tue, 24 Oct 2006 perryh@pluto.rain.com wrote:
>
> >> I can't get a kernel dump since it fails like this each time:
> >>
> >> dumping to dev #da/0x20001, offset 2097152
> >> dump 1024 1023 1022 1021 Aborting dump due to I/O error.
> >> status == 0xb, scsi status == 0x0
> >> failed, reason: i/o error
> >
> > Bad memory seems unlikely to cause an I/O error trying to write the
> > dump to the swap partition.  I'd guess a dicey drive -- and bad
> > swap space could also account for the original crash.  You might
> > be able to get a backup by booting single user, provided nothing
> > activates the (presumably bad) swap partition.
>
> Just for the record, this box is running an Adaptec raid controller (2005S
> - ZCR card) and swap is coming off a mirrored array.
>
> Coincidentally, I have a utility box where it had bad blocks on the swap
> partition (but no others) - what I saw there is that the box would just
> hang and spit out a bunch of "swap_pager timeout" messages to the console.
> Quick and dirty remote fix while waiting for a drive?  Run file-backed
> swap on /usr. :)
>
> Let's pretend for a minute it's not the drive that's the root cause...
> Not saying it isn't - we're none too thrilled with these Adaptec RAID
> controllers...  Do those memory addresses in the panic message point
> towards bad memory if they are always the same?

No, they are virtual addresses.  Having the same EIP means you are crashing
in the same place.  Did you recently kldunload a module before it crashed?

-- 
John Baldwin