From: Charles Sprickman
Date: Wed, 25 Oct 2006 02:28:51 -0400 (EDT)
To: perryh@pluto.rain.com
Cc: freebsd-hackers@freebsd.org
Subject: Re: Panic caused by bad memory?
In-Reply-To: <453ef5d4.JWeFkgfXTFibI+uh%perryh@pluto.rain.com>
References: <453ef5d4.JWeFkgfXTFibI+uh%perryh@pluto.rain.com>
List-Id: Technical Discussions relating to FreeBSD

On Tue, 24 Oct 2006 perryh@pluto.rain.com wrote:

>> I can't get a kernel dump since it fails like this each time:
>>
>> dumping to dev #da/0x20001, offset 2097152
>> dump 1024 1023 1022 1021 Aborting dump due to I/O error.
>> status == 0xb, scsi status == 0x0
>> failed, reason: i/o error
>
> Bad memory seems unlikely to cause an I/O error trying to write the
> dump to the swap partition.  I'd guess a dicey drive -- and bad
> swap space could also account for the original crash.  You might
> be able to get a backup by booting single user, provided nothing
> activates the (presumably bad) swap partition.

Just for the record, this box is running an Adaptec RAID controller
(2005S - ZCR card) and swap is coming off a mirrored array.

Coincidentally, I have a utility box that had bad blocks on the swap
partition (but no others) - what I saw there is that the box would just
hang and spit out a bunch of "swap_pager timeout" messages to the console.
Quick and dirty remote fix while waiting for a drive?  Run file-backed
swap on /usr. :)

Let's pretend for a minute it's not the drive that's the root cause...
Not saying it isn't - we're none too thrilled with these Adaptec RAID
controllers...  Do those memory addresses in the panic message point
towards bad memory if they are always the same?

Thanks,

Charles
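
For reference, the file-backed swap workaround mentioned above could look roughly like the sketch below. It assumes a FreeBSD release that provides md(4) and mdconfig(8); older 4.x systems use vnconfig instead. The file name, size, and md unit number are hypothetical placeholders, not values from this thread. The script only prints the commands unless DRY_RUN=0 is set, since the real run needs root.

```shell
#!/bin/sh
# Sketch only: file-backed swap on a FreeBSD release that uses md(4).
# SWAPFILE, SIZE_MB and MD_UNIT are hypothetical values, not from the thread.
SWAPFILE=${SWAPFILE:-/usr/swap0}
SIZE_MB=${SIZE_MB:-512}
MD_UNIT=${MD_UNIT:-0}
DRY_RUN=${DRY_RUN:-1}   # default: print the commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_file_swap() {
    # Create a zero-filled file of the requested size on /usr.
    run dd if=/dev/zero of="$SWAPFILE" bs=1024k count="$SIZE_MB" &&
    # Swap contents should not be readable by other users.
    run chmod 0600 "$SWAPFILE" &&
    # Attach the file as a vnode-backed memory disk (/dev/mdN).
    run mdconfig -a -t vnode -f "$SWAPFILE" -u "$MD_UNIT" &&
    # Start swapping on the memory disk.
    run swapon "/dev/md$MD_UNIT"
}

setup_file_swap
```

Swap enabled this way does not survive a reboot, which suits a stopgap while waiting for a replacement drive.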