From owner-freebsd-hackers@FreeBSD.ORG  Wed Oct 25 06:28:53 2006
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
X-Original-To: freebsd-hackers@freebsd.org
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7B11416A51B
	for <freebsd-hackers@freebsd.org>; Wed, 25 Oct 2006 06:28:53 +0000 (UTC)
	(envelope-from spork@fasttrackmonkey.com)
Received: from angryfist.fasttrackmonkey.com (angryfist.fasttrackmonkey.com
	[216.220.107.230])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A4F0143D46
	for <freebsd-hackers@freebsd.org>; Wed, 25 Oct 2006 06:28:52 +0000 (GMT)
	(envelope-from spork@fasttrackmonkey.com)
Received: (qmail 43966 invoked by uid 2003); 25 Oct 2006 06:29:50 -0000
Received: from spork@fasttrackmonkey.com by angryfist.fasttrackmonkey.com by
	uid 1001 with qmail-scanner-1.20 
	(clamscan: 0.65.  Clear:RC:1(216.220.116.154):. 
	Processed in 0.013172 secs); 25 Oct 2006 06:29:50 -0000
Received: from unknown (HELO white.nat.fasttrackmonkey.com) (216.220.116.154)
	by 0 with (DHE-RSA-AES256-SHA encrypted) SMTP;
	25 Oct 2006 06:29:50 -0000
Date: Wed, 25 Oct 2006 02:28:51 -0400 (EDT)
From: Charles Sprickman <spork@fasttrackmonkey.com>
X-X-Sender: spork@white.nat.fasttrackmonkey.com
To: perryh@pluto.rain.com
In-Reply-To: <453ef5d4.JWeFkgfXTFibI+uh%perryh@pluto.rain.com>
Message-ID: <Pine.OSX.4.61.0610250223540.889@white.nat.fasttrackmonkey.com>
References: <Pine.OSX.4.61.0610241900480.889@white.nat.fasttrackmonkey.com>
	<453ef5d4.JWeFkgfXTFibI+uh%perryh@pluto.rain.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-hackers@freebsd.org
Subject: Re: Panic caused by bad memory?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 25 Oct 2006 06:28:53 -0000

On Tue, 24 Oct 2006 perryh@pluto.rain.com wrote:

>> I can't get a kernel dump since it fails like this each time:
>>
>> dumping to dev #da/0x20001, offset 2097152
>> dump 1024 1023 1022 1021 Aborting dump due to I/O error.
>> status == 0xb, scsi status == 0x0
>> failed, reason: i/o error
>
> Bad memory seems unlikely to cause an I/O error trying to write the
> dump to the swap partition.  I'd guess a dicey drive -- and bad
> swap space could also account for the original crash.  You might
> be able to get a backup by booting single user, provided nothing
> activates the (presumably bad) swap partition.

Just for the record, this box is running an Adaptec RAID controller (a 
2005S ZCR card) and swap is on a mirrored array.

Coincidentally, I have a utility box that had bad blocks on the swap 
partition (but no others).  What I saw there is that the box would just 
hang and spit out a bunch of "swap_pager timeout" messages to the console. 
Quick and dirty remote fix while waiting for a drive?  Run file-backed 
swap on /usr. :)
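
For anyone wanting to try the same stopgap, here is a rough sketch of 
file-backed swap using mdconfig(8) and swapon(8).  The file name 
/usr/swap0, the 512 MB size and md unit 0 are made-up values for 
illustration, not what is on that box.  The function only prints the 
commands so they can be reviewed before anything is run as root.

```shell
#!/bin/sh
# Sketch: temporary file-backed swap on FreeBSD.
# SWAPFILE, SIZE_MB and MD_UNIT are illustrative assumptions.
SWAPFILE=/usr/swap0
SIZE_MB=512
MD_UNIT=0

# Print the commands rather than running them (dry run).
swapfile_cmds() {
    # Create a zero-filled file of the requested size.
    echo "dd if=/dev/zero of=${SWAPFILE} bs=1m count=${SIZE_MB}"
    # Swap contents should not be readable by other users.
    echo "chmod 0600 ${SWAPFILE}"
    # Attach the file as a vnode-backed memory disk, /dev/md<unit>.
    echo "mdconfig -a -t vnode -f ${SWAPFILE} -u ${MD_UNIT}"
    # Enable swapping on the new md device.
    echo "swapon /dev/md${MD_UNIT}"
}

swapfile_cmds
```

Once the output looks right, piping it to sh as root (swapfile_cmds | sh) 
would apply it.  This does nothing about the bad swap partition itself, 
which still has to be taken out of use separately.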

Let's pretend for a minute it's not the drive that's the root cause... 
Not saying it isn't - we're none too thrilled with these Adaptec RAID 
controllers...  Do those memory addresses in the panic message point 
towards bad memory if they are always the same?

Thanks,

Charles