From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Date: Wed, 25 Oct 2006 10:55:14 -0400
Cc: Charles Sprickman <spork@fasttrackmonkey.com>, perryh@pluto.rain.com
Subject: Re: Panic caused by bad memory?

On Wednesday 25 October 2006 02:28, Charles Sprickman wrote:
> On Tue, 24 Oct 2006 perryh@pluto.rain.com wrote:
> 
> >> I can't get a kernel dump since it fails like this each time:
> >>
> >> dumping to dev #da/0x20001, offset 2097152
> >> dump 1024 1023 1022 1021 Aborting dump due to I/O error.
> >> status == 0xb, scsi status == 0x0
> >> failed, reason: i/o error
> >
> > Bad memory seems unlikely to cause an I/O error trying to write the
> > dump to the swap partition.  I'd guess a dicey drive -- and bad
> > swap space could also account for the original crash.  You might
> > be able to get a backup by booting single user, provided nothing
> > activates the (presumably bad) swap partition.
> 
> Just for the record, this box is running an Adaptec raid controller (2005S 
> - ZCR card) and swap is coming off a mirrored array.
> 
> Coincidentally, I have a utility box where it had bad blocks on the swap 
> partition (but no others) - what I saw there is that the box would just 
> hang and spit out a bunch of "swap_pager timeout" messages to the console. 
> Quick and dirty remote fix while waiting for a drive?  Run file-backed 
> swap on /usr. :)
> 
> Let's pretend for a minute it's not the drive that's the root cause... 
> Not saying it isn't - we're none too thrilled with these Adaptec RAID 
> controllers...  Do those memory addresses in the panic message point 
> towards bad memory if they are always the same?

No, they are virtual addresses, so they don't identify a particular spot in
physical RAM.  Having the same EIP each time just means you are crashing in
the same place in the code.  Did you recently kldunload a module before it
crashed?
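
One way to act on the kldunload question is to compare the faulting EIP
against the address ranges of the modules that are currently loaded (kldstat
lists an address and size for each).  The following is only a rough sketch
of that comparison, not anything taken from the kernel: the struct
kmod_range type and the find_module() helper are hypothetical names made up
for illustration, and you would fill in the table by hand from your own
kldstat output.  If the EIP falls outside every loaded range, that may hint
that it points into code that has since been unloaded, though it does not
prove it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical record for one loaded kernel module: its name, virtual
 * load address, and size in bytes.  Not a real kernel structure.
 */
struct kmod_range {
	const char *name;
	uint32_t    base;
	uint32_t    size;
};

/*
 * Return the entry whose [base, base + size) range contains eip,
 * or NULL if the address is not inside any listed module.
 */
static const struct kmod_range *
find_module(const struct kmod_range *mods, size_t n, uint32_t eip)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Subtract before comparing so base + size cannot wrap. */
		if (eip >= mods[i].base && eip - mods[i].base < mods[i].size)
			return (&mods[i]);
	}
	return (NULL);
}
```

A NULL result here only says the address is not covered by the table you
supplied; a hit gives you the module name and, via eip - base, the offset
to look up in that module's symbols.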

-- 
John Baldwin