From: Charles Sprickman
Date: Thu, 26 Oct 2006 18:24:33 -0400 (EDT)
To: John Baldwin
Cc: freebsd-hackers@freebsd.org, perryh@pluto.rain.com
Subject: Re: Panic caused by bad memory?
In-Reply-To: <200610251055.15445.jhb@freebsd.org>
References: <453ef5d4.JWeFkgfXTFibI+uh%perryh@pluto.rain.com> <200610251055.15445.jhb@freebsd.org>

On Wed, 25 Oct 2006, John Baldwin wrote:

> On Wednesday 25 October 2006 02:28, Charles Sprickman wrote:
>> On Tue, 24 Oct 2006 perryh@pluto.rain.com wrote:
>>
>>>> I can't get a kernel dump since it fails like this each time:
>>>>
>>>> dumping to dev #da/0x20001, offset 2097152
>>>> dump 1024 1023 1022 1021 Aborting dump due to I/O error.
>>>> status == 0xb, scsi status == 0x0
>>>> failed, reason: i/o error
>>>
>>> Bad memory seems unlikely to cause an I/O error trying to write the
>>> dump to the swap partition.  I'd guess a dicey drive -- and bad
>>> swap space could also account for the original crash.  You might
>>> be able to get a backup by booting single user, provided nothing
>>> activates the (presumably bad) swap partition.
>>
>> Just for the record, this box is running an Adaptec raid controller (2005S
>> - ZCR card) and swap is coming off a mirrored array.
>>
>> Coincidentally, I have a utility box where it had bad blocks on the swap
>> partition (but no others) - what I saw there is that the box would just
>> hang and spit out a bunch of "swap_pager timeout" messages to the console.
>> Quick and dirty remote fix while waiting for a drive?  Run file-backed
>> swap on /usr. :)
>>
>> Let's pretend for a minute it's not the drive that's the root cause...
>> Not saying it isn't - we're none too thrilled with these Adaptec RAID
>> controllers...  Do those memory addresses in the panic message point
>> towards bad memory if they are always the same?
>
> No, they are virtual addresses.  Having the same EIP means you are crashing in
> the same place.  Did you recently kldunload a module before it crashed?

Same place == same code?

The only change on this box was a massive portupgrade which included
apache, php, mysql, postgres and most of the additional gnu tools.  There
is one module that someone set to load on boot, and that's the
linuxolator.  I have disabled that in rc.conf for now and we'll see what
happens after the next panic.

We also have a few sticks of RAM on order now...

Thanks,

Charles

> --
> John Baldwin
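
For anyone following the "same EIP means same place" point, here is a rough sketch of how one might check it against saved console output from several panics. It assumes the usual i386 trap message layout with an "instruction pointer = 0x8:0x..." line; the helper names (eip_counts, same_crash_site) and the sample addresses are made up for illustration and do not come from the box discussed in this thread.

```python
import re
from collections import Counter

# Matches the "instruction pointer = 0x8:0xc0123456" line of an i386 trap
# message and captures the offset part (the EIP) after the code segment.
EIP_RE = re.compile(
    r"instruction pointer\s*=\s*0x[0-9a-f]+:(0x[0-9a-f]+)", re.IGNORECASE
)


def eip_counts(console_text):
    """Count how often each faulting EIP appears in saved panic output."""
    return Counter(m.group(1).lower() for m in EIP_RE.finditer(console_text))


def same_crash_site(console_text):
    """True when at least two panics were logged and all share one EIP."""
    counts = eip_counts(console_text)
    return len(counts) == 1 and sum(counts.values()) >= 2


# Hypothetical console capture: two panics, invented addresses.
SAMPLE = (
    "Fatal trap 12: page fault while in kernel mode\n"
    "fault virtual address\t= 0x24\n"
    "instruction pointer\t= 0x8:0xc0123456\n"
    "\n"
    "Fatal trap 12: page fault while in kernel mode\n"
    "fault virtual address\t= 0x24\n"
    "instruction pointer\t= 0x8:0xc0123456\n"
)

if __name__ == "__main__":
    print(eip_counts(SAMPLE))
    print(same_crash_site(SAMPLE))
```

A repeated EIP only says the kernel faults at the same instruction each time. As noted above, the addresses are virtual, so this by itself neither confirms nor rules out a bad physical RAM location.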