Date: Thu, 26 Oct 2006 18:24:33 -0400 (EDT)
From: Charles Sprickman <spork@fasttrackmonkey.com>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-hackers@freebsd.org, perryh@pluto.rain.com
Subject: Re: Panic caused by bad memory?
Message-ID: <Pine.OSX.4.61.0610261821260.889@white.nat.fasttrackmonkey.com>
In-Reply-To: <200610251055.15445.jhb@freebsd.org>
References: <Pine.OSX.4.61.0610241900480.889@white.nat.fasttrackmonkey.com> <453ef5d4.JWeFkgfXTFibI%2Buh%perryh@pluto.rain.com> <Pine.OSX.4.61.0610250223540.889@white.nat.fasttrackmonkey.com> <200610251055.15445.jhb@freebsd.org>
On Wed, 25 Oct 2006, John Baldwin wrote:

> On Wednesday 25 October 2006 02:28, Charles Sprickman wrote:
>> On Tue, 24 Oct 2006 perryh@pluto.rain.com wrote:
>>
>>>> I can't get a kernel dump since it fails like this each time:
>>>>
>>>> dumping to dev #da/0x20001, offset 2097152
>>>> dump 1024 1023 1022 1021 Aborting dump due to I/O error.
>>>> status == 0xb, scsi status == 0x0
>>>> failed, reason: i/o error
>>>
>>> Bad memory seems unlikely to cause an I/O error trying to write the
>>> dump to the swap partition. I'd guess a dicey drive -- and bad
>>> swap space could also account for the original crash. You might
>>> be able to get a backup by booting single user, provided nothing
>>> activates the (presumably bad) swap partition.
>>
>> Just for the record, this box is running an Adaptec raid controller (2005S
>> - ZCR card) and swap is coming off a mirrored array.
>>
>> Coincidentally, I have a utility box where it had bad blocks on the swap
>> partition (but no others) - what I saw there is that the box would just
>> hang and spit out a bunch of "swap_pager timeout" messages to the console.
>> Quick and dirty remote fix while waiting for a drive? Run file-backed
>> swap on /usr. :)
>>
>> Let's pretend for a minute it's not the drive that's the root cause...
>> Not saying it isn't - we're none too thrilled with these Adaptec RAID
>> controllers... Do those memory addresses in the panic message point
>> towards bad memory if they are always the same?
>
> No, they are virtual addresses. Having the same EIP means you are crashing in
> the same place. Did you recently kldunload a module before it crashed?

Same place == same code?

The only change on this box was a massive portupgrade which included
apache, php, mysql, postgres and most of the additional gnu tools. There
is one module that someone set to load on boot, and that's the
linuxolator. I have disabled that in rc.conf for now and we'll see what
happens after the next panic.

We also have a few sticks of RAM on order now...

Thanks,

Charles

> --
> John Baldwin
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>