Date:      Mon, 26 Jun 2006 14:06:31 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        "Marc G. Fournier" <scrappy@hub.org>
Cc:        freebsd-acpi@freebsd.org, freebsd-stable@freeBSD.org, Pete French <petefrench@ticketswitch.com>
Subject:   Re: FreeBSD 6.x CVSUP today crashes with zero load ...
Message-ID:  <20060626140333.M38418@fledge.watson.org>
In-Reply-To: <20060626081029.L1114@ganymede.hub.org>
References:  <E1FuYsL-000HT3-H2@dilbert.firstcallgroup.co.uk> <20060626100949.G24406@fledge.watson.org> <20060626081029.L1114@ganymede.hub.org>


On Mon, 26 Jun 2006, Marc G. Fournier wrote:

>> I'm also running 6.x on several dual-PIII without problems.  An issue local 
>> to Marc's setup is definitely indicated.  Given the failure mode, I would 
>> be worried about a potential hardware issue, although subtle hardware and 
>> subtle system software problems are sometimes difficult to distinguish.
>
> Well, I've been trying to do it 'the hard way' ... went back to the original 
> kernel, and am slowly upgrading forward ... I'm currently running a June 
> 15th kernel with none of the problems that I was seeing before ... I'm just 
> in the process of running my third 'make -j3 buildworld' on this kernel, and 
> it's clean ... going to go forward to June 22nd next, see if that too is 
> clean *cross fingers*

I think this is a useful activity, especially if you've already run extensive 
memory testing on the box.  If you haven't yet done that, I encourage you to 
take a break from buildworlds and make sure the memory tests pass.  I spent
several months on and off trying to track down a bug a few years ago, which
turned out to be a one-bit error in memory on the box.  It would appear and
disappear based on how the memory page was used -- for debugging kernels, it
consistently got mapped to padding in the kernel's bss.  For non-debugging
kernels, it typically manifested in other usable kernel memory.  Changes in
kernel versions would move the bad bit between kernel memory and user memory,
resulting in hard-to-debug failure modes.  I wish I'd run the memory test
earlier, but the lesson is clear!

Robert N M Watson
Computer Laboratory
University of Cambridge


