Date:      Mon, 7 Nov 2005 16:23:22 -0500 (EST)
From:      "Michael Lieske" <micahjon@ywave.com>
To:        <xfb52@dial.pipex.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Diagnosing reboot under load
Message-ID:  <1638.128.208.250.149.1131398602.squirrel@webx1.neonova.net>
In-Reply-To: <436FA4F0.3060102@dial.pipex.com>
References:  <436E739E.8020605@ywave.com>	<436E7599.9090003@cs.earlham.edu>	<436E7D4E.6080707@ywave.com>	<F3441A15-7CD9-4B7E-8AE9-359B59658C82@u.washington.edu>	<436E9DF0.1080408@ywave.com>	<436F1779.7090807@u.washington.edu>	<436F6B5F.9000304@ywave.com>	<20051107100935.31771357.wmoran@potentialtech.com>	<20051107102617.3abfd2c5.wmoran@potentialtech.com>	<436F896B.2040404@dial.pipex.com> <436F8E2E.802@ywave.com> <436FA4F0.3060102@dial.pipex.com>

> Micah wrote:
>
>> I'm really beginning to doubt it's the PSU.  Why?  I cannot get the
>> output voltage to drop no matter what load I throw at it.  I plugged
>> in four additional hard drives and ran a system stress test and still
>> the voltages remained rock steady at the values I stated earlier.  I
>> ran it for an hour with the high-low monitor on a Fluke multimeter.
>> The +5 stayed near 5.1 with 5.08 as the bottom, and the +12 stayed
>> near 11.89 with 11.84 as the minimum.  I even had one of the "random
>> segfaults" and the +12 voltage never dropped below 11.84.  I'm not
>> sure how I can get the load any higher without using resistors, which
>> most certainly do not simulate the load I'm generating while
>> compiling.
>>
>> That leaves memory, CPU or mobo.  I ran memtest86+ and it reported no
>> errors.  I'll run it again for an extended period of time while I'm at
>> school to see if it reports anything.  That leaves CPU and mobo.
>> Anyone got any ideas how to test those?  The only system test I can
>> run that does report an error is Lucifer 1.0 (on the ultimate boot
>> cd).  The mprime test and cpuburn do not find any errors.
>
> The usual advice is to run memtest86 overnight, but I'm not convinced it
> will find a fault related to either temperature or load, since memtest
> seems to cause neither.  Still, worth a try.
>
> When I was arsing around with overclocking, I could reliably crash the
> machine (IIRC) like this:
>     run cpu burn
>     run mprime (or was it a pi generator?  can't recall now...)
>     wait for temp to hit max
>     kill cpuburn!
>     wait < 5 mins for either the machine to crash or the prime/pi test
>     to report an error
>
> I was fairly convinced at the time that it was the memory which didn't
> cope.  This is possibly not far off what happens in a big series of
> stressful compiles.
>
> As for diagnosing faults, you may be down to replacing components one at
> a time and seeing if it makes a difference.  That's easier when the
> machine crashes quickly, so if you can find something which reliably
> crashes it, that's good.  If you have >1 memory stick and the machine
> will run with a single stick, try each stick in turn.  You could also
> try deliberately under-performing the memory and see if that makes it
> reliable.  Was the memory you got on the compatibility list for the mobo?
>
> Hope that helps,
>
> --Alex

Unfortunately, I cannot reliably reproduce it.  If I remember correctly, the
memory was on the compatibility list.  I do wish I had bought two sticks
of 512 rather than one of 1024.

Thanks,
Micah




