Date: Mon, 07 Nov 2005 19:03:12 +0000
From: Alex Zbyslaw <xfb52@dial.pipex.com>
To: Micah <micahjon@ywave.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: Diagnosing reboot under load
Message-ID: <436FA4F0.3060102@dial.pipex.com>
In-Reply-To: <436F8E2E.802@ywave.com>
References: <436E739E.8020605@ywave.com>
 <436E7599.9090003@cs.earlham.edu>
 <436E7D4E.6080707@ywave.com>
 <F3441A15-7CD9-4B7E-8AE9-359B59658C82@u.washington.edu>
 <436E9DF0.1080408@ywave.com>
 <436F1779.7090807@u.washington.edu>
 <436F6B5F.9000304@ywave.com>
 <20051107100935.31771357.wmoran@potentialtech.com>
 <20051107102617.3abfd2c5.wmoran@potentialtech.com>
 <436F896B.2040404@dial.pipex.com>
 <436F8E2E.802@ywave.com>

Micah wrote:
> I'm really beginning to doubt it's the PSU. Why? I cannot get the
> output voltage to drop no matter what load I throw at it. I plugged
> in four additional hard drives and ran a system stress test and still
> the voltages remained rock steady at the values I stated earlier. I
> ran it for an hour with the high-low monitor on a Fluke multimeter.
> The +5 stayed near 5.1 with 5.08 as the bottom, and the +12 stayed
> near 11.89 with 11.84 as the minimum. I even had one of the "random
> segfaults" and the +12 voltage never dropped below 11.84. I'm not
> sure how I can get the load any higher without using resistors which
> most certainly does not simulate the load I'm generating while compiling.
>
> That leaves memory, CPU or mobo. I ran memtest86+ and it reported no
> errors. I'll run it again for an extended period of time while I'm at
> school to see if it reports anything. That leaves CPU and mobo.
> Anyone got any ideas how to test those? The only system test I can
> run that does report an error is Lucifer 1.0 (on the ultimate boot
> cd). The mprime test and cpuburn do not find any errors.

The usual advice is to run memtest86 overnight, but I'm not convinced it
will find a fault related to either temperature or load, since memtest
seems to cause neither. Still, worth a try.

When I was arsing around with overclocking, I could reliably crash the
machine (IIRC) like this:

  1. run cpuburn
  2. run mprime (or was it a pi generator? can't recall now...)
  3. wait for temp to hit max
  4. kill cpuburn!
  5. wait < 5 mins for either the machine to crash or the prime/pi test
     to report an error

I was fairly convinced at the time that it was the memory which didn't
cope. This is possibly not far off what happens in a big series of
stressful compiles.
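
For anyone who wants to repeat that recipe without babysitting it, here is a
rough sketch in Python. It is only an illustration under stated assumptions:
the function name and its parameters are made up for this example, the two
commands are left to the caller (nothing here assumes particular cpuburn or
mprime flags), and a fixed warm-up time stands in for "wait for temp to hit
max" because reading the sensor is hardware-specific. If the machine itself
crashes, the script obviously never returns.

```python
import subprocess
import time


def heat_then_release(burn_cmd, check_cmd, warmup_s, watch_s, poll_s=1.0):
    """Run a heat source next to a self-checking workload, then drop the heat.

    burn_cmd  -- argv list for the heat source (hypothetical, caller-supplied)
    check_cmd -- argv list for a workload that exits when it detects an error
    warmup_s  -- seconds to run both; an assumed stand-in for reading the temp
    watch_s   -- seconds to keep watching the checker after the burner dies

    Returns the checker's exit code if it stopped by the end of the watch
    window, or None if it was still running when the window closed.
    """
    burner = subprocess.Popen(burn_cmd)
    checker = subprocess.Popen(check_cmd)
    try:
        time.sleep(warmup_s)
        burner.terminate()      # the sudden drop in load is the whole point
        burner.wait()
        deadline = time.monotonic() + watch_s
        while time.monotonic() < deadline:
            code = checker.poll()
            if code is not None:
                return code     # checker stopped: look at its output
            time.sleep(poll_s)
        return None
    finally:
        # Never leave either process running behind our back.
        for proc in (burner, checker):
            if proc.poll() is None:
                proc.kill()
                proc.wait()
```

A call might look like heat_then_release(["./burner"], ["./checker"],
warmup_s=600, watch_s=300), where both program names are placeholders for
whatever burn-in and self-checking tools are installed. Note that a checker
which simply finishes cleanly also returns its exit code (0), so treat any
non-None result as "go and read the checker's output".
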
As for diagnosing faults, you may be down to replacing components one at
a time and seeing if it makes a difference. That's easier when the
machine crashes quickly, so if you can find something which reliably
crashes it, that's good. If you have >1 memory stick and the machine
will run with a single stick, try each stick in turn. You could also
try deliberately under-performing the memory and see if that makes it
reliable. Was the memory you got on the compatibility list for the mobo?

Hope that helps,

--Alex
