From: "Michael Lieske"
Date: Mon, 7 Nov 2005 16:23:22 -0500 (EST)
Subject: Re: Diagnosing reboot under load
Cc: freebsd-questions@freebsd.org
Reply-To: micahjon@ywave.com
Message-ID: <1638.128.208.250.149.1131398602.squirrel@webx1.neonova.net>
In-Reply-To: <436FA4F0.3060102@dial.pipex.com>

> Micah wrote:
>
>> I'm really beginning to doubt it's the PSU.  Why?  I cannot get the
>> output voltage to drop no matter what load I throw at it.  I plugged
>> in four additional hard drives and ran a system stress test and still
>> the voltages remained rock steady at the values I stated earlier.  I
>> ran it for an hour with the high-low monitor on a Fluke multimeter.
>> The +5 stayed near 5.1 with 5.08 as the bottom, and the +12 stayed
>> near 11.89 with 11.84 as the minimum.  I even had one of the "random
>> segfaults" and the +12 voltage never dropped below 11.84.  I'm not
>> sure how I can get the load any higher without using resistors which
>> most certainly does not simulate the load I'm generating while
>> compiling.
>>
>> That leaves memory, CPU or mobo.  I ran memtest86+ and it reported no
>> errors.  I'll run it again for an extended period of time while I'm at
>> school to see if it reports anything.  That leaves CPU and mobo.
>> Anyone got any ideas how to test those?  The only system test I can
>> run that does report an error is Lucifer 1.0 (on the ultimate boot
>> cd).  The mprime test and cpuburn do not find any errors.
>
> The usual advice is to run memtest86 overnight, but I'm not convinced it
> will find a fault related to either temperature or load, since memtest
> seems to cause neither.  Still, worth a try.
>
> When I was arsing around with overclocking, I could reliably crash the
> machine (IIRC) like this:
>   run cpu burn
>   run mprime (or was it a pi generator? can't recall now...)
>   wait for temp to hit max
>   kill cpuburn!
>   wait < 5 mins for either machine to crash or prime/pi test to have
>   error
>
> I was fairly convinced at the time that it was the memory which didn't
> cope.  This is possibly not far off what happens in a big series of
> stressful compiles.
>
> As for diagnosing faults, you may be down to replacing components one at
> a time and seeing if it makes a difference.  That's easier when the
> machine crashes quickly, so if you can find something which reliably
> crashes it, that's good.  If you have >1 memory stick and the machine
> will run with a single stick, try each stick in turn.  You could also
> try deliberately under-performing the memory and see if that makes it
> reliable.  Was the memory you got on the compatibility list for the mobo?
>
> Hope that helps,
>
> --Alex

Unfortunately I cannot reliably reproduce it.  If I remember correctly
the memory was on the compatibility list.  I do wish I had bought two
sticks of 512 rather than one of 1024.

Thanks,
Micah
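
For reference, the multimeter readings quoted in this thread can be checked against a rail tolerance with a few lines of arithmetic. The sketch below assumes the +/-5 % band usually quoted for the +5 V and +12 V rails in the ATX power supply design guides. That tolerance figure and the helper names `rail_limits` and `in_spec` are assumptions for illustration and do not come from the thread.

```python
# Assumption: the +5 V and +12 V rails are allowed +/-5 % of nominal,
# the tolerance usually quoted from the ATX power supply design guides.
TOLERANCE = 0.05


def rail_limits(nominal, tolerance=TOLERANCE):
    """Return the (low, high) voltage limits for a rail."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def in_spec(nominal, reading, tolerance=TOLERANCE):
    """True if a measured voltage lies inside the allowed band."""
    low, high = rail_limits(nominal, tolerance)
    return low <= reading <= high


# Readings quoted in the thread: nominal -> (typical, minimum)
readings = {
    5.0: (5.10, 5.08),
    12.0: (11.89, 11.84),
}

for nominal, values in readings.items():
    low, high = rail_limits(nominal)
    for volts in values:
        status = "ok" if in_spec(nominal, volts) else "OUT OF SPEC"
        print(f"+{nominal:g} V rail: {volts:.2f} V "
              f"(allowed {low:.2f}-{high:.2f} V) {status}")
```

Under that assumption all four quoted readings sit inside the band, which is consistent with doubting the PSU. A slow min/max capture would not necessarily catch very brief dips, so this is only a sanity check on the steady-state values.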