Date: Tue, 13 Mar 2012 10:09:14 +0200
From: Volodymyr Kostyrko <c.kworr@gmail.com>
To: Matthew Seaman <matthew@FreeBSD.org>
Cc: freebsd-questions@FreeBSD.org
Subject: Re: 9.0 spontaneously reboots
Message-ID: <4F5F00AA.1060008@gmail.com>
In-Reply-To: <4F5E2ADB.6020104@FreeBSD.org>
References: <4F5E031D.5060203@gmail.com> <4F5E2ADB.6020104@FreeBSD.org>

Matthew Seaman wrote:
> On 12/03/2012 14:07, Volodymyr Kostyrko wrote:
>> What should I blame now? Is it some programming error or should I
>> continue with testing/changing motherboard and cpu?
>
> Instability that appears spontaneously (and especially if it persists
> across system updates) is almost always caused by hardware problems.
> So, yes, carry on swapping out components until you can isolate where
> the problem is.
>
> Some common hardware problems which might result in the problems you've
> seen:
>
>   * PSU going flakey. If you have the right measuring equipment, this
>     is pretty easy to detect by looking at the output voltages -- if
>     they've drifted out of spec, or if you've got mains frequency
>     jitter leaking through then its no wonder your system crashes.

Sensors report everything is good.

>   * Similarly, if the crashing is associated with system load,
>     (particularly at startup, when things are happening like disks
>     spinning up) this can indicate a power supply fading under load.
>     That can happen due to age, or because you've been adding extra
>     hardware and haven't considered the power requirements.

The only load I know of that reliably causes a lockup within a few hours
is memcached. The project has now been migrated to redis, and the
machines have survived for two weeks. The most common cause of a lockup
is an ECC error.

>   * The other reason for crashing under load is overheating.
>     Sometimes this can be cured easily by cleaning dust out of vents
>     and heat-sinks. Check too for fans either seized or running
>     slowly.

Sensors report normal temperatures.

>   * You may need to clean off any old heat-sink compound and re-apply
>     a fresh layer, especially if you've taken CPU coolers off at
>     some point.
>
>   * There's also the old capacitor problem: electrolytic capacitors
>     have a failure mode that generates some positive pressure inside
>     them. This is detectable by the end of the capacitor being bowed
>     out, rather than slightly concave. (Generally this means a new
>     motherboard, although I've heard of people being able to solder in
>     replacements successfully.)

It's a fully serviced SuperMicro server without any additional problems.

> Other than that, try disconnecting and reconnecting peripherals like
> disks or DVDs and so forth in various combinations to test if that
> improves system stability. One faulty component can knock the whole
> machine over.

--
Sphinx of black quartz judge my vow.