Date:      Wed, 8 Oct 2008 00:01:42 -0700
From:      Jeremy Chadwick <koitsu@FreeBSD.org>
To:        Mister Olli <mister.olli@googlemail.com>
Cc:        Jerry McAllister <jerrymc@msu.edu>, freebsd-questions@freebsd.org
Subject:   Re: analyzing freebsd core dumps
Message-ID:  <20081008070142.GA69250@icarus.home.lan>
In-Reply-To: <1223447412.5896.9.camel@phoenix.blechhirn.net>
References:  <1223273047.23248.25.camel@phoenix.blechhirn.net> <20081006171809.GA26368@icarus.home.lan> <20081006174502.GB71024@gizmo.acns.msu.edu> <1223447412.5896.9.camel@phoenix.blechhirn.net>

On Wed, Oct 08, 2008 at 08:30:12AM +0200, Mister Olli wrote:
> hi...
> 
> thanks for the feedback on this topic.
> the first step to clean the machine and check all connectors has been
> done yesterday. I hope that this will fix the problem, and that it's not
> some kind of hardware failure.
> 
> to run tests with memtest is quite a problem, since the machine has high
> availability requirements. to take it offline for nearly one hour for
> cleaning and checking during the daily work of our company was a pain.
> 6 hours or more of RAM tests is not possible.
> 
> is there some other way to detect hardware failure with a less
> time-consuming tool/process?

Yes -- you start replacing hardware one piece at a time until the
problem goes away.  That will also require downtime, quite regularly,
and waste money.

So to answer your question: no, there is no way to easily track down the
source of a hardware failure, or determine what piece has failed (if
any).  This is completely 100% normal when it comes to computers,
especially x86 PCs.  Anyone who has worked in the IT field for many
years knows this.  :-)

I'm amazed that in this day and age, any company would have a single
host as a single-point-of-failure.  You can't take this machine down
for troubleshooting, but you have no failover available.  The company
has put itself into this situation.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



