Date:      Tue, 31 Oct 2006 01:09:18 +0200
From:      "Cristian Mijea" <cristian.mijea@gmail.com>
To:        freebsd-questions@freebsd.org
Cc:        Ronald Paul <ronald@jesdesign.nl>
Subject:   Re: Instable machine; hardware or not?
Message-ID:  <268ac7a80610301509k23f88233q6d81179fbb63615e@mail.gmail.com>
In-Reply-To: <44psc9r5dv.fsf@be-well.ilk.org>
References:  <454399E5.3030904@jesdesign.nl> <44psc9r5dv.fsf@be-well.ilk.org>

On 10/30/06, Lowell Gilbert <freebsd-questions-local@be-well.ilk.org> wrote:
> Ronald Paul <ronald@jesdesign.nl> writes:
>
> > I have a small server (AMD XP 2400+, ASRock K7VM4+lan, no ECC) running
> > 4.9-RELEASE since February 2004. It is being used for some small
> > dynamic websites (FAMP), e-mail and some other small stuff. It got an
> > uptime of 400+ days last year, but over the last few months the machine
> > seems to have become more and more unstable.
> >
> > Seemingly random signals (most of them 11, some 10 and 6) are causing
> > random processes (including bash, cron, named, adjkerntz, inetd,
> > syslogd and sh) to exit. So this cannot be anything other than faulty
> > hardware, you would think. But, and this is the strange part for me,
> > these instabilities seem to be triggered by something, because after
> > the machine is restarted the server is rock-solid for the first week.
> > During that week I can compile a kernel without problems.
> >
> > Temperatures and voltages are fine:
> >> # healthd -d
> >> Temp.= 38.0, 21.5,  0.0; Rot.= 3629,    0,    0
> >>  Vcore = 1.73, 0.00; Volt. = 3.28, 4.95, 11.55, -10.55, -4.56
> >
> > I already swapped memory and disk, but the behavior remains the same. Is
> > there any possibility that these crashes would disappear when switching
> > to 6.1-RELEASE, or are these problems solely caused by hardware? If so,
> > is there any indication of which hardware component I should look at?
> > I'm planning to switch motherboards, but since it is quite a drive to
> > our co-location facility, and because the machine is still functioning
> > as a production server and we do not have many fail-safe services yet,
> > I want to think twice.
>
> Yes, it's probably a hardware problem, and yes, it will probably be
> hard to prove that.  Assuming your time has some value, I would
> recommend replacing the whole machine; that way, you can have it set
> up and tested before moving it out on location.


If the server can run "rock-solid" for a week after each restart, I would
look at heat as a factor and also try a clean install.
Anyway, if time is a factor, a new machine is probably a good idea.
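
To check the heat factor, it may help to log the healthd readings over
the week and see whether the temperature climbs before the signals start.
A minimal Python sketch, assuming the logged lines look like the
"Temp.= ...; Rot.= ..." sample quoted above; parse_temp_line and max_temp
are made-up helper names for illustration, not part of healthd:

```python
import re

# Assumption: each logged line follows the "Temp.= a, b, c; Rot.= x, y, z"
# layout seen in the quoted healthd -d sample. Other healthd versions or
# options may print something different.
TEMP_LINE = re.compile(r"Temp\.=\s*([^;]+);\s*Rot\.=\s*(.+)")


def parse_temp_line(line):
    """Return (temperatures, fan_speeds), or None if the line does not match."""
    m = TEMP_LINE.search(line)
    if m is None:
        return None
    temps = [float(x) for x in m.group(1).split(",")]
    fans = [int(x) for x in m.group(2).split(",")]
    return temps, fans


def max_temp(lines):
    """Highest temperature seen across all matching lines, or None."""
    readings = [parse_temp_line(line) for line in lines]
    temps = [t for r in readings if r is not None for t in r[0]]
    return max(temps) if temps else None


if __name__ == "__main__":
    sample = "Temp.= 38.0, 21.5,  0.0; Rot.= 3629,    0,    0"
    print(parse_temp_line(sample))
```

Feeding max_temp the lines collected in the days before a crash, and
comparing that with the first days after a restart, would show whether
the readings actually drift.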


