From: Lowell Gilbert
To: Ronald Paul
Cc: freebsd-questions@freebsd.org
Date: Mon, 30 Oct 2006 14:05:32 -0500
Subject: Re: Instable machine; hardware or not?
Message-ID: <44psc9r5dv.fsf@be-well.ilk.org>
In-Reply-To: <454399E5.3030904@jesdesign.nl> (Ronald Paul's message of "Sat, 28 Oct 2006 19:56:53 +0200")
References: <454399E5.3030904@jesdesign.nl>

Ronald Paul writes:

> I have a small server (AMD XP 2400+, ASRock K7VM4+lan, no ECC) running
> 4.9-RELEASE since February 2004. It is being used for some small
> dynamic websites (FAMP), e-mail and some other small stuff. It got an
> uptime of 400+ days last year but since a few months, the machines
> seems to get more and more unstable.
>
> Seemingly random signals (most of them 11, some 10 and 6) are causing
> random processes (including bash, cron, named, adjkernts, inetd,
> syslogd and sh) to exit. So this cannot be something else than faulty
> hardware, you would think. But, and this is the strange part for me,
> these instabilities are somehow triggered because when the machine is
> restarted, the server seems rock-solid for the first week. I then can
> compile a kernel without problems.
>
> Temperatures and voltages are fine:
>> # healthd -d
>> Temp.= 38.0, 21.5, 0.0; Rot.= 3629, 0, 0
>> Vcore = 1.73, 0.00; Volt. = 3.28, 4.95, 11.55, -10.55, -4.56
>
> I already swapped memory and disk but this behavior keeps the same. Is
> there any possibility that this crashes would disappear when switching
> to 6.1-RELEASE or are these problems solely caused by hardware? If so,
> is there any indication on to what hardware-component I should look?
> I'm planning to switch motherboards but since it is quite a drive to
> our co-location facility and because it is still functioning as
> production-server and we do not have much failsafe-services yet, I
> want to think twice.

Yes, it's probably a hardware problem, and yes, it will probably be
hard to prove that. Assuming your time has some value, I would
recommend replacing the whole machine; that way, you can have it set up
and tested before moving it out on location.