From: "Cristian Mijea"
Date: Tue, 31 Oct 2006 01:09:18 +0200
To: freebsd-questions@freebsd.org
Cc: Ronald Paul
Subject: Re: Instable machine; hardware or not?
Message-ID: <268ac7a80610301509k23f88233q6d81179fbb63615e@mail.gmail.com>
In-Reply-To: <44psc9r5dv.fsf@be-well.ilk.org>
References: <454399E5.3030904@jesdesign.nl> <44psc9r5dv.fsf@be-well.ilk.org>

On 10/30/06, Lowell Gilbert wrote:
> Ronald Paul writes:
>
> > I have a small server (AMD XP 2400+, ASRock K7VM4+lan, no ECC) running
> > 4.9-RELEASE since February 2004. It is being used for some small
> > dynamic websites (FAMP), e-mail and some other small stuff. It got an
> > uptime of 400+ days last year but since a few months, the machines
> > seems to get more and more unstable.
> >
> > Seemingly random signals (most of them 11, some 10 and 6) are causing
> > random processes (including bash, cron, named, adjkernts, inetd,
> > syslogd and sh) to exit. So this cannot be something else than faulty
> > hardware, you would think. But, and this is the strange part for me,
> > these instabilities are somehow triggered because when the machine is
> > restarted, the server seems rock-solid for the first week. I then can
> > compile a kernel without problems.
> >
> > Temperatures and voltages are fine:
> >> # healthd -d
> >> Temp.= 38.0, 21.5, 0.0; Rot.= 3629, 0, 0
> >> Vcore = 1.73, 0.00; Volt. = 3.28, 4.95, 11.55, -10.55, -4.56
> >
> > I already swapped memory and disk but this behavior keeps the same. Is
> > there any possibility that this crashes would disappear when switching
> > to 6.1-RELEASE or are these problems solely caused by hardware? If so,
> > is there any indication on to what hardware-component I should look?
> > I'm planning to switch motherboards but since it is quite a drive to
> > our co-location facility and because it is still functioning as
> > production-server and we do not have much failsafe-services yet, I
> > want to think twice.
>
> Yes, it's probably a hardware problem, and yes, it will probably be
> hard to prove that. Assuming your time has some value, I would
> recommend replacing the whole machine; that way, you can have it set
> up and tested before moving it out on location.

If the server can run "rock-solid" for a week, I would look at the heat
factor and also try a clean install. Anyway, if time is a factor, a new
machine is probably a good idea.