Date: Sun, 18 Nov 2007 16:41:38 -0500
From: Randy Pratt
To: "n j"
Cc: freebsd-questions@freebsd.org, "Anthony M. Rasat"
Subject: Re: Unexpected shutdown
Message-Id: <20071118164138.ebd3492c.bsd-unix@embarqmail.com>
In-Reply-To: <92bcbda50711181312l1dc6b26cteaad3c8db11e17b6@mail.gmail.com>
References: <92bcbda50711180451h5db8f4ady6e2d21da80d32548@mail.gmail.com> <20071118163747.36C5F4AB7D@mail.kaltimpost.net> <92bcbda50711181312l1dc6b26cteaad3c8db11e17b6@mail.gmail.com>

On Sun, 18 Nov 2007 22:12:34 +0100
"n j" wrote:

> > Does it happened before, or does it happened everyday at 3 am, or is this the first time your box shutdown without explaination?
>
> No, this is the first time this has occurred, that is what makes it
> completely unexpected.
>
> > If this is the first time, I would say there are many possibilities. Say an accidental quick push on power button or - humor me - the cleaning lady is with the conserve energy movement and thought your box just another forgotten-to-shutdown desktop, that alone could explain your mysterious shutdown incident.
>
> The machine is located in a server room within a server rack with a
> (detachable) panel on the front side of the machine (Dell Poweredge)
> that is covering the power-off button. No cleaning lady is entering
> the room, especially at 3 AM. Due to all the circumstances I had
> described, I ruled out (physical) human factor as the cause of
> shutdown.
>
> The box has two independent AC power supplies, no hardware error is
> found in RAC card logs, no other server (in the same rack/room) shut
> down at that time. That is what leads me to believe that the problem
> is software-related.
>
> I know there are many possibilities out there, but I am pondering this
> for the whole day and ruled out everything that came to mind. So, any
> other ideas - even humorous - are welcome.

A few months ago I started having random mysterious lockups, no panics,
no messages, no hints, no keyboard and no ssh. It forced me to recycle
power to get the system back. After playing the RAM swap game, updating
sources, and other such dead-ends, I felt the hard drives (Maxtor
7200RPM 250G type) and they were quite warm.
I did a little hardware rearranging so that the hard drives got more
air, and I've not had a lockup since. I had also been monitoring the
temperature but didn't see any indication that it was the CPU or
motherboard components.

This is all anecdotal: I don't have hard evidence pointing to exactly
one cause, since I also swapped out a fan and reinserted connectors in
the process. My feeling is that it was hard drive heat-related, so my
suggestion is to do some poking around for hot spots, clogged fan
filters, and any other factors affecting temperatures.

In any case, in the grand scheme of things, *all* hardware will
fail ... eventually ;-)

Randy