From: Chris St Denis
Date: Tue, 16 Sep 2008 12:42:24 -0700
To: freebsd-questions@freebsd.org
Cc: Gian Paolo Buono
Subject: Re: FreeBSD 7 server in hang
Message-ID: <48D00C20.1010503@smartt.com>
In-Reply-To: <44ljxs9cgh.fsf@be-well.ilk.org>

Lowell Gilbert wrote:
> "Gian Paolo Buono" writes:
>
>> Hi, I have on a server ibm 3650 FreeBSD 7.0-STABLE and the proccess
>> that running are nagios-3.0.2, apache-2.2.8 and heartbeat-1.2.5_3;
>> random after some day machine becomes semi-dead, the ping respond but
>> any stack (ssh,http) don't work and heartbeat don't switch the
>> resources I can't loggon and I must reboot. In the syslog there
>> isn't any message for trobleshotting the problem. Any idea ? Sorry
>> for my english
>> Best Regards
>
> Try keeping an eye on top(1); it may even give a hint after it stops
> updating. If that doesn't help, you may need to break to the kernel
> debugger (details in developers' handbook).

I was also having some lockup problems on a 3650. I don't know if it's
related, but I will document my experiences in case they're of any help.

Initially it was fine (running 7.0-RELEASE), but after some hardware
problems the system continued to lock up even after the whole server
was replaced. I ended up doing a clean install of 7-STABLE (as of
August 20th) and haven't had any problems since. I'm not sure if it was
some odd corruption of the kernel or other system files (the server
went through many hard reboots during the hardware problems) or a bug
in 7.0-RELEASE that was fixed in 7-STABLE.

As for my specific symptoms: if I had something like top running on the
console, it would continue to run. Top didn't show much of interest
other than some apache processes at 100% (which I suspect were a
symptom of the problem, not the cause). When the system was hung, top
would continue running and updating, but it would not accept keyboard
input. I could switch to other virtual consoles with alt+f#, but they
also would not take text input for a login.
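
Since the hang takes days to appear and nothing can be typed once it
does, one way to follow the "keep an eye on top(1)" suggestion is to
log periodic snapshots to a file that can be read after the reboot.
What follows is only a rough sketch, not something taken from this
thread: the function names, the log file name, and the 30-second
interval are made up for illustration, and the command line assumes
FreeBSD's top(1) in batch mode with a single display (-b -d 1).
Whether writes still reach the disk during a hang like the one
described is not known, so the log may simply stop at the moment of the
lockup, which is itself a rough timestamp.

```python
import os
import subprocess
import time

# Assumption: FreeBSD top(1), batch mode, one display, then exit.
# Other systems may need different flags.
TOP_CMD = ["top", "-b", "-d", "1"]


def take_snapshot(cmd=TOP_CMD, timeout=10):
    """Run cmd and return its output, or a short note if it could not run."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
        return result.stdout
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "snapshot failed: %r\n" % (exc,)


def append_snapshot(path, text, now=None):
    """Append one timestamped snapshot and force it out to disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(path, "a") as log:
        log.write("=== %s ===\n%s\n" % (stamp, text))
        log.flush()
        # fsync so the last snapshot is not lost in a buffer on hard reset.
        os.fsync(log.fileno())


def run(path="top-snapshots.log", interval=30, count=None):
    """Log a snapshot every `interval` seconds; count=None runs until killed."""
    taken = 0
    while count is None or taken < count:
        append_snapshot(path, take_snapshot())
        taken += 1
        if count is None or taken < count:
            time.sleep(interval)
```

Calling run() from a script started at boot would leave a file whose
last entries show what top reported shortly before the machine stopped
responding; run(count=3, interval=5) is a quick way to try it out first.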