Date: Tue, 13 Dec 2005 08:40:04 +1100
From: Peter Jeremy
To: Atanas
Cc: freebsd-stable@freebsd.org
Subject: Re: 6.0 random freezes
Message-ID: <20051212214003.GA77268@cirb503493.alcatel.com.au>
In-Reply-To: <439DE88B.1090407@asd.aplus.net>

On Mon, 2005-Dec-12 13:15:55 -0800, Atanas wrote:
>I have 3 machines running 6.0-RELEASE, and recently 2 of them started
>freezing once a day or so. There are no error messages on the console or
>in the system logs.
>
>The first one I put in production about a month ago and it was working
>flawlessly until it got some load and now it started freezing almost
>every day. The second one has exactly the same behavior - it was fine
>when doing nothing (a couple of weeks), and started freezing when loaded.

Define "freezing":
Does it respond to pings?
Can you switch VTYs?
Do the num-lock/caps-lock LEDs respond?
Do some processes seem to freeze before others?

I suggest you add the following to your kernel config:
options 	KDB			# Enable kernel debugger support.
options 	DDB			# Support DDB.

When it hangs, break into DDB (Ctrl-Alt-Esc on the console or BREAK on
a serial console). As a start, run 'show lockedvnods' and 'ps'. My
guess is that you'll see a lock that has a number of waiters - which is
probably the culprit. Use 'panic' or 'call doadump' to get a crashdump
and then you can use kgdb to rummage around once you reboot - see
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebg-gdb.html

>< makeoptions	DEBUG=-g	# Build kernel with gdb(1) debug symbols

I suggest you add this back in. Without it, you can't debug any
crash dumps that you manage to get (and add "dumpdev" to your rc.conf).

>Now the only reasonable option for me (I mean for production and in
>relatively short term) seems going downward to 5.4 and wait until 6.x
>get more stable

Whilst I realise that you can't have production machines freezing on
schedule, your assistance in providing more information about your
problem will help make 6.x more stable.

-- 
Peter Jeremy