From owner-freebsd-amd64@FreeBSD.ORG Fri Jan 20 21:48:15 2006
X-Original-To: freebsd-amd64@freebsd.org
Delivered-To: freebsd-amd64@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id AD9BB16A41F;
	Fri, 20 Jan 2006 21:48:15 +0000 (GMT) (envelope-from jp@tns.cz)
Received: from bertik.tns.cz (r2ah164.chello.upc.cz [62.245.97.164])
	by mx1.FreeBSD.org (Postfix) with ESMTP id DE08E43D45;
	Fri, 20 Jan 2006 21:48:13 +0000 (GMT) (envelope-from jp@tns.cz)
Received: by bertik.tns.cz (Postfix, from userid 1000)
	id 71FDE59734; Fri, 20 Jan 2006 22:48:27 +0100 (CET)
Date: Fri, 20 Jan 2006 22:48:27 +0100
From: Josef Pojsl
To: John Baldwin
Message-ID: <20060120214827.GO795@bertik.tns.cz>
References: <20060118071323.GB795@bertik.tns.cz> <200601181240.03366.jhb@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200601181240.03366.jhb@freebsd.org>
User-Agent: Mutt/1.5.6i
Cc: freebsd-amd64@freebsd.org
Subject: Re: 5.4/amd64 not stable
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform
X-List-Received-Date: Fri, 20 Jan 2006 21:48:15 -0000

On Wed, Jan 18, 2006 at 12:40:01PM -0500, John Baldwin wrote:
> It can. :) Can you compile DDB into your kernel to get a stack trace when it
> panics? Also, if you have a kernel.debug, you can run gdb on it and do a
> list of the instruction pointer to get the corresponding file:line. i.e.:
>
> # gdb kernel.debug
> gdb> l *0xffffffff80271a83

John,

thank you for the reply. In the meantime, we were trying hard to reproduce the problem in the lab, unfortunately without success. Ultimately, we decided to upgrade to 6.0/amd64 and put the gear into production. It has been up and running for 53 hours now, so it is very likely that the problem disappeared in 6.0.
As the bosses were getting a bit nervous, we do not plan any further experiments.

There was one issue on the 6.0 system running in production. Due to a misconfiguration of Apache, a huge number of processes had opened a huge number of sockets. As a result, the maximum number of open files (kern.maxfiles) was reached. This did not show up on 5.4, but it should have, because it was running the same misconfigured Apache! Therefore, it _may_ be possible that a panic occurred whenever kern.maxfiles was reached on 5.4. I admit that this is only a weak suspicion, and I have no direct proof of it.

BTW, kern.maxfiles is 500000 (yes, half a million).

-- 
Josef