From: John Baldwin
To: Brian Dean
Date: Thu, 4 Jan 2007 12:53:47 -0500
References: <20061214190510.GA26590@neutrino.bsdhome.com> <200612272350.43680.jhb@freebsd.org> <20070104152754.GA94609@neutrino.bsdhome.com>
In-Reply-To: <20070104152754.GA94609@neutrino.bsdhome.com>
Message-Id: <200701041253.48092.jhb@freebsd.org>
Cc: freebsd-hackers@freebsd.org, "R.
 Tyler Ballance", Brian Dean
Subject: Re: Kernel hang on 6.x

On Thursday 04 January 2007 10:27, Brian Dean wrote:
> On Wed, Dec 27, 2006 at 11:50:43PM -0500, John Baldwin wrote:
>
> > The 'traceall' seemed to miss several threads actually (like pid
> > 18). Can you get a 'ps'? Also, are you able to get a kernel dump
> > when this happens?
>
> I can't ps that particular session since it is no longer available,
> however I can reproduce another one and generate a new set of debug
> output. One note, the "swap_pager: indefinite wait buffer: ..."
> timeout message may have been a result of a misconfigured secondary
> swap file, so that might be a red herring. However, we can still
> reliably reproduce the hang with 32 Gig swap, but we don't get any
> console messages associated with it.
>
> The system is set up as a test system so I'm not under any pressure to
> get it rebooted and back up when it hangs, so I have the ability to
> take some time to debug it.
>
> I believe that I can generate a kernel dump. We tried this yesterday
> but didn't have a dump device configured. I think we've got that set
> up now and plan to generate a kernel dump. I'm assuming that since
> the process size and swap size is so large, that the dump size is
> going to be very large also, on the order of 32 Gig. I believe I can
> host this on a server and make it accessible to you if you are willing
> to download it.

If this is 6.x, turn on minidumps via the sysctl. The dump size normally is
the size of RAM. With minidumps it can be a lot smaller. If you get a dump,
let me know and I'll point you at some gdb scripts to generate 'ps' type
output, etc.

-- 
John Baldwin
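
For reference, the minidump step suggested in the reply might look roughly
like the sketch below. It assumes the sysctl in question is debug.minidump
(the message does not name it) and that the running kernel exposes it; the
status variable is purely illustrative. It needs to be run as root to change
the setting.

```sh
#!/bin/sh
# Sketch only: assumes the knob is debug.minidump, which the message
# itself does not name. Falls through cleanly where it is absent.
if ! sysctl -n debug.minidump >/dev/null 2>&1; then
    # Kernel (or OS) does not expose the knob at all.
    minidump_status="unsupported"
elif sysctl debug.minidump=1 >/dev/null 2>&1; then
    # Subsequent crash dumps should be minidumps rather than full RAM images.
    minidump_status="enabled"
else
    # The knob exists but could not be set, most likely a privilege issue.
    minidump_status="failed (are you root?)"
fi
echo "minidumps: $minidump_status"
```

A dump device still has to be configured separately, as noted in the quoted
text; this only changes what gets written to it.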