From: John Baldwin
To: freebsd-current@FreeBSD.org
Cc: current@FreeBSD.org, Kris Kennaway
Date: Thu, 6 Jan 2005 14:48:34 -0500
Subject: Re: Deadlock under recent 6.0
Message-Id: <200501061448.34992.jhb@FreeBSD.org>
In-Reply-To: <20041218221322.GA18557@xor.obsecurity.org>

On Saturday 18 December 2004 05:13 pm, Kris Kennaway wrote:
> After updating it earlier this week, one of my SMP machines (with
> SCHED_4BSD) is regularly deadlocking under load; nothing is reported
> by WITNESS.
> A sample process listing from DDB and some stack traces
> are as follows:
>
> db> ps
>   pid   proc     uid  ppid  pgrp  flag    stat  wmesg  wchan  cmd
> 83025 c8142dc8    0 83014 77576 0004000 [CPU 1] kldload

[ snip, many threads blocked on Giant ]

>    12 c563e9d8    0     0     0 000020c [CPU 0] idle: cpu0
>    11 c563ebd0    0     0     0 000020c [Can run] idle: cpu1
>     1 c563edc8    0     0     1 0004200 [SLPQ wait 0xc563edc8][SLP] init
>    10 c5647000    0     0     0 0000204 [SLPQ ktrace 0xc074e818][SLP] ktrace
>     0 c074dd80    0     0     0 0000200 [SLPQ sched 0xc074dd80][SLP] swapper
> db> tr 83025
> Tracing pid 83025 tid 100398 td 0xc8145b80
> sched_switch(c0758848,c075f120,1,c0758848,c075f0e0) at sched_switch+0xfe
> w_data(e8721c73,8b01c783,75db851b,83d231d0,d08914c4) at w_data+0x16a8

Unless you are on CPU 1, you aren't going to get an accurate trace of
that thread.  That thread is the one holding Giant, so it is the one
that needs to be looked at.  If kldload is the thread that is always
running across several different attempts at breaking into ddb, doing
a ps, and then doing a continue, then I'd look for some kind of
infinite loop while holding Giant.

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
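
P.S. The "is it always the same thread running?" check can be done by eye, but if the ps output from each break into ddb has been captured as text (for example from a serial console log), a rough sketch like the one below could do the comparison. The function names, the regular expression, and the assumed column layout (pid, proc address, then a "[CPU n]" state followed by the command) are my own assumptions based on the listing quoted above; none of this is part of ddb.

```python
import re
from collections import Counter

# Matches a ddb "ps" row whose state column is "[CPU n]", i.e. a thread
# currently executing on a processor. The column layout is an assumption
# taken from the listing quoted in this thread.
ON_CPU = re.compile(r"^\s*(\d+)\s+[0-9a-f]+\s+.*\[CPU (\d+)\]\s*(.*)$")


def running_threads(ps_output):
    """Return the set of (pid, cmd) rows marked [CPU n] in one ps snapshot."""
    found = set()
    for line in ps_output.splitlines():
        line = line.lstrip("> ")  # tolerate mail quoting and indentation
        m = ON_CPU.match(line)
        if m:
            found.add((int(m.group(1)), m.group(3).strip()))
    return found


def always_running(snapshots, ignore_idle=True):
    """Return the (pid, cmd) pairs that are on a CPU in every snapshot."""
    counts = Counter()
    for snap in snapshots:
        counts.update(running_threads(snap))
    result = [key for key, n in counts.items() if n == len(snapshots)]
    if ignore_idle:
        # The per-CPU idle threads are expected to show up as running.
        result = [key for key in result if not key[1].startswith("idle")]
    return sorted(result)
```

Feed it one string per break into ddb, e.g. always_running([snap1, snap2, snap3]). A non-idle entry that survives every snapshot is the candidate to inspect for a loop while holding Giant; an empty result suggests the running thread changes between breaks.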