From: Andrew Gallatin
Date: Tue, 14 Sep 2004 12:06:11 -0400 (EDT)
To: Julian Elischer
Cc: John Baldwin
Cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc

FWIW, for the case where there is one lingering thread, calling
thread_unsuspend_one() on it seems to get it to exit.

Maybe there is some sort of race while exiting which causes the wrong
number of threads to be either suspended or unsuspended. If too many
are suspended, one is left lingering. If too few are suspended, the
system deadlocks because a thread never gets off the CPU.

Would it help at all to try with libthr and see what it does?

Let me know what more I can do to help get this fixed.

Drew

PS: By "one lingering thread", I mean the case I first complained
about. E.g.:

  540 c164e700 e52e1000 1387 1 538 000c482 (threaded) mx_pingpong
  thread 0xc1fb8320 ksegrp 0xc15bb850 [SUSP]
db> tr 540
sched_switch(c1fb8320,0,0,15fc9814,e30bebc7) at sched_switch+0xd8
mi_switch(1,0,e881fc44,c051e6dd,c1fb8320) at mi_switch+0x1c7
thread_single(1,c06eaae0,e881fc64,c164e700,c1fb8320) at thread_single+0x1d7
exit1(c1fb8320,9,0,e881fce4,c051877e) at exit1+0x115
expand_name(c1fb8320,9,100,0,0) at expand_name
postsig(9,202,c06e5dd8,17f,8058f84) at postsig+0x204
ast(e881fd48) at ast+0x5e4
doreti_ast() at doreti_ast+0x17
db> call thread_unsuspend_one(0xc1fb8320)
0xc1562640