Message-ID: <411D20DF.2000503@samsco.org>
Date: Fri, 13 Aug 2004 14:13:19 -0600
From: Scott Long
To: Doug White
cc: Martin Blapp
cc: freebsd-current@freebsd.org
Subject: Re: Deadlocks with recent SMP current
References: <20040813121208.M31181@cvs.imp.ch> <20040813102922.E93695@carver.gumbysoft.com>
In-Reply-To: <20040813102922.E93695@carver.gumbysoft.com>
List-Id: Discussions about the use of FreeBSD-current

Doug White wrote:
> On Fri, 13 Aug 2004, Martin Blapp wrote:
>
>
>>Since yesterday I'm getting complete deadlocks. This time unrelated
>>the servers are not loaded at all, they just freeze after a while.
>>No break into DDB possible at all.
>
>
> Welcome to the club; I've been having them on my -current builder since Aug
> 4.  I'm going to set up a duplicate box and start binary-searching for the
> offending commit(s).
>
> Preemption is the default, disabled.
>
> My box is a dual-600MHz P3 with 1GB RAM and running kde.  A make -j3
> buildworld will lock it up 75% of the time.  It'll survive a nonparallel
> build, and it'll survive a kernel build.
>
> Haven't tried WITNESS+INVARIANTS yet since it really dogs the machine. :)
>

Can you try the patch below?  It's really only a band-aid, but it might
make things usable for now.  Also, are more lockups being seen under ULE
or under 4BSD?  There was a recent change to ULE (rev 1.120 of
sched_ule.c) that seems to have aggravated the scheduler problems on my
test systems.

Scott

Index: kern_switch.c
===================================================================
RCS file: /usr/ncvs/src/sys/kern/kern_switch.c,v
retrieving revision 1.78
diff -u -r1.78 kern_switch.c
--- kern_switch.c	10 Aug 2004 00:26:25 -0000	1.78
+++ kern_switch.c	13 Aug 2004 20:11:27 -0000
@@ -345,6 +345,8 @@
 		return;
 	}
 
+	critical_enter();
+
 	tda = kg->kg_last_assigned;
 	if ((ke = td->td_kse) == NULL) {
 		if (kg->kg_idle_kses) {
@@ -441,6 +443,7 @@
 		CTR3(KTR_RUNQ, "setrunqueue: held: td%p kg%p pid%d",
 		    td, td->td_ksegrp, td->td_proc->p_pid);
 	}
+	critical_exit();
 }
 
 /*