From owner-freebsd-current@FreeBSD.ORG Sun Sep 12 06:39:45 2004
Message-ID: <4143EF29.2080404@elischer.org>
Date: Sat, 11 Sep 2004 23:39:37 -0700
From: Julian Elischer
MIME-Version: 1.0
To: John Baldwin, Peter Wemm, current@freebsd.org
Content-Type: multipart/mixed; boundary="------------070802050901090002030708"
Subject: [Patch] panics/hangs with preemption and threads.
List-Id: Discussions about the use of FreeBSD-current

This is a multi-part message in MIME format.
--------------070802050901090002030708
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Guys,

I think I found a (the?) major cause of the corruption of the
ksegrp/thread run queue for threaded processes when preemption is
turned on.

When a thread is scheduled in setrunqueue(), the first thing that is
done is that it is put in the correct place in the ksegrp's run queue.
Then, if it is in the top N spots (where N is the defined concurrency
and is usually <= NCPU), it is passed down to the system scheduler
using sched_add().
sched_add() can call maybe_preempt(), which can decide to switch out
the current thread and switch to the new one immediately. The trouble
with that is that we have already put the new one on the ksegrp's run
queue! When that thread is next put on the run queue using
setrunqueue(), it is already there, and we end up with an infinitely
looping run queue. Any code that follows that list will never end, and
the system will freeze.

Here is a patch that solves it, but I'm not happy about it.

John, you wrote the preemption code. Do you have any ideas about how
to do this more cleanly?

One possibility is to make sched_add() return a value that indicates
whether the thread was handled immediately. That would allow
setrunqueue() to only put it on the ksegrp's run queue if it was not
already handled.

Other suggestions welcome.

--------------070802050901090002030708
Content-Type: text/plain; name="q.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="q.diff"

==== //depot/projects/nsched/sys/kern/kern_switch.c#21 - /home/julian/p4/nsched/sys/kern/kern_switch.c ====
@@ -396,5 +396,9 @@
 		return;
 	}
 
+	if (((flags & (SRQ_YIELDING|SRQ_OURSELF|SRQ_NOPREEMPT)) == 0) &&
+	    maybe_preempt(td))
+		return;
+
 	tda = kg->kg_last_assigned;
 	if ((kg->kg_avail_opennings <= 0) &&
@@ -453,7 +457,7 @@
 			kg->kg_last_assigned = td2;
 		}
 		kg->kg_avail_opennings--;
-		sched_add(td2, flags);
+		sched_add(td2, flags|SRQ_NOPREEMPT);
 	} else {
 		CTR3(KTR_RUNQ, "setrunqueue: held: td%p kg%p pid%d",
 			td, td->td_ksegrp, td->td_proc->p_pid);
==== //depot/projects/nsched/sys/kern/sched_4bsd.c#48 - /home/julian/p4/nsched/sys/kern/sched_4bsd.c ====
@@ -1018,7 +1018,8 @@
 
 #endif
 	{
-		if (maybe_preempt(td))
+		if (((flags & SRQ_NOPREEMPT) == 0) &&
+		    maybe_preempt(td))
 			return;
 	}
 }
==== //depot/projects/nsched/sys/kern/sched_ule.c#30 - /home/julian/p4/nsched/sys/kern/sched_ule.c ====
@@ -1662,13 +1662,13 @@
 
 	/* let jeff work out how to map the flags better */
 	/* I'm open to suggestions */
-	if (flags & SRQ_YIELDING)
+	if (flags & (SRQ_YIELDING|SRQ_NOPREEMPT)) {
 		/*
 		 * Preempting during switching can be bad JUJU
 		 * especially for KSE processes
 		 */
 		sched_add_internal(td, 0);
-	else
+	} else
 		sched_add_internal(td, 1);
 }
 
==== //depot/projects/nsched/sys/sys/proc.h#29 - /home/julian/p4/nsched/sys/sys/proc.h ====
@@ -658,6 +658,7 @@
 #define	SRQ_YIELDING	0x0001	/* we are yielding (from mi_switch) */
 #define	SRQ_OURSELF	0x0002	/* it is ourself (from mi_switch) */
 #define	SRQ_INTR	0x0004	/* it is probably urgent */
+#define	SRQ_NOPREEMPT	0x0008	/* Just don't ok? */
 
 /* How values for thread_single(). */
 #define	SINGLE_NO_EXIT	0

--------------070802050901090002030708--
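
The alternative floated in the message (have sched_add() report whether
it already switched to the thread, so that setrunqueue() queues it only
when it did not) can be sketched in user space as below. This is only a
toy model under stated assumptions, not the kernel code: the toy_thread
and toy_group types, toy_curthread, the may_preempt argument, and every
toy_* function are hypothetical names invented for illustration, and the
"preemption" here is just a pointer swap rather than a context switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for illustration; these are not the kernel's types. */
struct toy_thread {
    int id;
    int prio;                   /* lower value = more urgent */
    bool queued;                /* true while on the group's run queue */
    struct toy_thread *next;
};

struct toy_group {
    struct toy_thread *head;    /* run queue, sorted by priority */
};

static struct toy_thread *toy_curthread;    /* the "running" thread */

/* Insert td in priority order; inserting td a second time is the bug. */
static void
toy_group_insert(struct toy_group *kg, struct toy_thread *td)
{
    struct toy_thread **pp = &kg->head;

    assert(!td->queued);
    while (*pp != NULL && (*pp)->prio <= td->prio)
        pp = &(*pp)->next;
    td->next = *pp;
    *pp = td;
    td->queued = true;
}

static bool toy_sched_add(struct toy_group *, struct toy_thread *, bool);

/* Queue td only if toy_sched_add() did not already start running it. */
static void
toy_setrunqueue(struct toy_group *kg, struct toy_thread *td, bool may_preempt)
{
    if (toy_sched_add(kg, td, may_preempt))
        return;
    toy_group_insert(kg, td);
}

/*
 * Hypothetical sched_add() variant: returns true if td was switched to
 * immediately, false if the caller still has to queue it.
 */
static bool
toy_sched_add(struct toy_group *kg, struct toy_thread *td, bool may_preempt)
{
    struct toy_thread *old = toy_curthread;

    if (!may_preempt || (old != NULL && td->prio >= old->prio))
        return false;
    toy_curthread = td;
    if (old != NULL)
        toy_setrunqueue(kg, old, false);    /* requeue the preempted thread */
    return true;
}

/* Usage: returns the number of threads left on the group's run queue. */
int
toy_demo(void)
{
    struct toy_group kg = { NULL };
    struct toy_thread a = { 1, 10, false, NULL };
    struct toy_thread b = { 2, 5, false, NULL };
    struct toy_thread c = { 3, 20, false, NULL };
    struct toy_thread *td;
    int n = 0;

    toy_curthread = NULL;
    toy_setrunqueue(&kg, &a, true);     /* nothing running: a runs */
    toy_setrunqueue(&kg, &c, true);     /* less urgent than a: queued */
    toy_setrunqueue(&kg, &b, true);     /* more urgent: preempts a */
    for (td = kg.head; td != NULL; td = td->next) {
        printf("queued: thread %d (prio %d)\n", td->id, td->prio);
        n++;
    }
    printf("running: thread %d\n", toy_curthread->id);
    toy_curthread = NULL;               /* do not keep pointers to locals */
    return n;
}
```

Calling toy_demo() from a main() runs the three-thread scenario. The
point of the design is that a thread that starts running is never left
on the group's queue, so a later setrunqueue() on it cannot insert it a
second time; the assert in toy_group_insert() would trip if it did.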