From owner-freebsd-current@FreeBSD.ORG Thu Mar 31 20:54:34 2011
Delivered-To: freebsd-current@freebsd.org
Message-ID: <4D94EA1D.9010708@freebsd.org>
Date: Thu, 31 Mar 2011 13:54:53 -0700
From: Julian Elischer
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.2.15) Gecko/20110303 Thunderbird/3.1.9
MIME-Version: 1.0
To: John Baldwin
References: <201103311418.31658.jhb@freebsd.org> <201103311437.19682.jhb@freebsd.org>
In-Reply-To: <201103311437.19682.jhb@freebsd.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Attilio Rao, freebsd-current@freebsd.org, Svatopluk Kraus
Subject: Re: schedcpu() in /sys/kern/sched_4bsd.c calls thread_lock() on thread with un-initialized td_lock
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Thu, 31 Mar 2011 20:54:34 -0000

On 3/31/11 11:37 AM, John Baldwin wrote:
> On Thursday, March 31, 2011 2:20:11 pm Attilio Rao wrote:
>> 2011/3/31 John Baldwin:
>>> On Thursday, March 31, 2011 12:34:31 pm Attilio Rao wrote:
>>>> 2011/3/31 John Baldwin:
>>>>> On Thursday, March 31, 2011 7:32:26 am Svatopluk Kraus wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've got a page fault (because of NULL td_lock) in
>>>>>> thread_lock_flags() called from schedcpu() in /sys/kern/sched_4bsd.c
>>>>>> file. During process fork, new thread is linked to new process which
>>>>>> is linked to allproc list and both allproc_lock and new process lock
>>>>>> are unlocked before sched_fork() is called, where new thread td_lock
>>>>>> is initialized. Only PRS_NEW process status is on sentry but not
>>>>>> checked in schedcpu().
>>>>> I think this should fix it:
>>>>>
>>>>> Index: sched_4bsd.c
>>>>> ===================================================================
>>>>> --- sched_4bsd.c (revision 220190)
>>>>> +++ sched_4bsd.c (working copy)
>>>>> @@ -463,6 +463,10 @@ schedcpu(void)
>>>>>   sx_slock(&allproc_lock);
>>>>>   FOREACH_PROC_IN_SYSTEM(p) {
>>>>>           PROC_LOCK(p);
>>>>> +         if (p->p_state == PRS_NEW) {
>>>>> +                 PROC_UNLOCK(p);
>>>>> +                 continue;
>>>>> +         }
>>>>>           FOREACH_THREAD_IN_PROC(p, td) {
>>>>>                   awake = 0;
>>>>>                   thread_lock(td);
>>>>>
>>>> I don't really think this fix is right because otherwise, when using
>>>> sched_4bsd anytime we are going to scan the thread list within a proc
>>>> we need to check for PRS_NEW.
>>>>
>>>> We likely need to change the init scheme for the td_lock by having a
>>>> scheduler primitive setting it and doing that on thread_init() UMA
>>>> constructor, or similar approach.
>>> But the thread state isn't valid anyway. 4BSD shouldn't be touching the
>>> thread since it is in an incomplete / undefined state.
>> Yep, in this case I'd then want to just add the threads to proc once
>> they are fully initialized.
>>
>> It is pointless (and dangerous) to replicate this check all over,
>> besides we want scheduler agnostic code, which means every iterations
>> of p_threads will need to check for a valid state of threads.
> Yes, we do have to check for PRS_NEW in many places with the current approach,
> but we need some way to reserve the PID to avoid duplicates and unless we
> expand the scope of allproc in fork by a whole lot or stop using the allproc
> list to track "pids in use", we will be stuck with some sort of "process
> is still being built" sentry.
>

The pid used to be reserved in the pid hash; the process was not put
into the proc list until it was set up. I know you don't believe me, but
I'm pretty sure that's how it was around 2000.
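
To make the "reserve the pid first, publish the proc later" idea concrete,
here is a minimal userland sketch. It is not the kernel code from 2000 or
from today: every name in it (pid_reserve, proc_publish, pid_in_use,
allproc_head, count_uninitialized) is hypothetical, the "pid hash" is just
a flag array, and all locking is left out. It only shows that a list
scanner never sees a half-built entry if the pid is reserved somewhere
other than the list being scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PID_MAX 64              /* hypothetical, tiny on purpose */

struct proc {
	int pid;
	bool initialized;       /* stands in for "td_lock etc. are set up" */
	struct proc *next;      /* link in the all-processes list */
};

static bool pid_in_use[PID_MAX];   /* stands in for the pid hash */
static struct proc *allproc_head;  /* stands in for the allproc list */

/*
 * Reserve a pid without linking anything into the process list.
 * Returns the reserved pid, or -1 if none is free.
 */
static int
pid_reserve(void)
{
	for (int pid = 1; pid < PID_MAX; pid++) {
		if (!pid_in_use[pid]) {
			pid_in_use[pid] = true;
			return (pid);
		}
	}
	return (-1);
}

/*
 * Finish setting up the proc, then link it into the list.
 * The pid must already have been reserved.
 */
static void
proc_publish(struct proc *p, int pid)
{
	assert(pid > 0 && pid < PID_MAX && pid_in_use[pid]);
	p->pid = pid;
	p->initialized = true;
	p->next = allproc_head;
	allproc_head = p;
}

/*
 * A schedcpu()-like scan: count list entries that are not fully set up.
 * With the scheme above this is always 0, so no PRS_NEW-style check is
 * needed in the scanner.
 */
static int
count_uninitialized(void)
{
	int n = 0;

	for (struct proc *p = allproc_head; p != NULL; p = p->next)
		if (!p->initialized)
			n++;
	return (n);
}
```

A caller would do pid_reserve() early in its fork path, build the proc
and its thread in private, and call proc_publish() last. In a real
kernel both steps would need to happen under the appropriate locks,
which this sketch ignores.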