From: John Baldwin
To: Svatopluk Kraus
Cc: freebsd-current@freebsd.org
Date: Thu, 31 Mar 2011 14:34:41 -0400
Subject: Re: schedcpu() in /sys/kern/sched_4bsd.c calls thread_lock() on thread with un-initialized td_lock
Message-Id: <201103311434.41188.jhb@freebsd.org>
References: <201103310958.51416.jhb@freebsd.org>

On Thursday, March 31, 2011 12:21:45 pm Svatopluk Kraus wrote:
> On Thu, Mar 31, 2011 at 3:58 PM, John Baldwin wrote:
> > On Thursday, March 31, 2011 7:32:26 am Svatopluk Kraus wrote:
> >> Hi,
> >>
> >> I've got a page fault (because of NULL td_lock) in
> >> thread_lock_flags() called from schedcpu() in /sys/kern/sched_4bsd.c
> >> file. During process fork, new thread is linked to new process which
> >> is linked to allproc list and both allproc_lock and new process lock
> >> are unlocked before sched_fork() is called, where new thread td_lock
> >> is initialized. Only PRS_NEW process status is on sentry but not
> >> checked in schedcpu().
> >
> > I think this should fix it:
> >
> > Index: sched_4bsd.c
> > ===================================================================
> > --- sched_4bsd.c	(revision 220190)
> > +++ sched_4bsd.c	(working copy)
> > @@ -463,6 +463,10 @@ schedcpu(void)
> >  	sx_slock(&allproc_lock);
> >  	FOREACH_PROC_IN_SYSTEM(p) {
> >  		PROC_LOCK(p);
> > +		if (p->p_state == PRS_NEW) {
> > +			PROC_UNLOCK(p);
> > +			continue;
> > +		}
> >  		FOREACH_THREAD_IN_PROC(p, td) {
> >  			awake = 0;
> >  			thread_lock(td);
> >
>
> Thanks for patch. Maybe, test p_state not to be PRS_NORMAL could be better?

I thought about that, but zombies are always moved to zombproc atomically with
changing p_state (and under an exclusive allproc_lock) and all the other places
currently use this type of check.

> I've got next (same reason) page fault in thread_lock_flags() called
> from scheduler() in sys/vm/vm_glue.c. I try to search for
> FOREACH_THREAD_IN_PROC() together with FOREACH_PROC_IN_SYSTEM() in
> /sys subtree and next problem could be in deadlkres() in
> sys/kern/kern_clock.c at least.
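
To make the failure mode concrete, here is a rough user-space sketch of the
pattern under discussion. It is not kernel code: the struct layouts are
simplified stand-ins, and scan_threads() and demo() are hypothetical helpers
invented for illustration. It only assumes what the thread states, namely that
a process can be visible on the process list in state PRS_NEW while its
thread's td_lock is still NULL, and that a scanner must skip such a process
before touching its threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
enum proc_state { PRS_NEW, PRS_NORMAL, PRS_ZOMBIE };

struct mtx { int locked; };

struct thread {
	struct mtx *td_lock;	/* NULL until the scheduler's fork hook has run */
	struct thread *td_next;
};

struct proc {
	enum proc_state p_state;
	struct thread *p_threads;
	struct proc *p_next;
};

/*
 * Walk every process and "lock" each of its threads, as the periodic
 * scanners do. Returns the number of threads visited, or -1 if a thread
 * with a NULL td_lock would have been dereferenced (the reported fault).
 */
static int
scan_threads(struct proc *allproc, int skip_new)
{
	struct proc *p;
	struct thread *td;
	int visited = 0;

	for (p = allproc; p != NULL; p = p->p_next) {
		/* The guard from the patch: ignore half-constructed processes. */
		if (skip_new && p->p_state == PRS_NEW)
			continue;
		for (td = p->p_threads; td != NULL; td = td->td_next) {
			if (td->td_lock == NULL)
				return (-1);	/* the kernel would page fault here */
			td->td_lock->locked = 1;	/* thread_lock() stand-in */
			visited++;
			td->td_lock->locked = 0;	/* thread_unlock() stand-in */
		}
	}
	return (visited);
}

/* Build a two-process list: one fully forked, one still in PRS_NEW. */
static int
demo(int skip_new)
{
	struct mtx sched_lock = { 0 };
	struct thread t_ready = { &sched_lock, NULL };
	struct thread t_new = { NULL, NULL };	/* td_lock not yet initialized */
	struct proc p_new = { PRS_NEW, &t_new, NULL };
	struct proc p_ready = { PRS_NORMAL, &t_ready, &p_new };

	return (scan_threads(&p_ready, skip_new));
}
```

With the guard enabled, demo(1) visits only the thread of the fully forked
process; with it disabled, demo(0) reaches the thread whose td_lock is NULL
and reports the would-be fault.
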
Here is a larger patch:

Index: kern/kern_ktrace.c
===================================================================
--- kern/kern_ktrace.c	(revision 220190)
+++ kern/kern_ktrace.c	(working copy)
@@ -882,7 +882,8 @@
 	nfound = 0;
 	LIST_FOREACH(p, &pg->pg_members, p_pglist) {
 		PROC_LOCK(p);
-		if (p_cansee(td, p) != 0) {
+		if (p->p_state == PRS_NEW ||
+		    p_cansee(td, p) != 0) {
 			PROC_UNLOCK(p);
 			continue;
 		}
Index: kern/kern_sig.c
===================================================================
--- kern/kern_sig.c	(revision 220190)
+++ kern/kern_sig.c	(working copy)
@@ -1799,7 +1799,8 @@
 	PGRP_LOCK_ASSERT(pgrp, MA_OWNED);
 	LIST_FOREACH(p, &pgrp->pg_members, p_pglist) {
 		PROC_LOCK(p);
-		if (checkctty == 0 || p->p_flag & P_CONTROLT)
+		if (p->p_state == PRS_NORMAL &&
+		    (checkctty == 0 || p->p_flag & P_CONTROLT))
 			pksignal(p, sig, ksi);
 		PROC_UNLOCK(p);
 	}
@@ -3313,7 +3314,8 @@
 		PGRP_LOCK(sigio->sio_pgrp);
 		LIST_FOREACH(p, &sigio->sio_pgrp->pg_members, p_pglist) {
 			PROC_LOCK(p);
-			if (CANSIGIO(sigio->sio_ucred, p->p_ucred) &&
+			if (p->p_state == PRS_NORMAL &&
+			    CANSIGIO(sigio->sio_ucred, p->p_ucred) &&
 			    (checkctty == 0 || (p->p_flag & P_CONTROLT)))
 				psignal(p, sig);
 			PROC_UNLOCK(p);
Index: kern/kern_clock.c
===================================================================
--- kern/kern_clock.c	(revision 220190)
+++ kern/kern_clock.c	(working copy)
@@ -201,6 +201,10 @@
 		tryl = 0;
 		FOREACH_PROC_IN_SYSTEM(p) {
 			PROC_LOCK(p);
+			if (p->p_state == PRS_NEW) {
+				PROC_UNLOCK(p);
+				continue;
+			}
 			FOREACH_THREAD_IN_PROC(p, td) {
 
 				/*
Index: kern/sched_4bsd.c
===================================================================
--- kern/sched_4bsd.c	(revision 220190)
+++ kern/sched_4bsd.c	(working copy)
@@ -463,6 +463,10 @@
 	sx_slock(&allproc_lock);
 	FOREACH_PROC_IN_SYSTEM(p) {
 		PROC_LOCK(p);
+		if (p->p_state == PRS_NEW) {
+			PROC_UNLOCK(p);
+			continue;
+		}
 		FOREACH_THREAD_IN_PROC(p, td) {
 			awake = 0;
 			thread_lock(td);
Index: kern/kern_resource.c
===================================================================
--- kern/kern_resource.c	(revision 220190)
+++ kern/kern_resource.c	(working copy)
@@ -129,7 +129,8 @@
 		sx_sunlock(&proctree_lock);
 		LIST_FOREACH(p, &pg->pg_members, p_pglist) {
 			PROC_LOCK(p);
-			if (p_cansee(td, p) == 0) {
+			if (p->p_state == PRS_NORMAL &&
+			    p_cansee(td, p) == 0) {
 				if (p->p_nice < low)
 					low = p->p_nice;
 			}
@@ -215,7 +216,8 @@
 		sx_sunlock(&proctree_lock);
 		LIST_FOREACH(p, &pg->pg_members, p_pglist) {
 			PROC_LOCK(p);
-			if (p_cansee(td, p) == 0) {
+			if (p->p_state == PRS_NORMAL &&
+			    p_cansee(td, p) == 0) {
 				error = donice(td, p, uap->prio);
 				found++;
 			}
@@ -230,7 +232,8 @@
 		sx_slock(&allproc_lock);
 		FOREACH_PROC_IN_SYSTEM(p) {
 			PROC_LOCK(p);
-			if (p->p_ucred->cr_uid == uap->who &&
+			if (p->p_state == PRS_NORMAL &&
+			    p->p_ucred->cr_uid == uap->who &&
 			    p_cansee(td, p) == 0) {
 				error = donice(td, p, uap->prio);
 				found++;
Index: vm/vm_glue.c
===================================================================
--- vm/vm_glue.c	(revision 220190)
+++ vm/vm_glue.c	(working copy)
@@ -730,7 +730,8 @@
 	sx_slock(&allproc_lock);
 	FOREACH_PROC_IN_SYSTEM(p) {
 		PROC_LOCK(p);
-		if (p->p_flag & (P_SWAPPINGOUT | P_SWAPPINGIN | P_INMEM)) {
+		if (p->p_state == PRS_NEW ||
+		    p->p_flag & (P_SWAPPINGOUT | P_SWAPPINGIN | P_INMEM)) {
 			PROC_UNLOCK(p);
 			continue;
 		}

-- 
John Baldwin