From owner-freebsd-hackers@freebsd.org Wed Feb 3 01:04:17 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9E8CAA9AB22 for ; Wed, 3 Feb 2016 01:04:17 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x242.google.com (mail-wm0-x242.google.com [IPv6:2a00:1450:400c:c09::242]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 42B931DCE; Wed, 3 Feb 2016 01:04:17 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x242.google.com with SMTP id r129so4952616wmr.0; Tue, 02 Feb 2016 17:04:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:mail-followup-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=ZsDFtcVKAzFzibtwRScJ7D+u8pnTjOJjbIxhcwRx+jw=; b=v9f+wU6kP4MIGF6P+gbIeKvv3SNH0Dg482v/aOnzohvBi3MX+J5upDHZoWYn45Ekwa 5TS11oJ2O3bPugjAJjiJW2dWO1pnfrWJS6Zrv/+RGxnwW5z3ZggfGDPDC9YZ6BOVedhe BkfC7jmooXZHEMhPQQwrIVm6W/Mh/5eDLdhNHUrXpx45eT/4TV12vipHmArv2BvT2yyW qE24kvFE0gwy3VxauxEOF3U3rFW1Uc4xmYZeHeR5xybR7hr/RZG3tTMineDIEOEPyGUk rtj92FuKcPMHaOzNpAtWkYvB6i00Fjd+VvnWHmzoB0EQNPyP3xXKQ4xLNfQtB/7MeXMT 7oFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id :mail-followup-to:references:mime-version:content-type :content-disposition:in-reply-to:user-agent; bh=ZsDFtcVKAzFzibtwRScJ7D+u8pnTjOJjbIxhcwRx+jw=; b=FeVfB1O5VEhxa8ayQK+1V/Sxqrc3FG2xrNZ0UA4wb5hZmeOz8VdOYF+Lru/sb+0Dx1 yVcX9iqxK9JU6PJK/ls+ik+xapNRbESygJBBp3d1K8YfRXS2qyPSa0v2xFv6cgq02MeS FV0lV3xWgLVR29g7uoy5rRsMg/XXyIqd+AnizXX0dtThJ+85JqWuwmgFWBj+pMBSoUt6 tbq+EWsSuTTbqtS/tZk5TXcaFgSbeg/mYodLhLOMvwgFB7+Hn5VeeTq+/da7WsSSJfBj sl2moUpmVMcrsdpQQQONG33GoGos6u8ukXM8MDNXPTjPEWlTonp6hu1Suya8hw8RSad3 1Gtg== X-Gm-Message-State: AG10YOQbWGCvHpzW51tUbcJPFx7NXmrmfPqhUjIZL2GM3vGOb/PuHCo64nWJLkWnfS6+qg== X-Received: by 10.194.246.37 with SMTP id xt5mr31542101wjc.7.1454461455686; Tue, 02 Feb 2016 17:04:15 -0800 (PST) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. [2001:470:1f08:1f7::2]) by smtp.gmail.com with ESMTPSA id e77sm18851736wma.18.2016.02.02.17.04.14 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Tue, 02 Feb 2016 17:04:14 -0800 (PST) Date: Wed, 3 Feb 2016 02:04:13 +0100 From: Mateusz Guzik To: Konstantin Belousov , freebsd-hackers@freebsd.org Cc: jmg@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160203010412.GC9812@dft-labs.eu> Mail-Followup-To: Mateusz Guzik , Konstantin Belousov , freebsd-hackers@freebsd.org, jmg@freebsd.org References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> <20160202181635.GC91220@kib.kiev.ua> <20160202214427.GB9812@dft-labs.eu> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20160202214427.GB9812@dft-labs.eu> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Feb 2016 01:04:17 -0000 On Tue, Feb 02, 2016 at 10:44:27PM +0100, Mateusz Guzik wrote: > On Tue, Feb 02, 2016 at 08:16:35PM +0200, Konstantin Belousov wrote: > > On Tue, Feb 02, 2016 at 06:56:52PM +0100, Mateusz Guzik wrote: > > > On Tue, Feb 02, 2016 at 03:23:22PM +0200, Konstantin Belousov wrote: > > > > On Tue, Feb 02, 2016 at 05:07:49AM +0100, Mateusz Guzik wrote: > > > > > + flags = fr->fr_flags; > > > > Why not use fr->fr_flags directly ? It is slightly more churn, but IMO > > > > it is worth it. > > > > > > > > > > I'm indiffernet on this one, can change it no problem. > > > > > > > > + /* > > > > > + * Hold the process so that it cannot exit after we make it runnable, > > > > > + * but before we wait for the debugger. > > > > Is this possible ? The forked child must execute through fork_return(), > > > > and there we do ptracestop() before the child has a chance to ever return > > > > to usermode. > > > > > > > > Do you mean a scenario where the debugger detaches before child executes > > > > fork_return() and TDP_STOPATFORK is cleared in advance ? > > > > > > > > > > The comment is somewhat misworded and I forgot to update it, how about > > > just stating we hold the process so that we can mark the thread runnable > > > and not have it disappear under we are done. > > This means that the reader has to guess too much, IMHO. > > > > At least, add a note that despite fork_return() stops when the child > > is traced, it is not enough because ... . > > > > > > While I have not tested this particular bug, prior to the patch the > > > following should possible: p2 is untraced and td2 is marked as runnable, > > > after which it exits and p2 is automatically reaped. If the code reaches > > > the TDB_STOPATFORK check after that, PROC_LOCK(p2) succeeds due to > > I.e. td2 is reused and the TDB_STOPATFORK is set by unrelated activity ? > > You reference the do_fork() code checking TDB_STOPATFORK, and not > > fork_return(), I guess. > > > > > > > processes being type stable. td2 dereference can cause no issues due to > > > threads being type stable as well. But the thread could have been resued > > > in a traced process, thus inducing cv_wait(&p2->p_dbgwait, ..) even > > > though td2 is not linked in p2 anymore and p2 is not even a valid > > > process, making curthread wait indefinitely since there is nobody to > > > wake it up. > > > > > Well, if TDP_STOPATFORK bit is set, it has the same meaning due to type > > stability, and eventually the wake up would be performed. It just the > > unintended sleep waiting for condvar which is problematic and which I > > agree with. > > > CPU0 is executing fork1. p2 is not traced. > > CPU0 CPU1 > p2 and td2 created > td2 is marked runnable > td2 is scheduled here > td2 does not have TDB_STOPATFORK set > td2 exits > p2 is autoreaped > td2's space is reused > td2 gets linked into p3 > td2 gets TDB_STOPATFORK > PROC_LOCK(p2); > TDB_STOPATFORK test on td2 > cv_wait(&p2->p_dbgwait, ..); > > So at this point p2 has no linked threads and is free. td2 belongs to > p3 and p2 is waiting for a wakeup which can't happen. > > Now that I look at it this may be broken in an additonal way, which is > not fixed by the patch: what if td2 spawns a new thread and thr_exits? > In this case testing td2 is still invalid. Maybe I'm just getting > paranoid here, I don't have time to properly test this right now. In > worst case should be fixable well enough with FIRST_THREAD_IN_PROC. > > How about the following comment around _PHOLD: > We are going to make the main thread runnable. It can quickly exit, > causing the process to be reaped and possibly reused, thus invalidating > our p2 pointer. Protect against this by holding the process, which > postpones the exit. > > And if the suspicion gets confimed the following would be added: > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index d0c3837..2a076ed 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -773,6 +773,12 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > PROC_LOCK(p2); > /* > + * By the time we got here the thread could have created a new thread > + * and exited. Reload the main thread to ensure we got the right > + * pointer. > + */ > + td2 = FIRST_THREAD_IN_PROC(p2); > + /* > * Wait until debugger is attached to child. > */ > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > > To end the speculation and hackery I decided to reorganize the func a little bit instead. Namely it can be trivially modified to not drop the lock proc lock, which gets rid of all aforementioned races. I got 2 variants. For brevity both patches are applied on top of the current patchset. When combined, the patch would be combined with second patch in the patchset. Both currently and with the first patch below knote_fork can get a now-freed or even recycled pid. I find that odd. This variant only saves the pid and calls knote_fork. This is likely fine enough. Just in case, the second variant adds a primitive - knlist_empty_lockless to perform a racy check to see if there are any knotes. If so, the process is held and released after knote_fork. Note that this is no more racy than current code with respect to spotting knotes. What does improve is the fact that the process is guaranteed to be around for during knote_fork, although I don't know how important this is. With both variants we also save one lock/unlock round, and a second lock/unlock with of knotes for the second variant in the common case. Variant 1: diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index d0c3837..fae4eaf 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -378,7 +378,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * struct vmspace *vm2, struct file *fp_procdesc) { struct proc *p1, *pptr; - int trypid; + int p2_pid, trypid; struct filedesc *fd; struct filedesc_to_leader *fdtol; struct sigacts *newsigacts; @@ -715,11 +715,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) dtrace_fasttrap_fork(p1, p2); #endif - /* - * Hold the process so that it cannot exit after we make it runnable, - * but before we wait for the debugger. - */ - _PHOLD(p2); if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | P_FOLLOWFORK)) { /* @@ -737,7 +732,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * td->td_pflags |= TDP_RFPPWAIT; td->td_rfppwait_p = p2; } - PROC_UNLOCK(p2); /* * Now can be swapped. @@ -745,16 +739,10 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * _PRELE(p1); PROC_UNLOCK(p1); - /* - * Tell any interested parties about the new process. - */ - knote_fork(&p1->p_klist, p2->p_pid); SDT_PROBE3(proc, , , create, p2, p1, flags); - if (flags & RFPROCDESC) { + if (flags & RFPROCDESC) procdesc_finit(p2->p_procdesc, fp_procdesc); - fdrop(fp_procdesc, td); - } if ((flags & RFSTOPPED) == 0) { /* @@ -771,15 +759,26 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * *fr->fr_procp = p2; } - PROC_LOCK(p2); /* * Wait until debugger is attached to child. */ while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) cv_wait(&p2->p_dbgwait, &p2->p_mtx); - _PRELE(p2); racct_proc_fork_done(p2); + /* + * The process can exit and be waited on after we drop the lock. Save + * the pid so that it can be used for knotes. + */ + p2_pid = p2->p_pid; PROC_UNLOCK(p2); + + /* + * Tell any interested parties about the new process. + */ + knote_fork(&p1->p_klist, p2_pid); + + if (flags & RFPROCDESC) + fdrop(fp_procdesc, td); } int ======================== Variant 2: diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c index d41ac96..3610d8a 100644 --- a/sys/kern/kern_event.c +++ b/sys/kern/kern_event.c @@ -2038,12 +2038,22 @@ knlist_remove_inevent(struct knlist *knl, struct knote *kn) (kn->kn_status & KN_HASKQLOCK) == KN_HASKQLOCK); } +/* + * For when the caller accepts that the check is inherently racy. + */ +int +knlist_empty_lockless(struct knlist *knl) +{ + + return SLIST_EMPTY(&knl->kl_list); +} + int knlist_empty(struct knlist *knl) { KNL_ASSERT_LOCKED(knl); - return SLIST_EMPTY(&knl->kl_list); + return knlist_empty_lockless(knl); } static struct mtx knlist_lock; diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index d0c3837..70490ef 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -378,7 +378,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * struct vmspace *vm2, struct file *fp_procdesc) { struct proc *p1, *pptr; - int trypid; + int p2_held, trypid; struct filedesc *fd; struct filedesc_to_leader *fdtol; struct sigacts *newsigacts; @@ -387,6 +387,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * sx_assert(&proctree_lock, SX_SLOCKED); sx_assert(&allproc_lock, SX_XLOCKED); + p2_held = 0; p1 = td->td_proc; flags = fr->fr_flags; @@ -715,11 +716,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) dtrace_fasttrap_fork(p1, p2); #endif - /* - * Hold the process so that it cannot exit after we make it runnable, - * but before we wait for the debugger. - */ - _PHOLD(p2); if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | P_FOLLOWFORK)) { /* @@ -737,7 +733,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * td->td_pflags |= TDP_RFPPWAIT; td->td_rfppwait_p = p2; } - PROC_UNLOCK(p2); /* * Now can be swapped. @@ -745,16 +740,10 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * _PRELE(p1); PROC_UNLOCK(p1); - /* - * Tell any interested parties about the new process. - */ - knote_fork(&p1->p_klist, p2->p_pid); SDT_PROBE3(proc, , , create, p2, p1, flags); - if (flags & RFPROCDESC) { + if (flags & RFPROCDESC) procdesc_finit(p2->p_procdesc, fp_procdesc); - fdrop(fp_procdesc, td); - } if ((flags & RFSTOPPED) == 0) { /* @@ -771,15 +760,32 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * *fr->fr_procp = p2; } - PROC_LOCK(p2); /* * Wait until debugger is attached to child. */ while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) cv_wait(&p2->p_dbgwait, &p2->p_mtx); - _PRELE(p2); racct_proc_fork_done(p2); + if (!knlist_empty_lockless(&p1->p_klist)) { + /* + * Hold the process so that it does not exit until we call + * knote_fork. + */ + _PHOLD(p2); + p2_held = 1; + } PROC_UNLOCK(p2); + + if (p2_held) { + /* + * Tell any interested parties about the new process. + */ + knote_fork(&p1->p_klist, p2->p_pid); + PRELE(p2); + } + + if (flags & RFPROCDESC) + fdrop(fp_procdesc, td); } int diff --git a/sys/sys/event.h b/sys/sys/event.h index 0f13231..771b3bb 100644 --- a/sys/sys/event.h +++ b/sys/sys/event.h @@ -254,6 +254,7 @@ extern void knote_fork(struct knlist *list, int pid); extern void knlist_add(struct knlist *knl, struct knote *kn, int islocked); extern void knlist_remove(struct knlist *knl, struct knote *kn, int islocked); extern void knlist_remove_inevent(struct knlist *knl, struct knote *kn); +extern int knlist_empty_lockless(struct knlist *knl); extern int knlist_empty(struct knlist *knl); extern void knlist_init(struct knlist *knl, void *lock, void (*kl_lock)(void *), void (*kl_unlock)(void *), -- Mateusz Guzik