Date: Wed, 3 Feb 2016 09:05:15 +0100 From: Mateusz Guzik <mjguzik@gmail.com> To: Konstantin Belousov <kostikbel@gmail.com>, freebsd-hackers@freebsd.org, jmg@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160203080514.GA8753@dft-labs.eu> In-Reply-To: <20160203010412.GC9812@dft-labs.eu> References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> <20160202181635.GC91220@kib.kiev.ua> <20160202214427.GB9812@dft-labs.eu> <20160203010412.GC9812@dft-labs.eu>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Feb 03, 2016 at 02:04:13AM +0100, Mateusz Guzik wrote: > On Tue, Feb 02, 2016 at 10:44:27PM +0100, Mateusz Guzik wrote: > > On Tue, Feb 02, 2016 at 08:16:35PM +0200, Konstantin Belousov wrote: > > > On Tue, Feb 02, 2016 at 06:56:52PM +0100, Mateusz Guzik wrote: > > > > On Tue, Feb 02, 2016 at 03:23:22PM +0200, Konstantin Belousov wrote: > > > > > On Tue, Feb 02, 2016 at 05:07:49AM +0100, Mateusz Guzik wrote: > > > > > > + flags = fr->fr_flags; > > > > > Why not use fr->fr_flags directly ? It is slightly more churn, but IMO > > > > > it is worth it. > > > > > > > > > > > > > I'm indiffernet on this one, can change it no problem. > > > > > > > > > > + /* > > > > > > + * Hold the process so that it cannot exit after we make it runnable, > > > > > > + * but before we wait for the debugger. > > > > > Is this possible ? The forked child must execute through fork_return(), > > > > > and there we do ptracestop() before the child has a chance to ever return > > > > > to usermode. > > > > > > > > > > Do you mean a scenario where the debugger detaches before child executes > > > > > fork_return() and TDP_STOPATFORK is cleared in advance ? > > > > > > > > > > > > > The comment is somewhat misworded and I forgot to update it, how about > > > > just stating we hold the process so that we can mark the thread runnable > > > > and not have it disappear under we are done. > > > This means that the reader has to guess too much, IMHO. > > > > > > At least, add a note that despite fork_return() stops when the child > > > is traced, it is not enough because ... . > > > > > > > > While I have not tested this particular bug, prior to the patch the > > > > following should possible: p2 is untraced and td2 is marked as runnable, > > > > after which it exits and p2 is automatically reaped. If the code reaches > > > > the TDB_STOPATFORK check after that, PROC_LOCK(p2) succeeds due to > > > I.e. td2 is reused and the TDB_STOPATFORK is set by unrelated activity ? > > > You reference the do_fork() code checking TDB_STOPATFORK, and not > > > fork_return(), I guess. > > > > > > > > > > processes being type stable. td2 dereference can cause no issues due to > > > > threads being type stable as well. But the thread could have been resued > > > > in a traced process, thus inducing cv_wait(&p2->p_dbgwait, ..) even > > > > though td2 is not linked in p2 anymore and p2 is not even a valid > > > > process, making curthread wait indefinitely since there is nobody to > > > > wake it up. > > > > > > > Well, if TDP_STOPATFORK bit is set, it has the same meaning due to type > > > stability, and eventually the wake up would be performed. It just the > > > unintended sleep waiting for condvar which is problematic and which I > > > agree with. > > > > > > CPU0 is executing fork1. p2 is not traced. > > > > CPU0 CPU1 > > p2 and td2 created > > td2 is marked runnable > > td2 is scheduled here > > td2 does not have TDB_STOPATFORK set > > td2 exits > > p2 is autoreaped > > td2's space is reused > > td2 gets linked into p3 > > td2 gets TDB_STOPATFORK > > PROC_LOCK(p2); > > TDB_STOPATFORK test on td2 > > cv_wait(&p2->p_dbgwait, ..); > > > > So at this point p2 has no linked threads and is free. td2 belongs to > > p3 and p2 is waiting for a wakeup which can't happen. > > > > Now that I look at it this may be broken in an additonal way, which is > > not fixed by the patch: what if td2 spawns a new thread and thr_exits? > > In this case testing td2 is still invalid. Maybe I'm just getting > > paranoid here, I don't have time to properly test this right now. In > > worst case should be fixable well enough with FIRST_THREAD_IN_PROC. > > > > How about the following comment around _PHOLD: > > We are going to make the main thread runnable. It can quickly exit, > > causing the process to be reaped and possibly reused, thus invalidating > > our p2 pointer. Protect against this by holding the process, which > > postpones the exit. > > > > And if the suspicion gets confimed the following would be added: > > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > > index d0c3837..2a076ed 100644 > > --- a/sys/kern/kern_fork.c > > +++ b/sys/kern/kern_fork.c > > @@ -773,6 +773,12 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > > > PROC_LOCK(p2); > > /* > > + * By the time we got here the thread could have created a new thread > > + * and exited. Reload the main thread to ensure we got the right > > + * pointer. > > + */ > > + td2 = FIRST_THREAD_IN_PROC(p2); > > + /* > > * Wait until debugger is attached to child. > > */ > > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > > > > > > To end the speculation and hackery I decided to reorganize the func a > little bit instead. Namely it can be trivially modified to not drop the > lock proc lock, which gets rid of all aforementioned races. Here is a trivial change retaining doing knote_fork before waiting for the debugger. Here the problem is worked around by making td2 runnable after knote_fork is performed and without relocking p2 in-between. diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index d0c3837..366262f 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -715,11 +715,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) dtrace_fasttrap_fork(p1, p2); #endif - /* - * Hold the process so that it cannot exit after we make it runnable, - * but before we wait for the debugger. - */ - _PHOLD(p2); if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | P_FOLLOWFORK)) { /* @@ -756,6 +751,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * fdrop(fp_procdesc, td); } + PROC_LOCK(p2); if ((flags & RFSTOPPED) == 0) { /* * If RFSTOPPED not requested, make child runnable and @@ -771,13 +767,11 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * *fr->fr_procp = p2; } - PROC_LOCK(p2); /* * Wait until debugger is attached to child. */ while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) cv_wait(&p2->p_dbgwait, &p2->p_mtx); - _PRELE(p2); racct_proc_fork_done(p2); PROC_UNLOCK(p2); } > > I got 2 variants. For brevity both patches are applied on top of the > current patchset. When combined, the patch would be combined with second > patch in the patchset. > > Both currently and with the first patch below knote_fork can get a > now-freed or even recycled pid. I find that odd. This variant only saves > the pid and calls knote_fork. This is likely fine enough. > > Just in case, the second variant adds a primitive - > knlist_empty_lockless to perform a racy check to see if there are any > knotes. If so, the process is held and released after knote_fork. Note > that this is no more racy than current code with respect to spotting > knotes. What does improve is the fact that the process is guaranteed to > be around for during knote_fork, although I don't know how important > this is. > > With both variants we also save one lock/unlock round, and a second > lock/unlock with of knotes for the second variant in the common case. > > Variant 1: > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index d0c3837..fae4eaf 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -378,7 +378,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > struct vmspace *vm2, struct file *fp_procdesc) > { > struct proc *p1, *pptr; > - int trypid; > + int p2_pid, trypid; > struct filedesc *fd; > struct filedesc_to_leader *fdtol; > struct sigacts *newsigacts; > @@ -715,11 +715,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) > dtrace_fasttrap_fork(p1, p2); > #endif > - /* > - * Hold the process so that it cannot exit after we make it runnable, > - * but before we wait for the debugger. > - */ > - _PHOLD(p2); > if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | > P_FOLLOWFORK)) { > /* > @@ -737,7 +732,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > td->td_pflags |= TDP_RFPPWAIT; > td->td_rfppwait_p = p2; > } > - PROC_UNLOCK(p2); > > /* > * Now can be swapped. > @@ -745,16 +739,10 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > _PRELE(p1); > PROC_UNLOCK(p1); > > - /* > - * Tell any interested parties about the new process. > - */ > - knote_fork(&p1->p_klist, p2->p_pid); > SDT_PROBE3(proc, , , create, p2, p1, flags); > > - if (flags & RFPROCDESC) { > + if (flags & RFPROCDESC) > procdesc_finit(p2->p_procdesc, fp_procdesc); > - fdrop(fp_procdesc, td); > - } > > if ((flags & RFSTOPPED) == 0) { > /* > @@ -771,15 +759,26 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > *fr->fr_procp = p2; > } > > - PROC_LOCK(p2); > /* > * Wait until debugger is attached to child. > */ > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > cv_wait(&p2->p_dbgwait, &p2->p_mtx); > - _PRELE(p2); > racct_proc_fork_done(p2); > + /* > + * The process can exit and be waited on after we drop the lock. Save > + * the pid so that it can be used for knotes. > + */ > + p2_pid = p2->p_pid; > PROC_UNLOCK(p2); > + > + /* > + * Tell any interested parties about the new process. > + */ > + knote_fork(&p1->p_klist, p2_pid); > + > + if (flags & RFPROCDESC) > + fdrop(fp_procdesc, td); > } > > int > ======================== > > Variant 2: > diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c > index d41ac96..3610d8a 100644 > --- a/sys/kern/kern_event.c > +++ b/sys/kern/kern_event.c > @@ -2038,12 +2038,22 @@ knlist_remove_inevent(struct knlist *knl, struct knote *kn) > (kn->kn_status & KN_HASKQLOCK) == KN_HASKQLOCK); > } > > +/* > + * For when the caller accepts that the check is inherently racy. > + */ > +int > +knlist_empty_lockless(struct knlist *knl) > +{ > + > + return SLIST_EMPTY(&knl->kl_list); > +} > + > int > knlist_empty(struct knlist *knl) > { > > KNL_ASSERT_LOCKED(knl); > - return SLIST_EMPTY(&knl->kl_list); > + return knlist_empty_lockless(knl); > } > > static struct mtx knlist_lock; > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index d0c3837..70490ef 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -378,7 +378,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > struct vmspace *vm2, struct file *fp_procdesc) > { > struct proc *p1, *pptr; > - int trypid; > + int p2_held, trypid; > struct filedesc *fd; > struct filedesc_to_leader *fdtol; > struct sigacts *newsigacts; > @@ -387,6 +387,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > sx_assert(&proctree_lock, SX_SLOCKED); > sx_assert(&allproc_lock, SX_XLOCKED); > > + p2_held = 0; > p1 = td->td_proc; > flags = fr->fr_flags; > > @@ -715,11 +716,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) > dtrace_fasttrap_fork(p1, p2); > #endif > - /* > - * Hold the process so that it cannot exit after we make it runnable, > - * but before we wait for the debugger. > - */ > - _PHOLD(p2); > if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | > P_FOLLOWFORK)) { > /* > @@ -737,7 +733,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > td->td_pflags |= TDP_RFPPWAIT; > td->td_rfppwait_p = p2; > } > - PROC_UNLOCK(p2); > > /* > * Now can be swapped. > @@ -745,16 +740,10 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > _PRELE(p1); > PROC_UNLOCK(p1); > > - /* > - * Tell any interested parties about the new process. > - */ > - knote_fork(&p1->p_klist, p2->p_pid); > SDT_PROBE3(proc, , , create, p2, p1, flags); > > - if (flags & RFPROCDESC) { > + if (flags & RFPROCDESC) > procdesc_finit(p2->p_procdesc, fp_procdesc); > - fdrop(fp_procdesc, td); > - } > > if ((flags & RFSTOPPED) == 0) { > /* > @@ -771,15 +760,32 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > *fr->fr_procp = p2; > } > > - PROC_LOCK(p2); > /* > * Wait until debugger is attached to child. > */ > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > cv_wait(&p2->p_dbgwait, &p2->p_mtx); > - _PRELE(p2); > racct_proc_fork_done(p2); > + if (!knlist_empty_lockless(&p1->p_klist)) { > + /* > + * Hold the process so that it does not exit until we call > + * knote_fork. > + */ > + _PHOLD(p2); > + p2_held = 1; > + } > PROC_UNLOCK(p2); > + > + if (p2_held) { > + /* > + * Tell any interested parties about the new process. > + */ > + knote_fork(&p1->p_klist, p2->p_pid); > + PRELE(p2); > + } > + > + if (flags & RFPROCDESC) > + fdrop(fp_procdesc, td); > } > > int > diff --git a/sys/sys/event.h b/sys/sys/event.h > index 0f13231..771b3bb 100644 > --- a/sys/sys/event.h > +++ b/sys/sys/event.h > @@ -254,6 +254,7 @@ extern void knote_fork(struct knlist *list, int pid); > extern void knlist_add(struct knlist *knl, struct knote *kn, int islocked); > extern void knlist_remove(struct knlist *knl, struct knote *kn, int islocked); > extern void knlist_remove_inevent(struct knlist *knl, struct knote *kn); > +extern int knlist_empty_lockless(struct knlist *knl); > extern int knlist_empty(struct knlist *knl); > extern void knlist_init(struct knlist *knl, void *lock, > void (*kl_lock)(void *), void (*kl_unlock)(void *), > > -- > Mateusz Guzik <mjguzik gmail.com> -- Mateusz Guzik <mjguzik gmail.com>
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20160203080514.GA8753>