Date:      Thu, 4 Feb 2016 11:15:56 +0100
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Konstantin Belousov <kib@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, jmg@freebsd.org
Subject:   Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer
Message-ID:  <20160204101556.GB21877@dft-labs.eu>
In-Reply-To: <20160204095341.GO91220@kib.kiev.ua>
References:  <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> <20160202181635.GC91220@kib.kiev.ua> <20160202214427.GB9812@dft-labs.eu> <20160203010412.GC9812@dft-labs.eu> <20160203080514.GA8753@dft-labs.eu> <20160203141329.GF91220@kib.kiev.ua> <20160204093515.GA21877@dft-labs.eu> <20160204095341.GO91220@kib.kiev.ua>

On Thu, Feb 04, 2016 at 11:53:41AM +0200, Konstantin Belousov wrote:
> On Thu, Feb 04, 2016 at 10:35:15AM +0100, Mateusz Guzik wrote:
> > Stuff below is just speculation.
> > 
> > So the remaining problem, after we know the process has to survive, is
> > survival of the thread and its relationship with the process.
> > 
> > The problem stems from not having the proc lock over the entire time
> > from the moment the thread is marked as runnable to the moment where the
> > code is done with it.
> > 
> > Race 1:
> > 
> > CPU0				CPU1
> > p1: p2 and td2 created
> > td2: marked runnable
> > 				td2: scheduled here
> > 				td2: does not have TDB_STOPATFORK set
> > 				td2: calls thr_new
> > 				td2: calls thr_exit
> > 				td2: reused and linked into p3
> > 				td2: gets TDB_STOPATFORK
> > p1: PROC_LOCK(p2);
> > p1: TDB_STOPATFORK test on td2
> > p1: cv_wait(&p2->p_dbgwait, ..);
> > 
> > p2 is the process we want, but td2 now belongs to a different process.
> > 
> > Race 2:
> > 
> > However, the code seems to be even more buggy. To quote:
> > 
> > 	while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
> > 		cv_wait(&p2->p_dbgwait, &p2->p_mtx);
> > 
> > The check is done in a loop which drops the proc lock. This makes me
> > wonder about the following additional race:
> > 
> > p2 is traced, TDB_STOPATFORK is set on td2.
> > 
> > CPU0				CPU1
> > p1: PROC_LOCK(p2);
> > p1: TDB_STOPATFORK test on td2
> > p1: cv_wait(&p2->p_dbgwait, ..);
> > 				td2: is scheduled here
> > 				td2: clears TDB_STOPATFORK
> > 				td2: cv_broadcast(&p2->p_dbgwait)
> > p1: not scheduled yet
> > 				td2: calls thr_new
> > 				td2: calls thr_exit
> > 				td2: is reused and linked into p3
> > 				td2: gets TDB_STOPATFORK
> > p1: scheduled here
> > p1: internal PROC_LOCK(p2); 
> > p1: TDB_STOPATFORK test on td2
> > 
> > But td2 now belongs to p3.
> > 
> > I think the patch below deals with race 1 just fine.
> > 
> > For race 2, it is unclear to me if the while loop is justified. If a
> > single 'if' statement were sufficient, there would be no problem, since
> > the unlock + lock would be avoided, guaranteeing consistency.
> > 
> > I was pondering borrowing fork_return's logic to check if tracing is
> > enabled before testing TDB_STOPATFORK. However, tracing state could have
> > changed several times invalidating the result. Maybe refreshing the
> > pointer to the first thread would do the trick, but imho the lock
> > dropping business is extremely fishy and will have to be dealt with at
> > some point.
> > 
> So if the issue is only reassignment of td2 to p3, why not do the following?
> I think the possible ABA problem, where td2 gets TDB_STOPATFORK set after
> being reused for p2 (and not p3) after yet another fork, is actually fine.
> 
> diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
> index baee954..5bb14e8 100644
> --- a/sys/kern/kern_fork.c
> +++ b/sys/kern/kern_fork.c
> @@ -777,7 +777,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
>  	/*
>  	 * Wait until debugger is attached to child.
>  	 */
> -	while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
> +	while (td2->td_proc == p2 && (td2->td_dbgflags & TDB_STOPATFORK) != 0)
>  		cv_wait(&p2->p_dbgwait, &p2->p_mtx);
>  	_PRELE(p2);
>  	racct_proc_fork_done(p2);

This is definitely fine for the time being. It's just that the unlock+lock
pair inside cv_wait seems extremely error prone, and someone(tm) should
investigate it further at some point(tm).
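
To make the race 2 interleaving concrete, here is a rough userland model of
the patched loop. It is only a sketch and not kernel code: pthread mutexes
and condition variables stand in for the proc lock and cv_wait, the struct
layouts, child_side() and run_model() are made up for illustration, and the
thread reuse is done under p2's mutex purely so the model is deterministic
(the real reuse is not serialized that way).

```c
/* Userland model only: pthreads stand in for the proc lock and p_dbgwait. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define	TDB_STOPATFORK	0x1	/* value is arbitrary in this model */

struct proc {
	pthread_mutex_t	p_mtx;
	pthread_cond_t	p_dbgwait;
};

struct thread {
	struct proc	*td_proc;
	int		td_dbgflags;
};

struct model {
	struct proc	p2, p3;	/* p3 is only used as an identity here */
	struct thread	td2;
};

/* Plays td2 from race 2: clear the flag, wake the parent, get recycled. */
static void *
child_side(void *arg)
{
	struct model *m = arg;

	pthread_mutex_lock(&m->p2.p_mtx);
	m->td2.td_dbgflags &= ~TDB_STOPATFORK;
	pthread_cond_broadcast(&m->p2.p_dbgwait);
	/* Simplification: the reuse completes before the parent runs again. */
	m->td2.td_proc = &m->p3;
	m->td2.td_dbgflags |= TDB_STOPATFORK;
	pthread_mutex_unlock(&m->p2.p_mtx);
	return (NULL);
}

/* Returns 1 if the parent stopped waiting because td2 left p2, else 0. */
int
run_model(void)
{
	struct model m;
	pthread_t child;
	int error, left;

	pthread_mutex_init(&m.p2.p_mtx, NULL);
	pthread_cond_init(&m.p2.p_dbgwait, NULL);
	m.td2.td_proc = &m.p2;
	m.td2.td_dbgflags = TDB_STOPATFORK;

	/* The parent holds the lock, so the child cannot run its part yet. */
	pthread_mutex_lock(&m.p2.p_mtx);
	error = pthread_create(&child, NULL, child_side, &m);
	assert(error == 0);

	/* Patched predicate: give up once td2 no longer belongs to p2. */
	while (m.td2.td_proc == &m.p2 &&
	    (m.td2.td_dbgflags & TDB_STOPATFORK) != 0)
		pthread_cond_wait(&m.p2.p_dbgwait, &m.p2.p_mtx);
	left = (m.td2.td_proc != &m.p2);
	pthread_mutex_unlock(&m.p2.p_mtx);

	pthread_join(child, NULL);
	pthread_cond_destroy(&m.p2.p_dbgwait);
	pthread_mutex_destroy(&m.p2.p_mtx);
	return (left);
}
```

With the original predicate (flag test only) the parent in this model would
go back to sleep with nobody left to wake it up, since the recycled td2 has
the flag set again. The td_proc comparison lets it leave the loop instead.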

-- 
Mateusz Guzik <mjguzik gmail.com>


