Date: Thu, 04 Nov 2004 07:16:05 +0800
From: David Xu <davidxu@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Cc: cvs-all@freebsd.org
Subject: Re: cvs commit: src/lib/libpthread/thread thr_private.h thr_sig.c
Message-ID: <418966B5.4010404@freebsd.org>
In-Reply-To: <200411031431.16218.jhb@FreeBSD.org>
References: <Pine.GSO.4.43.0411021838240.5097-100000@sea.ntplx.net> <200411031431.16218.jhb@FreeBSD.org>

John Baldwin wrote:

>On Tuesday 02 November 2004 06:40 pm, Daniel Eischen wrote:
>
>>On Wed, 3 Nov 2004, David Xu wrote:
>>
>>>John Baldwin wrote:
>>>
>>>>On Monday 01 November 2004 06:04 pm, David Xu wrote:
>>>>
>>>>>Not every important, I think I have another very important history
>>>>>bug in hand, did you get my "fix famous libpthread conditional
>>>>>variable race condition" mail ? :-)
>>>>>
>>>>Oooo, can I test it please? We are still having problems with mono on
>>>>HEAD here at work. I tried merging the changes in uthread_cond.c 1.32
>>>>to libpthread but that seemed to make it worse. The problems seem to
>>>>be that a signal handler is being run when the SYNCQ sflag is set (but
>>>>the thread is not on a cv or a mutex queue), and the handler calls
>>>>sem_post() which is supposed to be signal safe. sem_post() tries to
>>>>lock a mutex and then bombs with the assertion failure.
>>>>
>>>You can try:
>>>http://people.freebsd.org/~davidxu/kse/thr_cond.c.diff
>>>
>>>But it was not designed to fix the problem you have seen. :-)
>>>
>>I think if _kse_critical_leave() were replaced by _kcb_critical_leave()
>>at around line 676 in thr_kern.c, that should fix the problem, no?
>>There's no reason to do a yield check after leaving the scheduler,
>>and the check for signals and cancellation is done right after
>>that point before returning.
>>
>
>Well, it moved it. :) Now thr_sig_rundown() is called from thr_resume_check()
>from thr_sched_switch_unlocked(), but psf->valid is zero, so it still doesn't
>work. What would happen if the signal came in before curthread->frame was
>set to &psf in thread_sched_switch_unlocked()?
>

I think the race condition is that a signal was delivered to the thread
before the thread set its state to PS_COND_WAIT, so
THR_LOCK_RELEASE(curthread, &(*cond)->c_lock) causes the signal to be
handled at a time when curframe is NULL.
Daniel has a fix for this problem; mine is for the race between
pthread_cond_wait() and pthread_cond_signal()/pthread_cond_broadcast(). :-)
We already have a patch that merges the two fixes, and it is being tested.