Date:      Thu, 04 Nov 2004 07:16:05 +0800
From:      David Xu <davidxu@freebsd.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        cvs-all@freebsd.org
Subject:   Re: cvs commit: src/lib/libpthread/thread thr_private.h thr_sig.c
Message-ID:  <418966B5.4010404@freebsd.org>
In-Reply-To: <200411031431.16218.jhb@FreeBSD.org>
References:  <Pine.GSO.4.43.0411021838240.5097-100000@sea.ntplx.net> <200411031431.16218.jhb@FreeBSD.org>

John Baldwin wrote:

>On Tuesday 02 November 2004 06:40 pm, Daniel Eischen wrote:
>  
>
>>On Wed, 3 Nov 2004, David Xu wrote:
>>    
>>
>>>John Baldwin wrote:
>>>      
>>>
>>>>On Monday 01 November 2004 06:04 pm, David Xu wrote:
>>>>        
>>>>
>>>>>Not very important, I think I have another very important history
>>>>>bug in hand,  did you get my "fix famous libpthread conditional
>>>>>variable race condition" mail ? :-)
>>>>>          
>>>>>
>>>>Oooo, can I test it please?  We are still having problems with mono on
>>>>HEAD here at work.  I tried merging the changes in uthread_cond.c 1.32
>>>>to libpthread but that seemed to make it worse.  The problems seem to
>>>>be that a signal handler is being run when the SYNCQ sflag is set (but
>>>>the thread is not on a cv or a mutex queue), and the handler calls
>>>>sem_post() which is supposed to be signal safe.  sem_post() tries to
>>>>lock a mutex and then bombs with the assertion failure.
>>>>        
>>>>
>>>You can try:
>>>http://people.freebsd.org/~davidxu/kse/thr_cond.c.diff
>>>
>>>But it was not designed to fix the problem you have seen. :-)
>>>      
>>>
>>I think if _kse_critical_leave() were replaced by _kcb_critical_leave()
>>at around line 676 in thr_kern.c, that should fix the problem, no?
>>There's no reason to do a yield check after leaving the scheduler,
>>and the check for signals and cancellation is done right after
>>that point before returning.
>>    
>>
>
>Well, it moved it. :)  Now thr_sig_rundown() is called from thr_resume_check() 
>from thr_sched_switch_unlocked(), but psf->valid is zero, so it still doesn't 
>work.  What would happen if the signal came in before curthread->frame was 
>set to &psf in thr_sched_switch_unlocked()?
>
>  
>
I think the race condition is that a signal was delivered to the thread
before the thread set its state to PS_COND_WAIT, so the
THR_LOCK_RELEASE(curthread, &(*cond)->c_lock) causes it to be handled
at a time when curframe is NULL.
Daniel has a fix for this problem; mine is for the pthread_cond_wait()
and pthread_cond_signal/broadcast race. :-)
We already have a patch that merges the two fixes, and it is being tested.