From owner-freebsd-threads@FreeBSD.ORG  Sun Apr 13 21:55:42 2003
Return-Path: <owner-freebsd-threads@FreeBSD.ORG>
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 4B48A37B407; Sun, 13 Apr 2003 21:55:42 -0700 (PDT)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id EE0B943F75; Sun, 13 Apr 2003 21:55:40 -0700 (PDT)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h3E4teBg016082;
	Mon, 14 Apr 2003 00:55:40 -0400 (EDT)
Received: from localhost (eischen@localhost)h3E4tdhM016079;
	Mon, 14 Apr 2003 00:55:39 -0400 (EDT)
Date: Mon, 14 Apr 2003 00:55:39 -0400 (EDT)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: David Xu <davidxu@freebsd.org>
In-Reply-To: <006001c3023a$65fe01d0$f001a8c0@davidw2k>
Message-ID: <Pine.GSO.4.10.10304140023100.11977-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: freebsd-threads@freebsd.org
Subject: Re: libpthread patch
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD <freebsd-threads.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-threads>,
	<mailto:freebsd-threads-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-threads>
List-Post: <mailto:freebsd-threads@freebsd.org>
List-Help: <mailto:freebsd-threads-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-threads>,
	<mailto:freebsd-threads-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 14 Apr 2003 04:55:42 -0000

On Mon, 14 Apr 2003, David Xu wrote:

> After tested Daniel's pthread patch, I found 50% of ACE test
> program are core dumpped on my machine. So I studied the
> libpthread source code after applied the patch, I found that
> the main problem is thread state transition is not atomic, 
> for example in thr_mutex:

I have an updated version that works for all but the Cached_Conn_Test
(except for the ones that don't work with libc_r).  I narrowed down
the problem to joint and am working on a fix for that.

>     mutex_queue_enq(*m, curthread);
>     curthread->data.mutex = *m;
>     /*
>      This thread is active and is in a critical
>      region (holding the mutex lock); we should
>      be able to safely set the state.
>     */
>     THR_SET_STATE(curthread, PS_MUTEX_WAIT);
> 
>     /* Unlock the mutex structure: */
>     THR_LOCK_RELEASE(curthread, &(*m)->m_lock);
>     /* Schedule the next thread: */
>     _thr_sched_switch(curthread);
> 
> 
> thread sets its state to PS_MUTEX_WAIT, and call _thr_sched_switch,
> but it is not under scheduler lock, so there is a race between
> THR_SET_STATE and thr_sched_switch.

Unlike libc_r, I disassociated thread state from being in any
run or waiting queue.  So it is OK if another thread comes
along and makes the thread runnable.  Making the thread runnable
requires taking the scheduler lock, so by the time thr_sched_switch
takes the scheduler lock, the thread could be runnable again.
There are flags to indicate whether the thread is in the run
queue or the wait queue, so we rely on those flags before
adding/removing threads from those queues (instead of state).

> I have inserted  _kse_critical_enter() before THR_SET_STATE,
> the code looks as following:
> 
>     mutex_queue_enq(*m, curthread);
>     curthread->data.mutex = *m;
>     _kse_critical_enter();
>     /*
>      This thread is active and is in a critical
>      region (holding the mutex lock); we should
>      be able to safely set the state.
>     */
>     THR_SET_STATE(curthread, PS_MUTEX_WAIT);
> 
>     /* Unlock the mutex structure: */
>     THR_LOCK_RELEASE(curthread, &(*m)->m_lock);
>     /* Schedule the next thread: */
>     _thr_sched_switch(curthread);
> 
> I also commented out most code in thr_lock_wait() and
> thr_lock_wakeup(), I think without better scheduler lock,
> these code has race condition, and in most case will
> this cause a thread be reinserted into runq while it 
> is already in this queue.

It shouldn't any more :-)

> now, I can run ACE test programs without any core dumpped,
> and only the following program are failed:
> 
> Cached_Conn_Test
> Conn_Test
> MT_Reactor_Timer_Test
> Malloc_Test
> Process_Strategy_Test
> Thread_Pool_Test
> 
> a complete log file is at:
> http://people.freebsd.org/~davidxu/run_test.log
> 
> the libpthread package I modified is at:
> http://people.freebsd.org/~davidxu/libpthread.tgz

I'll take a look at it to make sure I haven't missed anything.

> Also, I can run crew test program without any problem.

The next thing I want to try is KDE.  It doesn't currently
work, but I'm hopeful that with your changes and my last
set of changes it will.

> I think the whole scheduler lock should be reworked
> to allow state transition is in atomic, my change is
> not SMP safe, only works on UP, because kse_critical_enter
> is only works for UP system. If we fixed this scheduler lock
> problem, I think the libpthread will be stable enough.

I thought that we might want to make changing the state
and switching to the scheduler all under the same lock,
but decided not to go that way.  I was concerned that by
requiring the scheduler lock to be held while atomically
changing the thread state and switching to the scheduler
might introduce weird locking order and wanted to avoid
that.  For instance, inherited priority adjustments and
condition signal broadcasts.  We should look at it again
though, because it may make sense in most cases.

I'll try to get a new patch set out after I fix
pthread_join().  I should have posted my earlier version
but I thought I could fix join quickly.  It turns out
it's a bit more tricky than I thought.

One question about kse_create().  I was thinking about
implementing pthread_setconcurrency().  We need to return
an error if the requested concurrency level cannot be
achieved.  If we do return an error, though, we should
not end up creating _any_ additional KSEs.  If someone
does:

	ret = pthread_setconcurrency(5);

How do we know how many KSEs (within the same KSEG) we can
create?  And if we use ncpu or some other knob, it doesn't
automatically mean we _will_ get the number of KSEs requested.
We can currently only make one at a time, so we may end up
with some additional KSEs but less than requested -- but we
still have to return an error to the caller.

-- 
Dan Eischen