Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 10 Nov 2001 10:09:14 -0500 (EST)
From:      Daniel Eischen <eischen@pcnet1.pcnet.com>
To:        rittle@labs.mot.com
Cc:        freebsd-hackers@FreeBSD.ORG, current@FreeBSD.ORG
Subject:   Re: Report on FreeBSD 4.4 pthread implementation verses boehm-gc
Message-ID:  <Pine.SUN.3.91.1011110095953.20896A-100000@pcnet1.pcnet.com>
In-Reply-To: <200111090250.fA92o7h55180@latour.rsch.comm.mot.com>

next in thread | previous in thread | raw e-mail | index | archive | help
[ Followups to -current ]

On Thu, 8 Nov 2001, Loren James Rittle wrote:

> Hello all,
> 
> I have ported the most recent version of boehm-gc (6.1-alpha) to
> FreeBSD/i386 under the auspice of the gcc project (it will be in Hans'
> 6.1 release and it is on the gcc mainline).  I got one notable thing
> fully configured beyond what is in the ports tree (which is based on
> 6.0): threaded GC is now supported.  However, this work has uncovered
> either a rare race condition in the 4.X pthread implementation (also
> seen on a current 5.0 system) or a bad assumption in the GC signal
> code (abstracted below).  Either way, the result seen is an undetected
> deadlock.  With the following new assertion, I can at least force the
> condition to be detectable in many cases where it would have locked up.
> 
> Two questions come to mind: Is there any condition under which my new
> assumption should not be true?  Is there any obvious mistake that a
> threaded application can make (perhaps related to its signal use) that
> could cause the new assumption to ever be violated?
> 
> I have also seen what I thought was a less important issue, but I now
> see that it is probably related.  After reviewing the FreeBSD uthread
> source code, the issue appears to be a race between the pthread_exit()
> code running in one thread and the pthread_join() code running in
> another thread in conjunction with a sigsuspend() call occurring on a
> signal handler of that second thread.  Under some conditions, an
> errant EINTR would be returned to the pthread_join() caller instead of
> the exit code from the terminated thread.  Under other timing
> conditions, you get the deadlock spotted with the above new assertion.

Try the following patch; this is to -current, you'll have to massage
it a bit for -stable (Hint: s/curthread/_thread_run/ in -stable).

-- 
Dan Eischen

Index: uthread/pthread_private.h
===================================================================
RCS file: /opt/d/CVS/src/lib/libc_r/uthread/pthread_private.h,v
retrieving revision 1.63
diff -u -r1.63 pthread_private.h
--- uthread/pthread_private.h	26 Oct 2001 21:19:22 -0000	1.63
+++ uthread/pthread_private.h	10 Nov 2001 14:35:48 -0000
@@ -601,6 +601,11 @@
 	/* XXX - What about thread->timeout and/or thread->error? */
 };
 
+struct join_status {
+	struct pthread	*thread;
+	int		ret;
+	int		error;
+};
 
 /*
  * Normally thread contexts are stored as jmp_bufs via 
_setjmp()/_longjmp(),
@@ -757,8 +762,12 @@
 	 */
 	int	error;
 
-	/* Pointer to a thread that is waiting to join (NULL if no joiner). */
-	struct pthread *joiner;
+	/*
+	 * The joiner is the thread that is joining to this thread.  The
+	 * join status keeps track of a join operation to another thread.
+	 */
+	struct pthread		*joiner;
+	struct join_status	join_status;
 
 	/*
 	 * The current thread can belong to only one scheduling queue at
Index: uthread/uthread_exit.c
===================================================================
RCS file: /opt/d/CVS/src/lib/libc_r/uthread/uthread_exit.c,v
retrieving revision 1.23
diff -u -r1.23 uthread_exit.c
--- uthread/uthread_exit.c	20 May 2001 23:08:32 -0000	1.23
+++ uthread/uthread_exit.c	10 Nov 2001 14:36:50 -0000
@@ -220,8 +220,9 @@
 		}
 
 		/* Set the return value for the joining thread: */
-		pthread->ret = curthread->ret;
-		pthread->error = 0;
+		pthread->join_status.ret = curthread->ret;
+		pthread->join_status.error = 0;
+		pthread->join_status.thread = NULL;
 
 		/* Make this thread collectable by the garbage collector. */
 		PTHREAD_ASSERT(((curthread->attr.flags & PTHREAD_DETACHED) ==
Index: uthread/uthread_join.c
===================================================================
RCS file: /opt/d/CVS/src/lib/libc_r/uthread/uthread_join.c,v
retrieving revision 1.19
diff -u -r1.19 uthread_join.c
--- uthread/uthread_join.c	16 Aug 2001 06:31:32 -0000	1.19
+++ uthread/uthread_join.c	10 Nov 2001 14:36:14 -0000
@@ -122,18 +122,20 @@
 		pthread->joiner = curthread;
 
 		/* Keep track of which thread we're joining to: */
-		curthread->data.thread = pthread;
+		curthread->join_status.thread = pthread;
 
-		/* Schedule the next thread: */
-		_thread_kern_sched_state(PS_JOIN, __FILE__, __LINE__);
+		while (curthread->join_status.thread == pthread) {
+			/* Schedule the next thread: */
+			_thread_kern_sched_state(PS_JOIN, __FILE__, __LINE__);
+		}
 
 		/*
 		 * The thread return value and error are set by the thread we're
 		 * joining to when it exits or detaches:
 		 */
-		ret = curthread->error;
+		ret = curthread->join_status.error;
 		if ((ret == 0) && (thread_return != NULL))
-			*thread_return = curthread->ret;
+			*thread_return = curthread->join_status.ret;
 	} else {
 		/*
 		 * The thread exited (is dead) without being detached, and no
Index: uthread/uthread_sig.c
===================================================================
RCS file: /opt/d/CVS/src/lib/libc_r/uthread/uthread_sig.c,v
retrieving revision 1.38
diff -u -r1.38 uthread_sig.c
--- uthread/uthread_sig.c	29 Jun 2001 17:09:07 -0000	1.38
+++ uthread/uthread_sig.c	10 Nov 2001 14:28:09 -0000
@@ -671,7 +671,6 @@
 	 * signal handler to run:
 	 */
 	case PS_COND_WAIT:
-	case PS_JOIN:
 	case PS_MUTEX_WAIT:
 		/*
 		 * Remove the thread from the wait queue.  It will
@@ -679,6 +678,17 @@
 		 * handlers have been invoked.
 		 */
 		PTHREAD_WAITQ_REMOVE(pthread);
+		break;
+
+	case PS_JOIN:
+		/*
+		 * Remove the thread from the wait queue.  It will
+		 * be added back to the wait queue once all signal
+		 * handlers have been invoked.
+		 */
+		PTHREAD_WAITQ_REMOVE(pthread);
+		/* Make the thread runnable: */
+		PTHREAD_SET_STATE(pthread, PS_RUNNING);
 		break;
 
 	/*


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.SUN.3.91.1011110095953.20896A-100000>