Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 25 May 2006 10:49:33 -0400
From:      Kurt Miller <kurt@intricatesoftware.com>
To:        freebsd-threads@freebsd.org, Daniel Eischen <deischen@freebsd.org>
Subject:   Re: pthread_cond_signal w/suspended threads
Message-ID:  <200605251049.34486.kurt@intricatesoftware.com>
In-Reply-To: <Pine.GSO.4.64.0605241900060.29100@sea.ntplx.net>
References:  <200605241814.11452.lists@intricatesoftware.com> <Pine.GSO.4.64.0605241900060.29100@sea.ntplx.net>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wednesday 24 May 2006 7:07 pm, Daniel Eischen wrote:
> On Wed, 24 May 2006, Kurt Miller wrote:
> 
> > I've been working on an intermittent deadlock in
> > the jvm and have isolated it to the following issue.
> > When multiple threads are waiting on a condition
> > variable and some of those threads have been suspended
> > with pthread_suspend_np(), there is an expectation that
> > pthread_cond_signal() will signal an unsuspended thread
> > first. However, the following program shows this is not
> > the case on 6.1-release/kse (thr works as expected).
> >
> > Can kse behavior be changed to signal unsuspended
> > threads first?
> 
> Relying on this behavior isn't exactly portable, and what
> if there are no unsuspended threads when the signal occurs.
> A suspended thread would still be signaled in that case.

I agree its not portable, but we are talking about a
non portable function (pthread_suspend_np). :-)
Since we have some latitude here, doesn't it make sense
to have our pthread_suspend_np be feature compatible
with Solaris? At least in the case of the jvm it would
help a bit if this behavior matched Solaris. Below is
a diff that makes the behavior change for your
consideration. Also the test program converted for
Solaris is after that.

As it turns out there are two deadlocks I've found.
The one described here appears to be hard to reproduce
and occurs upon jvm shutdown. In fact I haven't been
able to reproduce it again. The other deadlock I'll
post under another thread when I make a C program to
reproduce it and if it appears to be a thread issue.

> The whole suspended threads thing is kind of dangerous
> anyways.  If one of these threads hold a lock, or is
> waiting in some other queue, then deadlock can occur.
> I think you need a way of waiting until they are out
> of critical regions before you suspend them in order
> for this not to cause problems.

Sure, I agree. Right now the only alternative I see is
to not use pthread_suspend/resume_np and do thread
suspension with signals the way hotspot does it for
linux. I'd perfer not to do that if possible.

--- thr_cond.c.orig	Wed May 24 21:36:18 2006
+++ thr_cond.c	Thu May 25 10:44:46 2006
@@ -596,13 +596,23 @@
 			(*cond)->c_seqno++;
 
 			/*
+			 * Wakeup the first unsuspended thread, otherwise
+			 * the first suspended. Matches Solaris behavior.
+			 */
+			pthread = TAILQ_FIRST(&(*cond)->c_queue);
+			while (pthread != NULL && 
+			    (pthread->flags & THR_FLAGS_SUSPENDED) != 0)
+				pthread = TAILQ_NEXT(pthread, sqe);
+			if (pthread == NULL)
+				pthread = TAILQ_FIRST(&(*cond)->c_queue);
+
+			/*
 			 * Wakeups have to be done with the CV lock held;
 			 * otherwise there is a race condition where the
 			 * thread can timeout, run on another KSE, and enter
 			 * another blocking state (including blocking on a CV).
 			 */
-			if ((pthread = TAILQ_FIRST(&(*cond)->c_queue))
-			    != NULL) {
+			if (pthread != NULL) {
 				THR_SCHED_LOCK(curthread, pthread);
 				cond_queue_remove(*cond, pthread);
 				pthread->sigbackout = NULL;


#include <stdio.h>
#include <unistd.h>
#include <sched.h>
#include <thread.h>

cond_t  cond;
mutex_t mutex;

static void *
start_routine(void *arg)
{
    mutex_lock(&mutex);
    cond_wait(&cond, &mutex);
    printf("unblocked %d\n", (int)arg);
    mutex_unlock(&mutex);
    return (0);
}

int
main(int argc, char *argv[])
{
    thread_t thread1;
    thread_t thread2;

    cond_init(&cond, NULL, NULL);
    mutex_init(&mutex, NULL, NULL);

    thr_create(NULL, NULL, start_routine, (void *)1, NULL, &thread1);
    sleep(1);
    thr_create(NULL, NULL, start_routine, (void *)2, NULL, &thread2);
    sleep(1);
    thr_suspend(thread1);
    cond_signal(&cond);
    sleep(1);
    thr_continue(thread1);
    sleep(1);
    return (0);
}



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200605251049.34486.kurt>