From owner-freebsd-threads@FreeBSD.ORG  Sun Dec 23 05:47:17 2007
Date: Sun, 23 Dec 2007 00:47:15 -0500 (EST)
From: Daniel Eischen <deischen@freebsd.org>
X-X-Sender: eischen@sea.ntplx.net
To: Ivan Voras <ivoras@freebsd.org>
In-Reply-To: <fkk5ht$gsj$1@ger.gmane.org>
Message-ID: <Pine.GSO.4.64.0712230035400.29917@sea.ntplx.net>
References: <fkk5ht$gsj$1@ger.gmane.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-threads@freebsd.org
Subject: Re: Proper use of condition variables?

On Sun, 23 Dec 2007, Ivan Voras wrote:

> Hi,
>
> I'm implementing what is basically a producer-consumer setup in which
> two threads communicate only by a message queue. The idea is that the
> consumer waits on a condition variable until something gets in the
> queue, then takes it out and processes it. Unfortunately the program
> deadlocks with the provider waiting in pthread_mutex_lock(queue_mtx) and
> the consumer waiting in pthread_cond_wait(queue_cv, queue_mtx). This is
> on RELENG_7. Am I misreading how pthread_cond_wait should behave? I
> thought it should release the mutex until the cv gets signaled.

That is the way things are supposed to work.  I would try libkse
to see if that makes any difference.

> On the consumer side, the code looks like this:
>
> while (1) {
> 	pthread_mutex_lock(&thr->queue_mtx);
> 	if (STAILQ_EMPTY(&thr->queue))
> [X]		pthread_cond_wait(&thr->queue_cv, &thr->queue_mtx);
>
> 	job = STAILQ_FIRST(&thr->queue);
> 	STAILQ_REMOVE_HEAD(&thr->queue, linkage);
>
> 	pthread_mutex_unlock(&thr->queue_mtx);
>
> 	process(job);
> }
>
> On the server side, it's like this:
>
> [X]	pthread_mutex_lock(&thr->queue_mtx);
> 	STAILQ_INSERT_TAIL(&thr->queue, job, linkage);
> 	pthread_mutex_unlock(&thr->queue_mtx);
> 	pthread_cond_signal(&thr->queue_cv);
>
>
> The two lines that deadlock are marked with [X].

I would put some assert()s in for the pthread_mutex_* and
pthread_cond_* calls, to make sure no errors are being returned.
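
As a rough illustration of that kind of checking (a sketch only, not
the poster's code; checked_increment(), mtx and counter are made-up
names), each call's return value can be stored and then asserted on.
The pthread functions return 0 on success and an error number on
failure rather than setting errno.  Keeping the call itself outside
assert() means it still runs if the program is built with NDEBUG.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/* Hypothetical helper: bump a counter under the mutex, checking each call. */
int
checked_increment(void)
{
	int rc, value;

	rc = pthread_mutex_lock(&mtx);
	assert(rc == 0);	/* 0 means success, anything else is an error number */

	value = ++counter;

	rc = pthread_mutex_unlock(&mtx);
	assert(rc == 0);

	return (value);
}
```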

Also, because pthread_cond_signal() is called outside the lock,
it is possible for job to be NULL.  The server thread may get
preempted after pthread_mutex_unlock() and before
pthread_cond_signal() has completed.  In that time, a consumer
thread can run, pull the job out of the queue, process it and
go back to waiting on the CV again.  The late signal then wakes
the consumer with the queue empty, and since the emptiness check
is an "if" rather than a "while", it is not repeated after the
wakeup, so STAILQ_FIRST() returns NULL.  pthread_cond_wait() must
always be called with a valid mutex and with that mutex locked.
pthread_cond_signal() does not require the mutex to be locked,
but it is generally a good idea to call it inside the locked
region to avoid problems like this.
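
To make the suggestion concrete, here is a minimal sketch of the same
queue with the signal sent while the mutex is held.  It also re-checks
the queue in a while loop, which is the usual guard against stale or
spurious wakeups.  Everything beyond the calls in the quoted code is
an assumption for illustration: struct job, struct worker, the done
flag, enqueue() and run_demo() are hypothetical names, and adding up
job values stands in for process(job).

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical job and per-thread state; names are made up for this sketch. */
struct job {
	int value;
	STAILQ_ENTRY(job) linkage;
};

struct worker {
	pthread_mutex_t queue_mtx;
	pthread_cond_t queue_cv;
	STAILQ_HEAD(, job) queue;
	int done;	/* set under the mutex when no more jobs will arrive */
	long total;	/* written only by the consumer thread */
};

static void *
consumer(void *arg)
{
	struct worker *thr = arg;
	struct job *job;
	int rc;

	for (;;) {
		rc = pthread_mutex_lock(&thr->queue_mtx);
		assert(rc == 0);
		/* Loop, not "if": re-check the predicate after every wakeup. */
		while (STAILQ_EMPTY(&thr->queue) && !thr->done) {
			rc = pthread_cond_wait(&thr->queue_cv, &thr->queue_mtx);
			assert(rc == 0);
		}
		job = STAILQ_FIRST(&thr->queue);
		if (job != NULL)
			STAILQ_REMOVE_HEAD(&thr->queue, linkage);
		rc = pthread_mutex_unlock(&thr->queue_mtx);
		assert(rc == 0);
		if (job == NULL)
			break;	/* queue drained and producer finished */
		thr->total += job->value;	/* stands in for process(job) */
		free(job);
	}
	return (NULL);
}

static void
enqueue(struct worker *thr, int value)
{
	struct job *job = malloc(sizeof(*job));
	int rc;

	assert(job != NULL);
	job->value = value;
	rc = pthread_mutex_lock(&thr->queue_mtx);
	assert(rc == 0);
	STAILQ_INSERT_TAIL(&thr->queue, job, linkage);
	/* Signal while still holding the mutex. */
	rc = pthread_cond_signal(&thr->queue_cv);
	assert(rc == 0);
	rc = pthread_mutex_unlock(&thr->queue_mtx);
	assert(rc == 0);
}

/* Queue the values 1..njobs, wait for the consumer, return their sum. */
long
run_demo(int njobs)
{
	struct worker w;
	pthread_t tid;
	int i, rc;

	rc = pthread_mutex_init(&w.queue_mtx, NULL);
	assert(rc == 0);
	rc = pthread_cond_init(&w.queue_cv, NULL);
	assert(rc == 0);
	STAILQ_INIT(&w.queue);
	w.done = 0;
	w.total = 0;
	rc = pthread_create(&tid, NULL, consumer, &w);
	assert(rc == 0);
	for (i = 1; i <= njobs; i++)
		enqueue(&w, i);
	/* Tell the consumer to exit once the queue is empty. */
	rc = pthread_mutex_lock(&w.queue_mtx);
	assert(rc == 0);
	w.done = 1;
	rc = pthread_cond_signal(&w.queue_cv);
	assert(rc == 0);
	rc = pthread_mutex_unlock(&w.queue_mtx);
	assert(rc == 0);
	rc = pthread_join(tid, NULL);
	assert(rc == 0);
	pthread_cond_destroy(&w.queue_cv);
	pthread_mutex_destroy(&w.queue_mtx);
	return (w.total);
}
```

Calling run_demo(100) from a main() should return 5050, since the
consumer handles every queued job exactly once before it exits.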

-- 
DE