From: Daniel Eischen
Date: Sun, 23 Dec 2007 00:47:15 -0500 (EST)
To: Ivan Voras
Cc: freebsd-threads@freebsd.org
Subject: Re: Proper use of condition variables?
List-Id: Threading on FreeBSD

On Sun, 23 Dec 2007, Ivan Voras wrote:

> Hi,
>
> I'm implementing what is basically a producer-consumer setup in which
> two threads communicate only by a message queue. The idea is that the
> consumer waits on a condition variable until something gets in the
> queue, then takes it out and processes it. Unfortunately the program
> deadlocks with the provider waiting in pthread_mutex_lock(queue_mtx) and
> the consumer waiting in pthread_cond_wait(queue_cv, queue_mtx). This is
> on RELENG_7.
> Am I misreading how pthread_cond_wait should behave? I
> thought it should release the mutex until the cv gets signaled.

That is the way things are supposed to work. I would try libkse
to see if that makes any difference.

> On the consumer side, the code looks like this:
>
> while (1) {
>         pthread_mutex_lock(&thr->queue_mtx);
>         if (STAILQ_EMPTY(&thr->queue))
> [X]             pthread_cond_wait(&thr->queue_cv, &thr->queue_mtx);
>
>         job = STAILQ_FIRST(&thr->queue);
>         STAILQ_REMOVE_HEAD(&thr->queue, linkage);
>
>         pthread_mutex_unlock(&thr->queue_mtx);
>
>         process(job);
> }
>
> On the server side, it's like this:
>
> [X] pthread_mutex_lock(&thr->queue_mtx);
>     STAILQ_INSERT_TAIL(&thr->queue, job, linkage);
>     pthread_mutex_unlock(&thr->queue_mtx);
>     pthread_cond_signal(&thr->queue_cv);
>
>
> The two lines that deadlock are marked with [X].

I would put some assert()s in for the pthread_mutex_* and
pthread_cond_* calls, to make sure no errors are being returned.

Also, because pthread_cond_signal() is called outside the lock,
it is possible for job to be NULL. This is because the server
thread may get preempted after pthread_mutex_unlock() and before
pthread_cond_signal() has completed. In that time, a consumer
thread can run and pull the job out of the queue, process it and
go back to waiting on the CV again. The late signal then wakes
the consumer while the queue is empty.

It is required that pthread_cond_wait() always be called with a
valid mutex and with that mutex locked. It is not required for
pthread_cond_signal() to have the mutex locked, but it is generally
a good idea to call pthread_cond_signal() inside the locked region
to avoid any possible problems like this.

--
DE
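
For illustration only, here is a minimal sketch of the two suggestions
above: check every pthread return value with assert(), and call
pthread_cond_signal() while the mutex is still held. It also re-checks
the queue in a while loop rather than an if; that is the usual POSIX
idiom for guarding against an early or spurious wakeup and is an
addition, not something stated in the reply. The names struct worker,
enqueue(), run_demo(), the integer job payload, and the negative
"stop" sentinel are all hypothetical and not taken from the original
program.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/queue.h>

struct job {
    int value;                       /* hypothetical payload */
    STAILQ_ENTRY(job) linkage;
};

struct worker {
    pthread_mutex_t queue_mtx;
    pthread_cond_t queue_cv;
    STAILQ_HEAD(job_queue, job) queue;
    long sum;                        /* touched only by the consumer */
};

static void *consumer(void *arg)
{
    struct worker *thr = arg;

    for (;;) {
        int rc = pthread_mutex_lock(&thr->queue_mtx);
        assert(rc == 0);
        /* Re-test the predicate after every wakeup. */
        while (STAILQ_EMPTY(&thr->queue)) {
            rc = pthread_cond_wait(&thr->queue_cv, &thr->queue_mtx);
            assert(rc == 0);
        }
        struct job *job = STAILQ_FIRST(&thr->queue);
        STAILQ_REMOVE_HEAD(&thr->queue, linkage);
        rc = pthread_mutex_unlock(&thr->queue_mtx);
        assert(rc == 0);
        (void)rc;

        int value = job->value;
        free(job);
        if (value < 0)               /* hypothetical stop sentinel */
            break;
        thr->sum += value;           /* stands in for process(job) */
    }
    return NULL;
}

static void enqueue(struct worker *thr, int value)
{
    struct job *job = malloc(sizeof(*job));
    assert(job != NULL);
    job->value = value;

    int rc = pthread_mutex_lock(&thr->queue_mtx);
    assert(rc == 0);
    STAILQ_INSERT_TAIL(&thr->queue, job, linkage);
    /* Signal inside the locked region. */
    rc = pthread_cond_signal(&thr->queue_cv);
    assert(rc == 0);
    rc = pthread_mutex_unlock(&thr->queue_mtx);
    assert(rc == 0);
    (void)rc;
}

/* Queue the values 0..njobs-1, stop the consumer, return their sum. */
long run_demo(int njobs)
{
    struct worker w;
    pthread_t tid;
    int rc;

    rc = pthread_mutex_init(&w.queue_mtx, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&w.queue_cv, NULL);
    assert(rc == 0);
    STAILQ_INIT(&w.queue);
    w.sum = 0;

    rc = pthread_create(&tid, NULL, consumer, &w);
    assert(rc == 0);
    for (int i = 0; i < njobs; i++)
        enqueue(&w, i);
    enqueue(&w, -1);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    pthread_cond_destroy(&w.queue_cv);
    pthread_mutex_destroy(&w.queue_mtx);
    (void)rc;
    return w.sum;
}
```

Calling run_demo(100) from a main() and linking with -lpthread should
exercise both sides. With the while loop in place the consumer never
dereferences a NULL job, regardless of whether the signal is sent
inside or outside the lock.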