From owner-freebsd-current Mon Apr 24 5:34:33 2000
Delivered-To: freebsd-current@freebsd.org
Received: from mout0.freenet.de (mout0.freenet.de [194.97.50.131])
    by hub.freebsd.org (Postfix) with ESMTP id 24DAB37B95B
    for ; Mon, 24 Apr 2000 05:34:28 -0700 (PDT)
    (envelope-from netchild@leidinger.net)
Received: from [62.104.201.2] (helo=mx1.freenet.de)
    by mout0.freenet.de with esmtp (Exim 3.14 #3)
    id 12ji4H-0001p8-00; Mon, 24 Apr 2000 14:34:25 +0200
Received: from [213.6.57.244] (helo=Magelan.Leidinger.net)
    by mx1.freenet.de with esmtp (Exim 3.14 #3)
    id 12ji4G-00038D-00; Mon, 24 Apr 2000 14:34:24 +0200
Received: from Leidinger.net (netchild@localhost [127.0.0.1])
    by Magelan.Leidinger.net (8.9.3/8.9.3) with ESMTP id NAA01320;
    Mon, 24 Apr 2000 13:25:42 +0200 (CEST)
    (envelope-from netchild@Leidinger.net)
Message-Id: <200004241125.NAA01320@Magelan.Leidinger.net>
Date: Mon, 24 Apr 2000 13:25:39 +0200 (CEST)
From: Alexander Leidinger
Subject: Re: pthread_cond_broadcast() not delivered
To: eischen@vigrid.com
Cc: current@freebsd.org
In-Reply-To: <200004231809.OAA29230@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On 23 Apr, Daniel Eischen wrote:

>> (14) netchild@ttyp2% uname -a
>> FreeBSD Magelan.Leidinger.net 5.0-CURRENT FreeBSD 5.0-CURRENT #14:
>> Fri Apr 21 17:28:37 CEST 2000 root@:/big/usr/src/sys/compile/WORK
>> i386
>>
>> I've an application which uses pthread_cond_{wait,broadcast}() and
>> the debug output gives me the impression that the broadcast did not
>> get delivered anymore.
>>
>> I run this program only occasionally, but with 4-current (last year)
>> it worked, and I haven't changed anything mutex-/cond-related in it
>> since then.
>>
>> I've attached a short test-prog (1.7k) which shows the same behavior,
>> compile it with "cc -D_THREAD_SAFE -pthread test.c" and run
>> "./a.out".
>
> If you want it to work correctly, you have to make the second thread
> release the mutex. Look at it more closely:
>
> void *
> second_thread(void *arg)
> {
>     /* syncronize */
>     fprintf(stderr, "Second: lock.\n");
>     pthread_mutex_lock(main_mutex);
>
>     fprintf(stderr, "Second: broadcast.\n");
>     pthread_cond_broadcast(main_cond);
>
>     fprintf(stderr, "Second: unlock.\n");
>     pthread_mutex_lock(main_mutex);
>     ^^^^^^^^^^^^^^^^^^
[...]

Yes, sorry, that was a flaw in my test-prog.

And yes, the test-prog works now, but my app still doesn't. I've checked
every lock/unlock against the corresponding fprintf(), and they are
consistent:

---snip---
[prefill buffer 0-14 and start Output-thread]
Decode: (1) lock buffer.
Decode: (2) lock buffer 15.
Decode: before cond_wait.
Output: (1) lock buffer.
Output: before broadcast.
Output: after broadcast.
Output: (2) lock buffer 0.
Output: (3) unlock buffer.
Output: write buffer 0.
Output: (5) unlock buffer 0
Output: (6) lock buffer.
Output: (2) lock buffer 1.
Output: (3) unlock buffer.
Output: write buffer 1.
Output: (5) unlock buffer 1
Output: (6) lock buffer.
Output: (2) lock buffer 2.
Output: (3) unlock buffer.
Output: write buffer 2.
Output: (5) unlock buffer 2
[... buffer 3-13]
Output: (6) lock buffer.
Output: (2) lock buffer 14.
Output: (3) unlock buffer.
Output: write buffer 14.
Output: (5) unlock buffer 14
Output: (6) lock buffer.
Output: (2) lock buffer 15.
[deadlock]
---snip---

(after buf 15 it has to start with buf 0 again).
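For comparison, here is a minimal sketch of a start-up handshake that does not depend on timing. It is not the application's code: struct handshake, the started flag, and both function names are hypothetical. It makes two assumptions that differ from the log above, where Decode takes buffer 15 before cond_wait and never prints "(3) unlock buffer": the wake-up is recorded in a flag guarded by the mutex, and the waiting thread holds no other mutex while it sits in pthread_cond_wait().

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the application's output structure. */
struct handshake {
    pthread_mutex_t mutex;        /* plays the role of output->mutex */
    pthread_cond_t  started_cond; /* plays the role of output->output_startet */
    int             started;      /* assumed predicate, guarded by mutex */
};

/* Output side: record the state change, then wake all waiters. */
static void *output_side(void *arg)
{
    struct handshake *h = arg;

    pthread_mutex_lock(&h->mutex);
    h->started = 1;
    pthread_cond_broadcast(&h->started_cond);
    pthread_mutex_unlock(&h->mutex);
    return NULL;
}

/*
 * Decode side: start the output thread and wait until it has reported in.
 * Returns the value of the flag as seen after the wait.
 */
static int start_and_wait(void)
{
    struct handshake h;
    pthread_t tid;
    int rc, seen;

    pthread_mutex_init(&h.mutex, NULL);
    pthread_cond_init(&h.started_cond, NULL);
    h.started = 0;

    pthread_mutex_lock(&h.mutex);
    rc = pthread_create(&tid, NULL, output_side, &h);
    assert(rc == 0);

    /*
     * Only h.mutex is held here. The loop re-checks the flag after every
     * wake-up, so a spurious wake-up or an early broadcast is harmless.
     */
    while (!h.started)
        pthread_cond_wait(&h.started_cond, &h.mutex);
    seen = h.started;
    pthread_mutex_unlock(&h.mutex);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&h.started_cond);
    pthread_mutex_destroy(&h.mutex);
    return seen;
}
```

Because pthread_cond_wait() has to re-acquire the mutex before it returns, a waiter that holds a second lock the other thread will later need can block that thread indefinitely. Taking the per-buffer lock only after the wait has returned avoids that in this sketch.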
The corresponding code (Decode-thread):

---snip---
#if 1
    fprintf(stderr, "Decode: (1) lock buffer.\n");
#endif
    pthread_mutex_lock(output->mutex);

    /* [create output thread] */

#if 1
    fprintf(stderr, "Decode: (2) lock buffer %d.\n", which_buffer);
#endif
    pthread_mutex_lock(output->buffer[which_buffer].mutex);

#if 1
    fprintf(stderr, "Decode: before cond_wait.\n");
#endif
    pthread_cond_wait(&output->output_startet, output->mutex);

#if 1
    fprintf(stderr, "Decode: (3) unlock buffer.\n");
#endif
    pthread_mutex_unlock(output->mutex);
---snip---

and (Output-thread):

---snip---
#if 1
    fprintf(stderr, "Output: (1) lock buffer.\n");
#endif
    pthread_mutex_lock(output->mutex);

    /* we are in sync, awake it */
#if 1
    fprintf(stderr, "Output: before broadcast.\n");
#endif
    ret = pthread_cond_broadcast(&output->output_startet);
#if 1
    fprintf(stderr, "Output: after broadcast.\n");
#endif

    while((output->num_bytes == 0) || (output->num_bytes > bytes_written)) {
#if 1
        fprintf(stderr, "Output: (2) lock buffer %d.\n", which_buffer);
#endif
        pthread_mutex_lock(output->buffer[which_buffer].mutex);

#if 1
        fprintf(stderr, "Output: (3) unlock buffer.\n");
#endif
        pthread_mutex_unlock(output->mutex);
---snip---

Everything looks fine here, and it worked a while ago. The only code
change is in the "Output: write buffer %d" part.

I'm now under the impression that the Output-thread locks and unlocks
output->mutex so quickly that the Decode-thread never manages to get the
lock (after restarting the app a few times it sometimes works, so it
seems to be timing related).

I'll replace the lock in the "Output: (2) lock buffer %d" part with a
trylock, usleep() a little if it returns EBUSY, and see how that works.

Sorry to have bothered the list with it,
Alexander.

-- 
It is easier to fix Unix than to live with NT.
http://www.Leidinger.net                Alexander+Home @ Leidinger.net
GPG fingerprint = 7423 F3E6 3A7E B334 A9CC B10A 1F5F 130A A638 6E7E
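
A minimal sketch of the trylock idea from the message body, for illustration only. The helper name lock_with_backoff and its max_tries and pause_us parameters are hypothetical and not taken from the application. It assumes a default (non-recursive) mutex and gives up after a bounded number of attempts instead of retrying forever.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to take *m, sleeping pause_us microseconds
 * after every EBUSY, for at most max_tries attempts.
 * Returns 0 with the mutex held, EBUSY if it stayed busy for every
 * attempt, or any other error code from pthread_mutex_trylock().
 */
static int lock_with_backoff(pthread_mutex_t *m, int max_tries,
                             unsigned int pause_us)
{
    int i, rc = EBUSY;

    assert(m != NULL);
    for (i = 0; i < max_tries; i++) {
        rc = pthread_mutex_trylock(m);
        if (rc != EBUSY)
            return rc;          /* got the lock, or a real error */
        usleep(pause_us);       /* give the other thread time to run */
    }
    return rc;                  /* still EBUSY after max_tries attempts */
}
```

A caller would check the return value and, on EBUSY, decide whether to retry or to release the locks it already holds before trying again. The sleep only makes it more likely that the other thread gets scheduled; it does not guarantee it.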