Date: Fri, 27 Jun 2003 00:26:33 -0400 (EDT)
From: Daniel Eischen
Reply-To: deischen@freebsd.org
To: Mike Makonnen
cc: freebsd-threads@freebsd.org
cc: Marcel Moolenaar
Subject: Re: libkse / libthr bugs?
List-Id: Threading on FreeBSD

On Thu, 26 Jun 2003, Mike Makonnen wrote:

> On Thu, 26 Jun 2003 17:39:22 -0700
> Marcel Moolenaar wrote:
>
> > On Thu, Jun 26, 2003 at 08:21:54PM -0400, Mike Makonnen wrote:
> > > >
> > > > % ./foo2 1000
> > > > [very long list of random "bar #"
> > > > :
> > > > bar 999
> > > > bar 226
> > > > bar 244
> > > > Thread (_thread_initial:0) already on mutexq
> > > > Fatal error 'Illegal call from signal handler' at line 1347 in file
> > > > /nfs/freebsd/5.x/src/lib/libthr/thread/thr_mutex.c (errno = 0)
> > >
> > > Great! I've been waiting for this message to appear for some time. Do you
> > > have a backtrace by any chance?
> >
> > gdb(1) hasn't been ported yet, so no.
> > BTW: The thread is not always _thread_initial...
>
> hmm.. ok if it's not always thread initial, that's gonna make it a bit harder.
> Can you try this quick hack:
> http://people.freebsd.org/~mtm/patches/libthr.sigblock.diff
>
> It simply blocks all signals while libthr holds a lock. I'd be
> interested in knowing whether you still get the same errors.

I haven't looked at libthr much, so perhaps this doesn't apply...

Signal handling and locks (low-level, CV, mutex, etc) are somewhat
difficult to deal with, especially when there are mutexes in libc
that the application doesn't even see. In general, signals can't be
deferred around big locks (mutexes, CVs, rwlocks, etc), but may be
deferred around low-level locks (which is what I think your patch
is doing).

It is valid for a thread blocked in pthread_mutex_lock(),
pthread_cond_timedwait(), etc, to receive a signal. The signal
handler should run, but those functions should not return EINTR;
they should continue blocking when the handler returns. It is also
valid for a thread to be blocked in fwrite() (or some other libc
function that has locking) and receive a signal.

In either case, you also have to handle the thread _not_ returning
normally; it could [sig]longjmp() or setcontext() out of the locked
area. So if you are keeping any internal queues for mutexes, CVs,
etc, you have to ensure the thread is removed from the queue before
the signal handler is invoked, and then reinsert the thread into
the queue if the signal handler returns normally.

The error message that Marcel posted (already on mutexq) reminds
me of similar problems long ago in libc_r.

-- 
Dan Eischen
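
For illustration only, the rule described above (a handler runs while a thread is blocked in pthread_mutex_lock(), and the call then keeps blocking instead of returning EINTR) might be exercised with a small POSIX C program like the one below. This is a sketch, not code from libthr, libkse, or the patch under discussion. The names demo_lock, on_usr1, waiter, short_sleep, and run_demo are hypothetical, and the fixed sleeps are only a crude way to order the events.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/* All names here are hypothetical; nothing below is taken from libthr. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static volatile sig_atomic_t handler_ran = 0;
static int lock_result = -1;       /* return value of the blocked lock call */
static int handler_ran_first = 0;  /* had the handler run when the lock returned? */

/* Signal handler: only records that it was invoked. */
static void on_usr1(int signo)
{
    (void)signo;
    handler_ran = 1;
}

/* Crude ordering aid: sleep for roughly 100 ms. */
static void short_sleep(void)
{
    struct timespec ts = { 0, 100 * 1000 * 1000 };
    nanosleep(&ts, NULL);
}

/* Blocks on a mutex that the main thread already holds. */
static void *waiter(void *arg)
{
    (void)arg;
    lock_result = pthread_mutex_lock(&demo_lock);
    handler_ran_first = handler_ran;
    if (lock_result == 0)
        pthread_mutex_unlock(&demo_lock);
    return NULL;
}

/* Returns 0 if the handler ran and the lock call still returned 0. */
int run_demo(void)
{
    struct sigaction sa;
    pthread_t tid;
    int rc;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    pthread_mutex_lock(&demo_lock);        /* make the waiter block */
    rc = pthread_create(&tid, NULL, waiter, NULL);
    assert(rc == 0);

    short_sleep();                         /* let the waiter reach the lock */
    pthread_kill(tid, SIGUSR1);            /* signal the waiter thread only */
    short_sleep();                         /* let the handler run */

    pthread_mutex_unlock(&demo_lock);      /* now let the waiter proceed */
    pthread_join(tid, NULL);

    return (lock_result == 0 && handler_ran_first) ? 0 : 1;
}
```

Calling run_demo() from main() is expected to return 0 on a conforming implementation. The sketch says nothing about the harder case raised above, where the handler leaves via [sig]longjmp() or setcontext(); that is exactly where a library's internal wait queues have to be cleaned up before the handler is invoked.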