Date: Fri, 27 Jun 2003 00:26:33 -0400 (EDT)
From: Daniel Eischen <eischen@vigrid.com>
To: Mike Makonnen <mtm@identd.net>
Cc: Marcel Moolenaar <marcel@xcllnt.net>
Subject: Re: libkse / libthr bugs?
Message-ID: <Pine.GSO.4.10.10306270005060.22365-100000@pcnet5.pcnet.com>
In-Reply-To: <20030627012456.YLSF27254.pop017.verizon.net@kokeb.ambesa.net>

On Thu, 26 Jun 2003, Mike Makonnen wrote:

> On Thu, 26 Jun 2003 17:39:22 -0700
> Marcel Moolenaar <marcel@xcllnt.net> wrote:
>
> > On Thu, Jun 26, 2003 at 08:21:54PM -0400, Mike Makonnen wrote:
> > > >
> > > > % ./foo2 1000
> > > > [very long list of random "bar #"
> > > > :
> > > > bar 999
> > > > bar 226
> > > > bar 244
> > > > Thread (_thread_initial:0) already on mutexq
> > > > Fatal error 'Illegal call from signal handler' at line 1347 in file
> > > > /nfs/freebsd/5.x/src/lib/libthr/thread/thr_mutex.c (errno = 0)
> > >
> > > Great! I've been waiting for this message to appear for some time. Do you
> > > have a backtrace by any chance?
> >
> > gdb(1) hasn't been ported yet, so no.
> > BTW: The thread is not always _thread_initial...
>
> hmm.. ok if it's not always thread initial, that's gonna make it a bit harder.
> Can you try this quick hack:
> http://people.freebsd.org/~mtm/patches/libthr.sigblock.diff
>
> It simply blocks all signals while libthr holds a lock. I'd be
> interested in knowing whether you still get the same errors.

I haven't looked at libthr much, so perhaps this doesn't apply...

Signal handling and locks (low-level, CV, mutex, etc) are somewhat
difficult to deal with, especially when there are mutexes in libc
that the application doesn't even see. In general, signals can't
be deferred around big locks (mutexes, CVs, rwlocks, etc), but may
be deferred around low-level locks (which is what I think your patch
is doing).

It is valid for a thread that is blocked in pthread_mutex_lock(),
pthread_cond_timedwait(), etc, to receive a signal. The signal
handler should run, but those functions should not return EINTR;
they should continue blocking when the handler returns. It is also
valid for a thread to be blocked in fwrite() (or some other libc
function that has locking) and receive a signal. In either case,
you also have to handle the thread _not_ returning normally; it
could [sig]longjmp() or setcontext() out of the locked area.

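To make the pthread_mutex_lock() case concrete, here is a minimal
sketch, not libthr code. It parks a thread in pthread_mutex_lock(),
sends it SIGUSR1, and reports what the lock call finally returned.
The names on_usr1, waiter and blocked_lock_survives_signal are made
up for illustration, SIGUSR1 is an arbitrary choice, and the 100 ms
pause is only a crude way to let the waiter reach the lock first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static volatile sig_atomic_t handler_ran = 0;
static int lock_result = -1;    /* written by waiter, read after join */

static void on_usr1(int signo) { (void)signo; handler_ran = 1; }

static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static void *waiter(void *arg)
{
    (void)arg;
    /* Blocks here: the main thread already holds the mutex. */
    lock_result = pthread_mutex_lock(&lock);
    if (lock_result == 0)
        pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns what pthread_mutex_lock() returned in the signalled thread. */
int blocked_lock_survives_signal(void)
{
    struct sigaction sa;
    pthread_t tid;
    int i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    pthread_mutex_lock(&lock);
    if (pthread_create(&tid, NULL, waiter, NULL) != 0) {
        pthread_mutex_unlock(&lock);
        return -1;
    }
    sleep_ms(100);                      /* crude: let waiter block */
    pthread_kill(tid, SIGUSR1);         /* handler runs in waiter */
    for (i = 0; i < 5000 && !handler_ran; i++)
        sleep_ms(1);                    /* bounded wait for handler */

    pthread_mutex_unlock(&lock);        /* now let the waiter proceed */
    pthread_join(tid, NULL);
    return lock_result;
}
```

Calling blocked_lock_survives_signal() from main() should give 0
with handler_ran set: the handler ran while the thread was waiting,
and the lock call still completed normally instead of failing with
EINTR.
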
So if you are keeping any internal queues for mutexes, CVs, etc,
you have to ensure the thread is removed from the queue before the
signal handler is invoked, and then reinsert the thread into the
queue if the signal handler returns normally. The error message
that Marcel posted (already on mutexq) reminds me of similar
problems long ago in libc_r.

-- 
Dan Eischen
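
The remove-then-reinsert rule for internal queues described in the
message could be sketched roughly as follows. This is an assumption
about how such bookkeeping might look, not how libthr or libc_r does
it: struct waiter, waitq_insert, waitq_remove, run_handler_dequeued
and waiter_queued_after_handler are all hypothetical names, and a
plain longjmp() stands in for a handler that does not return.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <sys/queue.h>

struct waiter {
    int id;
    int on_queue;                   /* 1 while linked into a waitq */
    TAILQ_ENTRY(waiter) link;
};
TAILQ_HEAD(waitq, waiter);

static void waitq_insert(struct waitq *q, struct waiter *w)
{
    assert(!w->on_queue);           /* would catch "already on mutexq" */
    TAILQ_INSERT_TAIL(q, w, link);
    w->on_queue = 1;
}

static void waitq_remove(struct waitq *q, struct waiter *w)
{
    assert(w->on_queue);
    TAILQ_REMOVE(q, w, link);
    w->on_queue = 0;
}

/* Take the waiter off the queue, run the handler, and put the waiter
 * back only if the handler comes back. */
static void run_handler_dequeued(struct waitq *q, struct waiter *w,
                                 void (*handler)(int), int signo)
{
    int was_queued = w->on_queue;

    if (was_queued)
        waitq_remove(q, w);
    handler(signo);                 /* may longjmp() and never return */
    if (was_queued)
        waitq_insert(q, w);
}

static jmp_buf escape;
static void returning_handler(int signo) { (void)signo; }
static void jumping_handler(int signo) { (void)signo; longjmp(escape, 1); }

/* Returns 1 if the waiter is still queued after the handler, else 0. */
int waiter_queued_after_handler(int handler_jumps)
{
    /* static, so the values are well defined after longjmp() */
    static struct waitq q;
    static struct waiter w;

    TAILQ_INIT(&q);
    w.id = 1;
    w.on_queue = 0;
    waitq_insert(&q, &w);

    if (setjmp(escape) == 0)
        run_handler_dequeued(&q, &w,
            handler_jumps ? jumping_handler : returning_handler, 0);

    assert(w.on_queue == !TAILQ_EMPTY(&q));   /* flag matches queue */
    return w.on_queue;
}
```

With a handler that returns, the waiter ends up back on the queue
(waiter_queued_after_handler(0) gives 1). With one that jumps out,
the queue is left empty (waiter_queued_after_handler(1) gives 0), so
a later lock attempt by the same thread does not find it already
queued.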