Date:      Fri, 27 Jun 2003 00:26:33 -0400 (EDT)
From:      Daniel Eischen <eischen@vigrid.com>
To:        Mike Makonnen <mtm@identd.net>
Cc:        Marcel Moolenaar <marcel@xcllnt.net>
Subject:   Re: libkse / libthr bugs?
Message-ID:  <Pine.GSO.4.10.10306270005060.22365-100000@pcnet5.pcnet.com>
In-Reply-To: <20030627012456.YLSF27254.pop017.verizon.net@kokeb.ambesa.net>

On Thu, 26 Jun 2003, Mike Makonnen wrote:

> On Thu, 26 Jun 2003 17:39:22 -0700
> Marcel Moolenaar <marcel@xcllnt.net> wrote:
> 
> > On Thu, Jun 26, 2003 at 08:21:54PM -0400, Mike Makonnen wrote:
> > > > 
> > > > % ./foo2 1000
> > > > [very long list of random "bar #"
> > > >  :
> > > > bar 999
> > > > bar 226
> > > > bar 244
> > > > Thread (_thread_initial:0) already on mutexq
> > > > Fatal error 'Illegal call from signal handler' at line 1347 in file
> > > > /nfs/freebsd/5.x/src/lib/libthr/thread/thr_mutex.c (errno = 0)
> > > 
> > > Great! I've been waiting for this message to appear for some time. Do you
> > > have a backtrace by any chance?
> > 
> > gdb(1) hasn't been ported yet, so no.
> > BTW: The thread is not always _thread_initial...
> 
> hmm.. ok if it's not always thread initial, that's gonna make it a bit harder.
> Can you try this quick hack:
> http://people.freebsd.org/~mtm/patches/libthr.sigblock.diff
> 
> It simply blocks all signals while libthr holds a lock. I'd be
> interested in knowing whether you still get the same errors.

I haven't looked at libthr much, so perhaps this doesn't apply...

Signal handling and locks (low-level, CV, mutex, etc) are
somewhat difficult to deal with, especially when there are
mutexes in libc that the application doesn't even see.

In general, signals can't be deferred around big locks
(mutexes, CVs, rwlocks, etc), but they may be deferred around
low-level locks (which is what I think your patch is doing).
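
For illustration, here is a minimal sketch of deferring signals around a
short low-level lock. It is not taken from the patch; the lock, the
counter and the function name are all made up for the example, and a
plain pthread mutex stands in for whatever low-level lock the library
really uses. The signal raised inside the locked region stays pending
and is only delivered when the old mask is restored.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for a library-internal low-level lock. */
static pthread_mutex_t lowlevel_lock = PTHREAD_MUTEX_INITIALIZER;
static volatile sig_atomic_t handler_runs;

static void count_usr1(int sig)
{
    (void)sig;
    handler_runs++;
}

/*
 * Take the lock with all signals blocked, raise SIGUSR1 while holding
 * it, and return how many handler runs were observed before unlocking.
 */
int signal_deferred_while_locked(void)
{
    struct sigaction sa;
    sigset_t all, saved;
    int seen_inside;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = count_usr1;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    sigfillset(&all);
    pthread_sigmask(SIG_SETMASK, &all, &saved);  /* defer signals */
    pthread_mutex_lock(&lowlevel_lock);
    raise(SIGUSR1);                  /* blocked, so it stays pending */
    seen_inside = handler_runs;      /* handler has not run yet */
    pthread_mutex_unlock(&lowlevel_lock);
    pthread_sigmask(SIG_SETMASK, &saved, NULL);  /* pending signal delivered */
    return seen_inside;
}
```

This only works because the locked region is short and never blocks for
long; the same trick around an application-visible mutex would delay
signals for as long as the thread waits.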
It is valid for a thread that is blocked in
pthread_mutex_lock(), pthread_cond_timedwait(), etc, to
receive a signal.  The signal handler should run, but
those functions should not return EINTR; they should
continue blocking when the handler returns.
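
A small test along these lines shows the expected behaviour. This is
only a sketch: all the demo_* names are invented for the example, and
the short sleep is a crude way to give the waiter time to block, not a
guarantee that it has. The main thread holds the mutex, signals the
waiter, and only unlocks after the handler has run; the waiter's
pthread_mutex_lock() should then return 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static volatile sig_atomic_t demo_handler_ran;
static int demo_lock_result = -1;

static void demo_on_usr1(int sig)
{
    (void)sig;
    demo_handler_ran = 1;
}

static void demo_nap(void)
{
    struct timespec ts = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    nanosleep(&ts, NULL);
}

static void *demo_waiter(void *arg)
{
    (void)arg;
    /* Blocks here: the main thread holds demo_mutex. */
    demo_lock_result = pthread_mutex_lock(&demo_mutex);
    if (demo_lock_result == 0)
        pthread_mutex_unlock(&demo_mutex);
    return NULL;
}

/* Returns what pthread_mutex_lock() returned in the signalled thread. */
int signal_blocked_locker(void)
{
    struct sigaction sa;
    pthread_t waiter;
    int i;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = demo_on_usr1;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    pthread_mutex_lock(&demo_mutex);
    pthread_create(&waiter, NULL, demo_waiter, NULL);
    for (i = 0; i < 10; i++)
        demo_nap();                   /* let the waiter reach the lock */
    pthread_kill(waiter, SIGUSR1);    /* handler runs in the waiter */
    for (i = 0; i < 500 && !demo_handler_ran; i++)
        demo_nap();                   /* bounded wait for the handler */
    pthread_mutex_unlock(&demo_mutex);
    pthread_join(waiter, NULL);
    return demo_lock_result;
}
```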

It is also valid for a thread to be blocked in fwrite()
(or some other libc function that has locking) and
receive a signal.

In either case, you also have to handle the thread
_not_ returning normally; it could [sig]longjmp() or
setcontext() out of the locked area.  So if you are keeping
any internal queues for mutexes, CVs, etc, you have
to ensure the thread is removed from the queue before
the signal handler is invoked, and then reinsert the
thread back into the queue if the signal handler
returns normally.
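
Roughly, the bookkeeping looks like the sketch below. It is a
single-threaded model, not libthr or libc_r code: the sk_* types, the
queue helpers and run_handler() are hypothetical, the handler is called
directly instead of through real signal delivery, and a real library
would do the queue manipulation under its own low-level lock. The
assert in the insert helper plays the role of the "already on mutexq"
check.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical per-thread record and wait queue. */
struct sk_thread {
    int on_waitq;
    TAILQ_ENTRY(sk_thread) link;
};
TAILQ_HEAD(sk_waitq, sk_thread);

static void waitq_insert(struct sk_waitq *q, struct sk_thread *td)
{
    assert(!td->on_waitq);      /* would be "already on mutexq" */
    TAILQ_INSERT_TAIL(q, td, link);
    td->on_waitq = 1;
}

static void waitq_remove(struct sk_waitq *q, struct sk_thread *td)
{
    assert(td->on_waitq);
    TAILQ_REMOVE(q, td, link);
    td->on_waitq = 0;
}

static sigjmp_buf escape_env;

static void returning_handler(int sig) { (void)sig; }
static void jumping_handler(int sig) { (void)sig; siglongjmp(escape_env, 1); }

/* Take the thread off the queue before the handler runs; put it back
 * only if the handler returns normally.  Note that reinserting at the
 * tail loses the thread's original position in the queue. */
void run_handler(struct sk_waitq *q, struct sk_thread *td,
    void (*handler)(int), int sig)
{
    int was_queued = td->on_waitq;

    if (was_queued)
        waitq_remove(q, td);
    handler(sig);               /* may siglongjmp() and never return */
    if (was_queued)
        waitq_insert(q, td);
}

/* Returns 1 if the thread is on the queue after the handler ran. */
int queued_after_handler(void (*handler)(int))
{
    /* static, so the values are well defined after siglongjmp() */
    static struct sk_waitq q;
    static struct sk_thread td;

    TAILQ_INIT(&q);
    td.on_waitq = 0;
    waitq_insert(&q, &td);
    if (sigsetjmp(escape_env, 0) == 0)
        run_handler(&q, &td, handler, SIGUSR1);
    return td.on_waitq;
}
```

With returning_handler the thread ends up back on the queue; with
jumping_handler it is left off the queue, so a later lock attempt by
the same thread does not find a stale entry.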

The error message that Marcel posted (already on mutexq)
reminds me of similar problems long ago in libc_r.

-- 
Dan Eischen


