From: Daniel Eischen
Date: Thu, 28 Oct 2004 16:27:56 -0400 (EDT)
To: John Baldwin
cc: threads@freebsd.org
In-Reply-To: <200410281554.07222.jhb@FreeBSD.org>
Subject: Re: Infinite loop bug in libc_r on 4.x with condition variables and signals

On Thu, 28 Oct 2004, John Baldwin wrote:

> On Wednesday 27 October 2004 06:30 pm, Daniel Eischen wrote:
> > On Wed, 27 Oct 2004, John Baldwin wrote:
> > >
> > > FWIW, we are having (I think) the same problem on 5.3 with libpthread.
> > > The panic there is in the mutex code about an assertion failing because a
> > > thread is on a syncq when it is not supposed to be.
> >
> > David and I recently fixed some races in pthread_join() and
> > pthread_exit() in -current libpthread.  Don't know if those
> > were responsible...
> >
> > Here's a test program that shows correct behavior with both
> > libc_r and libpthread in -current.
>
> We've started testing on -current and are seeing several problems with
> libpthread.  Using a UP kernel (machines have single processor with HTT)
> seems to make it better, but we seem to be getting SIG 11's in
> pthread_testcancel() as well as the failed lock assertions that were
> mentioned earlier on the list in the PR.  Just running monodevelop from the
> bsd-sharp stuff mentioned earlier can break in that one of the processes dies
> with the assertion failure.  If you let the other processes run, then you can
> run it again and get the window to pop up, but then clicking on any of the
> controls results in the pthread_testcancel() crash.  FWIW, I think the reason
> that the stack traces look weird in the PR's thread may be due to catching a
> signal.  When we were looking at the problems with libc_r on 4.x we would get
> some weird looking backtraces sometimes when the assertion in uthread_sig.c
> that I added failed.  Seems that gdb doesn't handle the signal frames very
> well.

You also want to make sure you're not running out of stack space
for your threads.  Is the code trying to install signal frames on
threads itself?  That could cause the problems you are seeing.

-- 
Dan Eischen
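
One way to rule out the stack-space possibility raised above is to create
the suspect threads with an explicitly sized stack and see whether the
crashes change.  What follows is only a minimal sketch using the standard
POSIX attribute calls; worker(), run_with_stack() and the buffer size are
hypothetical and do not come from the application under discussion.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical worker: uses a large local buffer, the kind of
 * allocation that can overflow a small thread stack.
 */
static void *
worker(void *arg)
{
	char buf[256 * 1024];		/* 256 KB of stack use */

	memset(buf, 0, sizeof(buf));
	/* Report back that the buffer was usable end to end. */
	*(int *)arg = (buf[sizeof(buf) - 1] == 0);
	return (NULL);
}

/*
 * Hypothetical helper: run worker() on a thread whose stack is
 * stack_size bytes.  Returns 0 on success or the error number
 * reported by the failing pthread call.
 */
static int
run_with_stack(size_t stack_size, int *result)
{
	pthread_attr_t attr;
	pthread_t tid;
	int rc;

	rc = pthread_attr_init(&attr);
	if (rc != 0)
		return (rc);
	rc = pthread_attr_setstacksize(&attr, stack_size);
	if (rc == 0)
		rc = pthread_create(&tid, &attr, worker, result);
	pthread_attr_destroy(&attr);
	if (rc != 0)
		return (rc);
	return (pthread_join(tid, NULL));
}
```

Calling run_with_stack(1024 * 1024, &ok) gives the worker a 1 MB stack.
If a larger stack makes the failures go away, stack exhaustion becomes
the more likely cause; if not, the signal-frame question is the next
thing to look at.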