Date: Wed, 4 Jun 2003 23:09:41 -0400 (EDT)
From: Daniel Eischen <eischen@pcnet.com>
To: Thomas Moestl <t.moestl@tu-bs.de>
Cc: Kris Kennaway <kris@obsecurity.org>
Subject: Re: phoenix crash in libc_r on sparc64
Message-ID: <Pine.GSO.4.10.10306042301120.6015-100000@pcnet5.pcnet.com>
In-Reply-To: <20030604235607.GA682@crow.dom2ip.de>

On Thu, 5 Jun 2003, Thomas Moestl wrote:

> On Wed, 2003/06/04 at 00:30:36 -0700, Kris Kennaway wrote:
> > On Mon, Jun 02, 2003 at 04:15:43PM -0700, Kris Kennaway wrote:
> > > phoenix on my sparc64 crashed while idle with the following:
> > >
> > > Fatal error '_waitq_insert: Already in queue' at line 321 in file /usr/src/lib/libc_r/uthread/uthread_priority_queue.c (errno = 2)
> > >
> > > Any ideas?
>
> It should have dropped a core - can you please take a look at it with
> gdb?
>
> > One of the libc_r tests seems to hang:
> >
> > Test static library:
> > --------------------------------------------------------------------------
> > Test                                  c_user   c_system   c_total     chng
> >  passed/FAILED                        h_user   h_system   h_total   % chng
> > --------------------------------------------------------------------------
> > hello_d                                 0.00       0.02      0.02
> >  passed
> > --------------------------------------------------------------------------
> > hello_s                                 0.00       0.02      0.02
> >  passed
> > --------------------------------------------------------------------------
> > join_leak_d                             0.77       0.18      0.95
> >  passed
> > --------------------------------------------------------------------------
> > mutex_d                                 9.08      92.42    101.50
> >  passed
> > --------------------------------------------------------------------------
> > sem_d                                   0.01       0.02      0.02
> >  passed
> > --------------------------------------------------------------------------
> > sigsuspend_d                            0.00       0.02      0.02
> >  passed
> > --------------------------------------------------------------------------
> > sigwait_d                               0.00       0.02      0.02
> >  *** FAILED *** This one is suppose to kill the process at the end.
> > --------------------------------------------------------------------------
> > guard_s.pl
> >
> > It's been sitting there for hours now.
>
> This is an unfortunate failure mode, which is caused by a fault on the
> stack while all signals are masked (by libc_r internals, I assume);
> the kernel will fail to store the user register windows on the stack,
> and because SIGILL is blocked, it cannot notify (or terminate) the
> process and is stuck trying to copy out the register windows over and
> over.
>
> > P.S. Why do 3 of the tests even fail on i386?
>
> The guard test includes constants which are machine- and
> compiler-specific, probably this broke due to a gcc upgrade.
>
> The sigwait test is killed by its own SIGUSR1, and this behaviour
> actually looks correct to me (but I could easily be wrong, since the
> signal behaviour of pthreads seems to be quite complex).

Right, that is part of the test. I guess the expect script
doesn't know that though.

> The propagate test failure is due to problems in libc (failing to
> use the underscored versions of functions overridden in libc_r). The
> attached patch should fix that; Daniel, does this look OK to you?

Yes, if those functions are used in libc, then that is what
[un-]namespace.h is for. For any function overridden in libc_r,
libc must use the single-underscore version, so that libc_r won't
introduce cancellation points in places where there shouldn't
be any, or invoke signal handlers while a library-private lock
is held.

-- 
Dan Eischen
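
For readers unfamiliar with the convention discussed above, here is a minimal sketch of why library-internal code calls the underscored primitive rather than the public, overridden name. Everything in it is an assumption made for illustration: raw_demo_write stands in for a single-underscore function such as _write, demo_write stands in for the wrapper a thread library might supply, and the counter and lock flag are hypothetical bookkeeping. None of this is taken from libc, libc_r, or namespace.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; this is not libc or libc_r source. */
int cancel_checks;      /* how many times a cancellation point was reached */
int private_lock_held;  /* 1 while a library-private lock is held */
char sink[64];          /* stand-in for an output device */
size_t sink_len;

/* Plays the role of an underscored primitive such as _write:
 * it does the work and is never a cancellation point. */
size_t raw_demo_write(const char *buf, size_t len)
{
    if (len > sizeof(sink) - sink_len)
        len = sizeof(sink) - sink_len;
    memcpy(sink + sink_len, buf, len);
    sink_len += len;
    return len;
}

/* Plays the role of the public name as a thread library might
 * override it: a cancellation point wrapped around the primitive. */
size_t demo_write(const char *buf, size_t len)
{
    /* Reaching a cancellation point under a private lock is the bug
     * the convention is meant to prevent. */
    assert(!private_lock_held);
    cancel_checks++;
    return raw_demo_write(buf, len);
}

/* Library-internal caller: holds a private lock, so it must use the
 * primitive and not the overridden public wrapper. */
size_t demo_lib_log(const char *msg)
{
    size_t n;

    private_lock_held = 1;
    n = raw_demo_write(msg, strlen(msg));
    private_lock_held = 0;
    return n;
}
```

Calling demo_lib_log leaves cancel_checks untouched, while a direct call to demo_write increments it. If demo_lib_log called demo_write instead, the assertion would fire, which is the toy analogue of a cancellation point or signal handler running while a library-private lock is held.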