Date: Fri, 18 Dec 2009 10:09:51 -0500
From: John Baldwin <jhb@freebsd.org>
To: freebsd-stable@freebsd.org
Cc: freebsd-hackers@freebsd.org, Steven Hartland <killing@multiplay.co.uk>
Subject: Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?
Message-ID: <200912181009.51798.jhb@freebsd.org>
In-Reply-To: <28F90357192743E085ABEE7CD4C9FDF9@multiplay.co.uk>
References: <DD0B1DB4EEAE4FB49FFFE1FDF5E9D7E3@multiplay.co.uk> <200912170908.49119.jhb@freebsd.org> <28F90357192743E085ABEE7CD4C9FDF9@multiplay.co.uk>

On Thursday 17 December 2009 12:27:17 pm Steven Hartland wrote:
> ----- Original Message -----
> From: "John Baldwin" <jhb@freebsd.org>
> > For the hang it seems you have a thread waiting in a blocking read(), a thread
> > waiting in a blocking accept(), and lots of threads creating condition
> > variables. However, the pthread_cond_init() in libpthread (libthr on FreeBSD)
> > doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to
> > me. However, that may be gdb getting confused. The pthread_cleanup_push()
> > frame may be cond_init(). However, it doesn't call umtx_op() (the
> > _thr_umutex_init() call it makes just initializes the structure, it doesn't
> > make a _umtx_op() system call). You might try posting on threads@ to try to
> > get more info on this, but your pthread_cond_init() stack traces don't really
> > make sense. Can you rebuild libc and libthr with debug symbols?
> >
> > For example:
> >
> > # cd /usr/src/lib/libc
> > # make clean
> > # make DEBUG_FLAGS=-g
> > # make DEBUG_FLAGS=-g install
> >
> > However, if you are hanging in read(), that usually means you have a socket
> > that just doesn't have data. That might be an application bug of some sort.
> >
> > The segv trace doesn't include the first part of GDB messages which show which
> > thread actually had a seg fault. It looks like it was the thread that was
> > throwing an exception. However, nanosleep() doesn't throw exceptions, so that
> > stack trace doesn't really make sense either. Perhaps that stack is hosed by
> > the exception handling code?
>
> I've uploaded two more traces for the oxt test failure / segv.
> http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1
>
> From looking at the test case, it is testing the capture of failures and its
> ability to create stack trace output, so that may give others some indication
> of where the issue may be?
>
> I will look to do the same for the hang issue, but that's on a live site so
> I will need to schedule some downtime before I can get those rebuilt and then
> wait for it to hang again, which could be quite some time :(

Hmmm, the only seg fault I see is happening down inside libgcc in the stack
unwinding code, and that is 3rd party code from gcc.

-- 
John Baldwin
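
As an aside on the point quoted above, that a thread hanging in read() usually just has a socket with no data: the behaviour can be sketched outside Passenger with a local socket pair. This is only an illustration under stated assumptions, not the code from the report. The helper names would_block_on_read() and demo() are hypothetical, and select() with a short timeout stands in for the blocking read() so the example returns instead of hanging.

```python
import select
import socket


def would_block_on_read(sock, timeout=0.1):
    """Return True if no data arrives on sock within `timeout` seconds.

    Hypothetical helper: a plain blocking read() on such a socket would
    simply wait, which is what a "hang in read()" looks like in a trace.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    return not readable


def demo():
    # Two connected local sockets; nothing has been written yet.
    a, b = socket.socketpair()
    try:
        before = would_block_on_read(a)  # peer is silent, so read() would wait
        b.sendall(b"ping")               # peer finally sends something
        after = would_block_on_read(a)   # data is queued, read() returns at once
        data = a.recv(4)
        return before, after, data
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

If the peer never writes, the reader stays parked indefinitely, so a thread sitting in read() is not by itself evidence of a kernel or threading bug.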