Date: Thu, 18 Sep 2008 17:35:29 -0400
From: John Baldwin <jhb@freebsd.org>
To: freebsd-current@freebsd.org
Cc: David Naylor <naylor.b.david@gmail.com>
Subject: Re: FreeBSD deadlock (with fork?)
Message-ID: <200809181735.29504.jhb@freebsd.org>
In-Reply-To: <200809180631.47071.naylor.b.david@gmail.com>
References: <200809180631.47071.naylor.b.david@gmail.com>

On Thursday 18 September 2008 12:31:42 am David Naylor wrote:
> Hi,
>
> I have a program that spawns a lot of subprocesses (with pipes open) from
> multiple threads. The problem is the program often deadlocks, but not
> consistently. Sometimes the program can run over 5 times to competition
> without incidence and yet othertimes it locks within a few seconds.
>
> However if I limit the thread count to 1 the problem does not appear to be
> present.
>
> Here are the logs from gdb:
> (gdb) info thread
>   5 Thread 7021c0 (LWP 100203) 0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>   4 Thread a28480 (LWP 100174) 0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>   3 Thread a61d80 (LWP 100175) 0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>   2 Thread a61bc0 (LWP 100176) 0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> * 1 Thread a61840 (LWP 100177) 0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>
>
> (gdb) bt
> #0 0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> #1 0x00000008009a1331 in cond_wait_common (cond=Variable "cond" is not
> available.

This is not waiting on a lock; this is a pthread_cond_wait() of some sort.

> (gdb) thr 2
> [Switching to thread 2 (Thread a61bc0 (LWP 100176))]#0 0x00000008009a2e8c in
> _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 37 RSYSCALL_ERR(_umtx_op)
> (gdb) bt
> #0 0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> #1 0x00000008009a1331 in cond_wait_common (cond=Variable "cond" is not
> available.

Similarly here. I don't think you have a deadlock. I think you have a bug
where you are missing a pthread_cond_signal() or broadcast or some such.
Or maybe you aren't holding the mutex when doing the signal or broadcast.

-- 
John Baldwin
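
The two suspected bugs, a missing signal and a signal sent without the mutex held, both show up as threads parked forever in cond_wait_common. As a rough illustration only, here is a minimal sketch of the usual condition-variable handoff in C. The struct, the field names and run_handoff() are hypothetical and do not come from the original program; the sketch assumes a single producer thread and a single waiter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state; none of these names come from the original program. */
struct handoff {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;  /* predicate, only read or written with lock held */
    int             value;
};

static void *producer(void *arg)
{
    struct handoff *h = arg;

    pthread_mutex_lock(&h->lock);
    h->value = 42;
    h->ready = true;                /* change the predicate under the mutex */
    pthread_cond_signal(&h->cond);  /* signal while still holding the mutex */
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* Runs one producer/consumer handoff and returns the value received. */
int run_handoff(void)
{
    struct handoff h = { .ready = false, .value = 0 };
    pthread_t tid;
    int value;

    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.cond, NULL);

    int rc = pthread_create(&tid, NULL, producer, &h);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&h.lock);
    while (!h.ready)                /* re-check: the signal may already have been sent */
        pthread_cond_wait(&h.cond, &h.lock);
    value = h.value;
    pthread_mutex_unlock(&h.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&h.cond);
    pthread_mutex_destroy(&h.lock);
    return value;
}
```

Because the waiter tests `ready` under the mutex before sleeping, it does not matter whether the producer runs before or after the waiter reaches pthread_cond_wait(). If the producer set `ready` without taking the lock, or never signalled at all, the waiter could sleep indefinitely, which would look like the backtraces quoted above.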