Date:      Thu, 18 Sep 2008 17:35:29 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-current@freebsd.org
Cc:        David Naylor <naylor.b.david@gmail.com>
Subject:   Re: FreeBSD deadlock (with fork?)
Message-ID:  <200809181735.29504.jhb@freebsd.org>
In-Reply-To: <200809180631.47071.naylor.b.david@gmail.com>
References:  <200809180631.47071.naylor.b.david@gmail.com>

On Thursday 18 September 2008 12:31:42 am David Naylor wrote:
> Hi,
> 
> I have a program that spawns a lot of subprocesses (with pipes open) from 
> multiple threads.  The problem is the program often deadlocks, but not 
> consistently.  Sometimes the program can run over 5 times to completion 
> without incident and yet other times it locks up within a few seconds.  
> 
> However if I limit the thread count to 1 the problem does not appear to be 
> present.  
> 
> Here are the logs from gdb:
> (gdb) info thread
>   5 Thread 7021c0 (LWP 100203)  0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>   4 Thread a28480 (LWP 100174)  0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>   3 Thread a61d80 (LWP 100175)  0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>   2 Thread a61bc0 (LWP 100176)  0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> * 1 Thread a61840 (LWP 100177)  0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 
> 
> (gdb) bt
> #0  0x00000008009a2e8c in _umtx_op_err () 
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> #1  0x00000008009a1331 in cond_wait_common (cond=Variable "cond" is not 
> available.

This is not waiting on a lock; it is a pthread_cond_wait() of some sort.

> (gdb) thr 2
> [Switching to thread 2 (Thread a61bc0 (LWP 100176))]#0  0x00000008009a2e8c in
> _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 37      RSYSCALL_ERR(_umtx_op)
> (gdb) bt
> #0  0x00000008009a2e8c in _umtx_op_err () 
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> #1  0x00000008009a1331 in cond_wait_common (cond=Variable "cond" is not 
> available.

Similarly here.  I don't think you have a deadlock.  I think you have a bug 
where you are missing a pthread_cond_signal() or pthread_cond_broadcast() or 
some such.  Or maybe you aren't holding the mutex when doing the signal or 
broadcast.
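
For illustration only, here is a minimal sketch of the usual pattern that 
avoids a lost wakeup.  It is not taken from the program under discussion: the 
struct and function names (handoff, producer, wait_for_value, run_handoff) 
are hypothetical.  The waiter tests a predicate in a loop under the mutex, 
and the signaller changes that predicate and signals while holding the same 
mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state: a flag (the predicate) and the data it guards. */
struct handoff {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;
    int             value;
};

/* Signaller: update the predicate and signal while holding the mutex, so
 * the waiter cannot check the flag and then go to sleep in between. */
static void *producer(void *arg)
{
    struct handoff *h = arg;

    pthread_mutex_lock(&h->lock);
    h->value = 42;
    h->ready = true;
    pthread_cond_signal(&h->cond);
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* Waiter: loop on the predicate.  If the signal already happened, the flag
 * is set and we never sleep; spurious wakeups simply re-test the flag. */
static int wait_for_value(struct handoff *h)
{
    int v;

    pthread_mutex_lock(&h->lock);
    while (!h->ready)
        pthread_cond_wait(&h->cond, &h->lock);
    assert(h->ready);
    v = h->value;
    pthread_mutex_unlock(&h->lock);
    return v;
}

/* Run one signaller against one waiter; returns the value handed over,
 * or -1 if setup fails. */
int run_handoff(void)
{
    struct handoff h;
    pthread_t tid;
    int v;

    if (pthread_mutex_init(&h.lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&h.cond, NULL) != 0) {
        pthread_mutex_destroy(&h.lock);
        return -1;
    }
    h.ready = false;
    h.value = 0;

    if (pthread_create(&tid, NULL, producer, &h) != 0) {
        pthread_cond_destroy(&h.cond);
        pthread_mutex_destroy(&h.lock);
        return -1;
    }
    v = wait_for_value(&h);
    pthread_join(tid, NULL);

    pthread_cond_destroy(&h.cond);
    pthread_mutex_destroy(&h.lock);
    return v;
}
```

Calling run_handoff() from main() should return promptly whichever thread 
runs first.  If the waiter slept without checking a flag, or the flag were 
set without the mutex held, a signal sent before the wait began could be 
lost, leaving the thread parked in cond_wait_common() much like the 
backtraces above.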

-- 
John Baldwin


