Date:      Thu, 18 Sep 2008 09:23:22 -0700
From:      Julian Elischer <julian@elischer.org>
To:        David Naylor <naylor.b.david@gmail.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: FreeBSD deadlock (with fork?)
Message-ID:  <48D2807A.1050106@elischer.org>
In-Reply-To: <200809180631.47071.naylor.b.david@gmail.com>
References:  <200809180631.47071.naylor.b.david@gmail.com>

David Naylor wrote:
> Hi,
> 
> I have a program that spawns a lot of subprocesses (with pipes open) from 
> multiple threads.  The problem is the program often deadlocks, but not 
> consistently.  Sometimes the program can run over 5 times to completion 
> without incident and yet other times it locks within a few seconds.  

You sent this to -current. Does it happen on -current? (We fixed something 
like this some months back in -current and 7.)

Do your post-fork processes do an exec?  According to the spec, a child 
forked from a multithreaded process should.
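
As a rough illustration of "fork, then exec right away", here is a minimal 
Python sketch. The helper name run_and_capture is hypothetical and not 
taken from David's program; the point is only that the child does nothing 
between fork and exec except descriptor plumbing.

```python
import os
import sys


def run_and_capture(argv):
    """Fork, exec argv in the child immediately, and return (stdout bytes, wait status)."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only descriptor plumbing, then exec. No locks, no Python-level work.
        try:
            os.close(r)
            if w != 1:
                os.dup2(w, 1)  # make the pipe the child's stdout
                os.close(w)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if exec failed

    # Parent: close our copy of the write end so read() sees EOF when the child exits.
    os.close(w)
    chunks = []
    while True:
        data = os.read(r, 4096)
        if not data:
            break
        chunks.append(data)
    os.close(r)
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), status


if __name__ == "__main__":
    out, status = run_and_capture([sys.executable, "-c", "print('hello')"])
    print(out, status)
```

Once the exec succeeds, the child gets a fresh address space, so any lock 
state copied from the parent's other threads no longer matters.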

If any of your threads other than the one that did the fork holds a 
mutex at the time of the fork, then your process will hang. If they hold 
a umtx (shared between processes), then everything will hang.
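
To make the hazard above concrete, here is a small Python sketch (my own 
construction, assuming a POSIX system with os.fork; the function name 
child_sees_lock_held is hypothetical). One thread holds a lock while 
another thread forks. The child gets a copy of the locked lock but not 
the thread that owns it. The sketch uses a non-blocking acquire so it 
reports the problem instead of hanging.

```python
import os
import threading


def child_sees_lock_held():
    """Fork while another thread holds a lock; report whether the child finds it locked."""
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            holding.set()    # tell the forking thread the lock is now held
            release.wait()   # keep holding it across the fork

    t = threading.Thread(target=holder)
    t.start()
    holding.wait()

    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here. The holder thread was
        # not copied, so nothing in this process can ever release the lock.
        os.close(r)
        got_it = lock.acquire(blocking=False)
        os.write(w, b"0" if got_it else b"1")
        os._exit(0)

    # Parent: collect the child's answer, then let the holder thread finish.
    os.close(w)
    result = os.read(r, 1)
    os.close(r)
    os.waitpid(pid, 0)
    release.set()
    t.join()
    return result == b"1"


if __name__ == "__main__":
    print("child found the lock held:", child_sees_lock_held())
```

A blocking acquire in the child at that point would wait forever, which 
is the kind of hang described above.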



> 
> However if I limit the thread count to 1 the problem does not appear to be 
> present.  
> 
> Here are the logs from gdb:
> (gdb) info thread
>   5 Thread 7021c0 (LWP 100203)  0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>   4 Thread a28480 (LWP 100174)  0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>   3 Thread a61d80 (LWP 100175)  0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>   2 Thread a61bc0 (LWP 100176)  0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> * 1 Thread a61840 (LWP 100177)  0x00000008009a2e8c in _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 
> 
> (gdb) bt
> #0  0x00000008009a2e8c in _umtx_op_err () 
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> #1  0x00000008009a1331 in cond_wait_common (cond=Variable "cond" is not 
> available.
> ) at /usr/src/lib/libthr/thread/thr_cond.c:204
> #2  0x00000000004c0573 in PyThread_acquire_lock (lock=0x70a760, waitflag=1) at 
> thread_pthread.h:452
> #3  0x00000000004c38b0 in lock_PyThread_acquire_lock (self=0x83f258, 
> args=Variable "args" is not available.
> ) at ./Modules/threadmodule.c:46
> #4  0x00000000004939e3 in PyEval_EvalFrameEx (f=0xa53920, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3679
> #5  0x00000000004939a8 in PyEval_EvalFrameEx (f=0xb57420, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3765
> #6  0x0000000000494ac1 in PyEval_EvalCodeEx (co=0x9aab70, 
> globals=Variable "globals" is not available.
> ) at Python/ceval.c:2942
> #7  0x00000000004eb6bc in function_call (func=0x9d8938, arg=0x9d9990, 
> kw=0xa24da0)
>     at Objects/funcobject.c:524
> #8  0x0000000000417add in PyObject_Call (func=0x9d8938, arg=0x9d9990, 
> kw=0xa24da0)
>     at Objects/abstract.c:2487
> #9  0x000000000048f730 in PyEval_EvalFrameEx (f=0xb48e20, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3978
> #10 0x00000000004939a8 in PyEval_EvalFrameEx (f=0xb57720, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3765
> #11 0x00000000004939a8 in PyEval_EvalFrameEx (f=0xb49020, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3765
> #12 0x0000000000494ac1 in PyEval_EvalCodeEx (co=0x947cd8, 
> globals=Variable "globals" is not available.
> ) at Python/ceval.c:2942
> #13 0x00000000004eb5bd in function_call (func=0x9a5320, arg=0x9d9950, kw=0x0) 
> at Objects/funcobject.c:524
> #14 0x0000000000417add in PyObject_Call (func=0x9a5320, arg=0x9d9950, kw=0x0) 
> at Objects/abstract.c:2487
> #15 0x000000000041f12f in instancemethod_call (func=0x9a5320, arg=0x9d9950, 
> kw=0x0)
>     at Objects/classobject.c:2558
> #16 0x0000000000417add in PyObject_Call (func=0x93b410, arg=0x718050, kw=0x0) 
> at Objects/abstract.c:2487
> #17 0x000000000048d4c6 in PyEval_CallObjectWithKeywords (func=0x93b410, 
> arg=0x718050, kw=0x0)
>     at Python/ceval.c:3548
> #18 0x00000000004c3cbd in t_bootstrap (boot_raw=0x70ae60) 
> at ./Modules/threadmodule.c:425
> #19 0x000000080099ac11 in thread_start (curthread=0xa61840) 
> at /usr/src/lib/libthr/thread/thr_create.c:288
> #20 0x0000000000000000 in ?? ()
> Error accessing memory address 0x7fffff5fc000: Bad address.
> 
> 
> (gdb) thr 2
> [Switching to thread 2 (Thread a61bc0 (LWP 100176))]#0  0x00000008009a2e8c in 
> _umtx_op_err ()
>     at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 37      RSYSCALL_ERR(_umtx_op)
> (gdb) bt
> #0  0x00000008009a2e8c in _umtx_op_err () 
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> #1  0x00000008009a1331 in cond_wait_common (cond=Variable "cond" is not 
> available.
> ) at /usr/src/lib/libthr/thread/thr_cond.c:204
> #2  0x00000000004c0573 in PyThread_acquire_lock (lock=0x70a600, waitflag=1) at 
> thread_pthread.h:452
> #3  0x00000000004c38b0 in lock_PyThread_acquire_lock (self=0x83f168, 
> args=Variable "args" is not available.
> ) at ./Modules/threadmodule.c:46
> #4  0x00000000004939e3 in PyEval_EvalFrameEx (f=0xb56520, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3679
> #5  0x0000000000494ac1 in PyEval_EvalCodeEx (co=0x8d5dc8, 
> globals=Variable "globals" is not available.
> ) at Python/ceval.c:2942
> #6  0x0000000000492c88 in PyEval_EvalFrameEx (f=0xb2de20, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3775
> #7  0x0000000000494ac1 in PyEval_EvalCodeEx (co=0x8d5288, 
> globals=Variable "globals" is not available.
> ) at Python/ceval.c:2942
> #8  0x00000000004eb5bd in function_call (func=0x9a8b18, arg=0xc524d0, kw=0x0) 
> at Objects/funcobject.c:524
> #9  0x0000000000417add in PyObject_Call (func=0x9a8b18, arg=0xc524d0, kw=0x0) 
> at Objects/abstract.c:2487
> #10 0x000000000041f12f in instancemethod_call (func=0x9a8b18, arg=0xc524d0, 
> kw=0x0)
>     at Objects/classobject.c:2558
> #11 0x0000000000417add in PyObject_Call (func=0x93b370, arg=0x10331d0, kw=0x0) 
> at Objects/abstract.c:2487
> #12 0x0000000000464158 in slot_tp_init (self=Variable "self" is not available.
> ) at Objects/typeobject.c:5532
> #13 0x0000000000461561 in type_call (type=0x7d9420, args=0x10331d0, kwds=0x0) 
> at Objects/typeobject.c:700
> #14 0x0000000000417add in PyObject_Call (func=0x7d9420, arg=0x10331d0, kw=0x0) 
> at Objects/abstract.c:2487
> #15 0x00000000004906f9 in PyEval_EvalFrameEx (f=0xb3a120, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3890
> #16 0x0000000000494ac1 in PyEval_EvalCodeEx (co=0x8d5eb8, 
> globals=Variable "globals" is not available.
> ) at Python/ceval.c:2942
> #17 0x0000000000492c88 in PyEval_EvalFrameEx (f=0xa47a20, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3775
> #18 0x0000000000494ac1 in PyEval_EvalCodeEx (co=0x8d5cd8, 
> globals=Variable "globals" is not available.
> ) at Python/ceval.c:2942
> #19 0x0000000000492c88 in PyEval_EvalFrameEx (f=0xb3a420, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3775
> #20 0x0000000000494ac1 in PyEval_EvalCodeEx (co=0x9aabe8, 
> globals=Variable "globals" is not available.
> ) at Python/ceval.c:2942
> #21 0x0000000000492c88 in PyEval_EvalFrameEx (f=0xb3a720, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3775
> #22 0x0000000000494ac1 in PyEval_EvalCodeEx (co=0x9aab70, 
> globals=Variable "globals" is not available.
> ) at Python/ceval.c:2942
> #23 0x00000000004eb6bc in function_call (func=0x9d8938, arg=0x9d98d0, 
> kw=0xa5a660)
>     at Objects/funcobject.c:524
> #24 0x0000000000417add in PyObject_Call (func=0x9d8938, arg=0x9d98d0, 
> kw=0xa5a660)
>     at Objects/abstract.c:2487
> #25 0x000000000048f730 in PyEval_EvalFrameEx (f=0xb2e020, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3978
> #26 0x00000000004939a8 in PyEval_EvalFrameEx (f=0xb3aa20, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3765
> #27 0x00000000004939a8 in PyEval_EvalFrameEx (f=0xb2e220, 
> throwflag=Variable "throwflag" is not available.
> ) at Python/ceval.c:3765
> #28 0x0000000000494ac1 in PyEval_EvalCodeEx (co=0x947cd8, 
> globals=Variable "globals" is not available.
> ) at Python/ceval.c:2942
> #29 0x00000000004eb5bd in function_call (func=0x9a5320, arg=0x9d9890, kw=0x0) 
> at Objects/funcobject.c:524
> #30 0x0000000000417add in PyObject_Call (func=0x9a5320, arg=0x9d9890, kw=0x0) 
> at Objects/abstract.c:2487
> #31 0x000000000041f12f in instancemethod_call (func=0x9a5320, arg=0x9d9890, 
> kw=0x0)
>     at Objects/classobject.c:2558
> #32 0x0000000000417add in PyObject_Call (func=0x93b320, arg=0x718050, kw=0x0) 
> at Objects/abstract.c:2487
> #33 0x000000000048d4c6 in PyEval_CallObjectWithKeywords (func=0x93b320, 
> arg=0x718050, kw=0x0)
>     at Python/ceval.c:3548
> #34 0x00000000004c3cbd in t_bootstrap (boot_raw=0x70aee0) 
> at ./Modules/threadmodule.c:425
> #35 0x000000080099ac11 in thread_start (curthread=0xa61bc0) 
> at /usr/src/lib/libthr/thread/thr_create.c:288
> #36 0x0000000000000000 in ?? ()
> Error accessing memory address 0x7fffff7fd000: Bad address.
> 
> Apologies about the long backtraces [and thus the long message].  
> 
> Thanks
> 
> David



