Date: Thu, 29 Aug 2013 14:01:06 +0300 From: Andriy Gapon <avg@FreeBSD.org> To: freebsd-dtrace@FreeBSD.org, Mark Johnston <markj@FreeBSD.org> Subject: sdt deadlock Message-ID: <521F29F2.2020208@FreeBSD.org>
next in thread | raw e-mail | index | archive | help
We've just experienced a deadlock between dtrace_lock and sdt_sx. Mark, could you please take a look at the following debugging data if you have some time? Does it look like something that could be fixed by your recent big change to the related code? Debug session follows. (kgdb) tid 101190 (kgdb) bt #0 sched_switch (td=0xfffffe007a109490, newtd=0xfffffe0018c7c920, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1920 #1 0xffffffff80942634 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:485 #2 0xffffffff8097da72 in sleepq_wait (wchan=0xffffffff81e7d300, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:623 #3 0xffffffff80941ad6 in _sx_xlock_hard (sx=0xffffffff81e7d300, tid=18446741876734203024, opts=<value optimized out>, file=<value optimized out>, line=<value optimized out>) at /usr/src/sys/kern/kern_sx.c:669 #4 0xffffffff80942295 in _sx_xlock (sx=<value optimized out>, opts=<value optimized out>, file=<value optimized out>, line=<value optimized out>) at sx.h:161 #5 0xffffffff81c57d1b in dtrace_register (name=0xffffffff80f50e60 "sctp", pap=0xffffffff81ea73f8, priv=2, cr=0x0, pops=0xffffffff81ea7420, arg=0x0, idp=0xffffffff812fbde8) at /usr/src/sys/modules/dtrace/dtrace/../../../cddl/contrib/opensolaris/uts/common/dtrace/dtrace.c:7471 #6 0xffffffff81ea7220 in sdt_provider_reg_callback (prov=<value optimized out>, arg=<value optimized out>) at /usr/src/sys/modules/dtrace/sdt/../../../cddl/dev/sdt/sdt.c:182 #7 0xffffffff80936ed5 in sdt_provider_listall_locked (callback_func=0xffffffff81ea71f0 <sdt_provider_reg_callback>, arg=0x0) at /usr/src/sys/kern/kern_sdt.c:246 #8 0xffffffff80937424 in sdt_register_callbacks (register_prov=0xffffffff81ea71f0 <sdt_provider_reg_callback>, reg_prov_arg=0x0, deregister_prov=0xffffffff81ea71e0 <sdt_provider_unreg_callback>, dereg_prov_arg=0x0, register_probe=0xffffffff81ea7100 <sdt_probe_callback>, reg_probe_arg=0x0) at /usr/src/sys/kern/kern_sdt.c:320 #9 0xffffffff8091aefe in linker_load_module (kldname=<value optimized out>, modname=0xffffffff81c3d0c4 "sdt", parent=0xfffffe02b6b5e800, verinfo=0xffffffff81c3d38c, lfpp=0x0) at /usr/src/sys/kern/kern_linker.c:239 #10 0xffffffff8091b44c in linker_load_dependencies (lf=0xfffffe02b6b5e800) at /usr/src/sys/kern/kern_linker.c:2131 #11 0xffffffff80d42690 in link_elf_load_file (cls=<value optimized out>, filename=<value optimized out>, result=0xffffff91e59998f8) at /usr/src/sys/kern/link_elf_obj.c:845 #12 0xffffffff8091adef in linker_load_module (kldname=<value optimized out>, modname=0xfffffe010cd64c00 "dtraceall", parent=0x0, verinfo=0x0, lfpp=0xffffff91e5999958) at linker_if.h:161 #13 0xffffffff8091b538 in kern_kldload (td=0xfffffe007a109490, file=<value optimized out>, fileid=0xffffff91e59999b4) at /usr/src/sys/kern/kern_linker.c:1039 #14 0xffffffff8091b754 in sys_kldload (td=0xfffffe007a109490, uap=<value optimized out>) at /usr/src/sys/kern/kern_linker.c:1073 #15 0xffffffff80cd523a in amd64_syscall (td=0xfffffe007a109490, traced=0) at subr_syscall.c:135 #16 0xffffffff80cbf7f7 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:387 #17 0x000000080087e6fc in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) fr 3 #3 0xffffffff80941ad6 in _sx_xlock_hard (sx=0xffffffff81e7d300, tid=18446741876734203024, opts=<value optimized out>, file=<value optimized out>, line=<value optimized out>) at /usr/src/sys/kern/kern_sx.c:669 669 /usr/src/sys/kern/kern_sx.c: No such file or directory. in /usr/src/sys/kern/kern_sx.c (kgdb) p *sx $1 = {lock_object = {lo_name = 0xffffffff81c6e9dd "dtrace_provider_lock", lo_flags = 41091072, lo_data = 0, lo_witness = 0x0}, sx_lock = 18446741876595964196} (kgdb) p/x *sx $2 = {lock_object = {lo_name = 0xffffffff81c6e9dd, lo_flags = 0x2730000, lo_data = 0x0, lo_witness = 0x0}, sx_lock = 0xfffffe0071d33924} (kgdb) p ((struct thread*)0xfffffe0071d33920)->td_tid $3 = 100433 (kgdb) tid 100433 (kgdb) bt #0 sched_switch (td=0xfffffe0071d33920, newtd=0xfffffe0018c88920, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1920 #1 0xffffffff80942634 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:485 #2 0xffffffff8097da72 in sleepq_wait (wchan=0xffffffff813ac880, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:623 #3 0xffffffff80941ad6 in _sx_xlock_hard (sx=0xffffffff813ac880, tid=18446741876595964192, opts=<value optimized out>, file=<value optimized out>, line=<value optimized out>) at /usr/src/sys/kern/kern_sx.c:669 #4 0xffffffff80937793 in sdt_provider_listall (callback_func=0xffffffff81ea7280 <sdt_provider_entry>, arg=0x0) at sx.h:161 #5 0xffffffff81c523da in dtrace_probe_provide (desc=0x0, prv=0xfffffe011d7bb000) at /usr/src/sys/modules/dtrace/dtrace/../../../cddl/contrib/opensolaris/uts/common/dtrace/dtrace.c:7984 #6 0xffffffff81c52487 in dtrace_open (dev=0xfffffe02b6b5bc00, oflags=<value optimized out>, devtype=<value optimized out>, td=<value optimized out>) at /usr/src/sys/modules/dtrace/dtrace/../../../cddl/contrib/opensolaris/uts/common/dtrace/dtrace.c:15418 #7 0xffffffff80820658 in devfs_open (ap=0xffffff91e4ad07d0) at /usr/src/sys/fs/devfs/devfs_vnops.c:1084 #8 0xffffffff80d6b434 in VOP_OPEN_APV (vop=0xffffffff812be940, a=0xffffff91e4ad07d0) at vnode_if.c:447 #9 0xffffffff809e826b in vn_open_cred (ndp=0xffffff91e4ad0880, flagp=0xffffff91e4ad087c, cmode=0, vn_open_flags=<value optimized out>, cred=0xfffffe0018c6da00, fp=0xfffffe02b6ed36e0) at vnode_if.h:196 #10 0xffffffff809e3faa in kern_openat (td=0xfffffe0071d33920, fd=-100, path=0x800aa89b2 <Error reading address 0x800aa89b2: Bad address>, pathseg=UIO_USERSPACE, flags=3, mode=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:1132 #11 0xffffffff80cd523a in amd64_syscall (td=0xfffffe0071d33920, traced=0) at subr_syscall.c:135 #12 0xffffffff80cbf7f7 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:387 #13 0x0000000801a2f50c in ?? () (kgdb) fr 3 #3 0xffffffff80941ad6 in _sx_xlock_hard (sx=0xffffffff813ac880, tid=18446741876595964192, opts=<value optimized out>, file=<value optimized out>, line=<value optimized out>) at /usr/src/sys/kern/kern_sx.c:669 669 /usr/src/sys/kern/kern_sx.c: No such file or directory. in /usr/src/sys/kern/kern_sx.c (kgdb) p/x *sx $4 = {lock_object = {lo_name = 0xffffffff80f340fa, lo_flags = 0x2310000, lo_data = 0x0, lo_witness = 0x0}, sx_lock = 0xfffffe007a109494} (kgdb) p ((struct thread*)0xfffffe007a109490)->td_tid $5 = 101190 The related snippets from the source code: int dtrace_register(const char *name, const dtrace_pattr_t *pap, uint32_t priv, cred_t *cr, const dtrace_pops_t *pops, void *arg, dtrace_provider_id_t *idp) { ... mutex_enter(&dtrace_provider_lock); <------------ mutex_enter(&dtrace_lock); ... int sdt_provider_listall(sdt_provider_listall_func_t callback_func, void *arg) { int error; sx_xlock(&sdt_sx); <----------- ... So it seems that thread 101190 has sdt_sx and wants to take dtrace_provider_lock in dtrace_register. At the same time thread 100433 has dtrace_provider_lock (taken in dtrace_open right before dtrace_probe_provide) and wants to take sdt_sx in sdt_provider_listall. -- Andriy Gapon
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?521F29F2.2020208>