From owner-freebsd-fs@FreeBSD.ORG Fri Sep 19 20:36:07 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D1F1A795 for ; Fri, 19 Sep 2014 20:36:07 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B0744840 for ; Fri, 19 Sep 2014 20:36:07 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.9/8.14.9) with ESMTP id s8JKa7nn053942 for ; Fri, 19 Sep 2014 20:36:07 GMT (envelope-from bdrewery@freefall.freebsd.org) Received: (from bdrewery@localhost) by freefall.freebsd.org (8.14.9/8.14.9/Submit) id s8JKa73T053941 for freebsd-fs@FreeBSD.org; Fri, 19 Sep 2014 20:36:07 GMT (envelope-from bdrewery) Received: (qmail 40702 invoked from network); 19 Sep 2014 15:36:05 -0500 Received: from unknown (HELO ?10.10.0.24?) (freebsd@shatow.net@10.10.0.24) by sweb.xzibition.com with ESMTPA; 19 Sep 2014 15:36:05 -0500 Message-ID: <541C93AE.60906@FreeBSD.org> Date: Fri, 19 Sep 2014 15:35:58 -0500 From: Bryan Drewery Organization: FreeBSD User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.1.1 MIME-Version: 1.0 To: FreeBSD FS Subject: Re: ZFS deadlock in zfs_lookup/zfs_dirent_lock on head r271170 References: <541C5E01.9050609@FreeBSD.org> In-Reply-To: <541C5E01.9050609@FreeBSD.org> OpenPGP: id=6E4697CF; url=http://www.shatow.net/bryan/bryan2.asc Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="nvEVGjCo5XphTDeI2mjQ0QPTSD130KL7C" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Sep 2014 20:36:08 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --nvEVGjCo5XphTDeI2mjQ0QPTSD130KL7C Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable On 9/19/2014 11:46 AM, Bryan Drewery wrote: > rename is: https://people.freebsd.org/~bdrewery/rename.c >=20 > All files are in the same mountpoint. >=20 >> # ps uaxwww|egrep "(rm|rename)" >> root 96901 0.0 0.0 12300 1948 2 D+ 11:01AM 0:00.00 rm = -rf /poudriere/data/.m/exp-head-commit-test-devel/ref/.p/cleaning/deps/py= 27-PasteScript-1.7.5_2 >> root 96903 0.0 0.0 8196 1912 2 D+ 11:01AM 0:00.00 ren= ame /poudriere/data/.m/exp-head-commit-test-devel/ref/.p/deps/py27-PasteD= eploy-1.5.0_1 /poudriere/data/.m/exp-head-commit-test-devel/ref/.p/cleani= ng/deps/py27-PasteDeploy-1.5.0_1 >> root 3720 0.0 0.0 8196 1912 3 D+ 11:05AM 0:00.00 ./r= ename /poudriere/data/.m/exp-head-commit-test-devel/ref/.p/deps/py27-Past= eDeploy-1.5.0_1 /poudriere/data/.m/exp-head-commit-test-devel/ref/.p/clea= ning/deps/py27-PasteDeploy-1.5.0_1 >> >> # procstat -kka|egrep "(96901|96903|3720)" >> 3720 100969 rename - mi_switch+0x179 sleepq_= switch+0x152 sleepq_wait+0x43 _cv_wait+0x1da zfs_dirent_lock+0x212 zfs_di= rlook+0x1b0 zfs_lookup+0x223 zfs_freebsd_lookup+0x91 VOP_CACHEDLOOKUP_APV= +0xf1 vfs_cache_lookup+0xd6 VOP_LOOKUP_APV+0xf1 lookup+0x5ad namei+0x4e4 = kern_renameat+0x95 filemon_wrapper_rename+0x19 amd64_syscall+0x25a Xfast_= syscall+0xfb >> 96901 101213 rm - mi_switch+0x179 sleepq_= switch+0x152 sleepq_wait+0x43 sleeplk+0x14a __lockmgr_args+0x862 vop_stdl= ock+0x3c VOP_LOCK1_APV+0xfc _vn_lock+0xaa zfs_lookup+0x462 zfs_freebsd_lo= okup+0x91 VOP_CACHEDLOOKUP_APV+0xf1 vfs_cache_lookup+0xd6 VOP_LOOKUP_APV+= 0xf1 lookup+0x5ad namei+0x4e4 kern_statat_vnhook+0xae sys_fstatat+0x2c am= d64_syscall+0x25a >> 96903 101290 rename - mi_switch+0x179 sleepq_= switch+0x152 sleepq_wait+0x43 sleeplk+0x14a __lockmgr_args+0xea1 vop_stdl= ock+0x3c VOP_LOCK1_APV+0xfc _vn_lock+0xaa vputx+0x232 zfs_rename_unlock+0= x78 zfs_freebsd_rename+0xfbb VOP_RENAME_APV+0xfc kern_renameat+0x3ef file= mon_wrapper_rename+0x19 amd64_syscall+0x25a Xfast_syscall+0xfb >> >> # procstat -kka|grep zfs >> 0 100698 kernel zfs_vn_rele_task mi_switch+0x179 sleepq_= switch+0x152 sleepq_wait+0x43 _sleep+0x366 taskqueue_thread_loop+0xc8 for= k_exit+0x84 fork_trampoline+0xe >> 3 100159 zfskern arc_reclaim_thre mi_switch+0x179 sleepq_= switch+0x152 sleepq_timedwait+0x43 _cv_timedwait_sbt+0x200 arc_reclaim_th= read+0x2de fork_exit+0x84 fork_trampoline+0xe >> 3 100160 zfskern l2arc_feed_threa mi_switch+0x179 sleepq_= switch+0x152 sleepq_timedwait+0x43 _cv_timedwait_sbt+0x200 l2arc_feed_thr= ead+0x1d8 fork_exit+0x84 fork_trampoline+0xe >> 3 100681 zfskern trim zroot mi_switch+0x179 sleepq_= switch+0x152 sleepq_timedwait+0x43 _cv_timedwait_sbt+0x200 trim_thread+0x= 9e fork_exit+0x84 fork_trampoline+0xe >> 3 100715 zfskern txg_thread_enter mi_switch+0x179 sleepq_= switch+0x152 sleepq_wait+0x43 _cv_wait+0x1da txg_thread_wait+0x9b txg_qui= esce_thread+0x420 fork_exit+0x84 fork_trampoline+0xe >> 3 100716 zfskern txg_thread_enter mi_switch+0x179 sleepq_= switch+0x152 sleepq_wait+0x43 _cv_wait+0x1da zio_wait+0x9b dsl_pool_sync+= 0x35c spa_sync+0x530 txg_sync_thread+0x24d fork_exit+0x84 fork_trampoline= +0xe >> 3720 100969 rename - mi_switch+0x179 sleepq_= switch+0x152 sleepq_wait+0x43 _cv_wait+0x1da zfs_dirent_lock+0x212 zfs_di= rlook+0x1b0 zfs_lookup+0x223 zfs_freebsd_lookup+0x91 VOP_CACHEDLOOKUP_APV= +0xf1 vfs_cache_lookup+0xd6 VOP_LOOKUP_APV+0xf1 lookup+0x5ad namei+0x4e4 = kern_renameat+0x95 filemon_wrapper_rename+0x19 amd64_syscall+0x25a Xfast_= syscall+0xfb >> 96901 101213 rm - mi_switch+0x179 sleepq_= switch+0x152 sleepq_wait+0x43 sleeplk+0x14a __lockmgr_args+0x862 vop_stdl= ock+0x3c VOP_LOCK1_APV+0xfc _vn_lock+0xaa zfs_lookup+0x462 zfs_freebsd_lo= okup+0x91 VOP_CACHEDLOOKUP_APV+0xf1 vfs_cache_lookup+0xd6 VOP_LOOKUP_APV+= 0xf1 lookup+0x5ad namei+0x4e4 kern_statat_vnhook+0xae sys_fstatat+0x2c am= d64_syscall+0x25a >> 96903 101290 rename - mi_switch+0x179 sleepq_= switch+0x152 sleepq_wait+0x43 sleeplk+0x14a __lockmgr_args+0xea1 vop_stdl= ock+0x3c VOP_LOCK1_APV+0xfc _vn_lock+0xaa vputx+0x232 zfs_rename_unlock+0= x78 zfs_freebsd_rename+0xfbb VOP_RENAME_APV+0xfc kern_renameat+0x3ef file= mon_wrapper_rename+0x19 amd64_syscall+0x25a Xfast_syscall+0xfb >=20 >=20 >=20 PID 96903 / TID 101290: zfs_rename takes dirent_lock. It then calls zfs_rename_lock, which then is stuck in vrele() PID 3720 / TID 100969: zfs_dirlook waits on zfs_dirent_lock. PID 96901 I think can be ignored, it was fallout and ran 4 minutes after the deadlock. Some info from gdb the vnode stuck in vrele() follows. I'm not sure where to look from here due to it being 'SHARED' with no pid/tid owning i= t. > (kgdb) backtrace > #0 sched_switch (td=3D0xfffff80834e36490, newtd=3D, flags=3D) at /usr/src/sys/kern/sched_ule.c:1932 > #1 0xffffffff80929379 in mi_switch (flags=3D260, newtd=3D0x0) at /usr/= src/sys/kern/kern_synch.c:493 > #2 0xffffffff80966af2 in sleepq_switch (wchan=3D,= pri=3D) at /usr/src/sys/kern/subr_sleepqueue.c:552 > #3 0xffffffff80966953 in sleepq_wait (wchan=3D0xfffff8090c3259a0, pri=3D= 96) at /usr/src/sys/kern/subr_sleepqueue.c:631 > #4 0xffffffff808fdf3a in sleeplk (lk=3D, flags=3D= , ilk=3D, wmesg=3D, pri=3D, > timo=3D) at /usr/src/sys/kern/kern_lock.c:225 > #5 0xffffffff808fd601 in __lockmgr_args (lk=3D0xfffff8090c3259a0, flag= s=3D, ilk=3D0xfffff8090c3259d0, wmesg=3D, pri=3D, > timo=3D, file=3D0xffffffff80fcd0e8 "/usr/src/s= ys/kern/vfs_subr.c", line=3D2248) at /usr/src/sys/kern/kern_lock.c:939 > #6 0xffffffff809be27c in vop_stdlock (ap=3D) at l= ockmgr.h:98 > #7 0xffffffff80e5b75c in VOP_LOCK1_APV (vop=3D, a= =3D) at vnode_if.c:2082 > #8 0xffffffff809ddbca in _vn_lock (vp=3D0xfffff8090c325938, flags=3D, file=3D0xffffffff80fcd0e8 "/usr/src/sys/kern/vfs_sub= r.c", line=3D2248) at vnode_if.h:859 > #9 0xffffffff809ce662 in vputx (vp=3D0xfffff8090c325938, func=3D1) at = /usr/src/sys/kern/vfs_subr.c:2248 > #10 0xffffffff81e99b28 in zfs_rename_unlock (zlpp=3D0xfffffe12484b8758)= at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:3= 630 > #11 0xffffffff81e969bb in zfs_freebsd_rename (ap=3D) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c= :4067 > #12 0xffffffff80e5ac1c in VOP_RENAME_APV (vop=3D, = a=3D) at vnode_if.c:1544 > #13 0xffffffff809dadcf in kern_renameat (td=3D, ol= dfd=3D, old=3D, newfd=3D, new=3D, > pathseg=3D) at vnode_if.h:636 > #14 0xffffffff820c5ab9 in filemon_wrapper_rename (td=3D0x0, uap=3D0xfff= ffe12484b8b80) at filemon_wrapper.c:326 > #15 0xffffffff80d38b6a in amd64_syscall (td=3D0xfffff80834e36490, trace= d=3D0) at subr_syscall.c:133 > #16 0xffffffff80d1a44b in Xfast_syscall () at /usr/src/sys/amd64/amd64/= exception.S:390 > #17 0x000000080088df9a in ?? () > Previous frame inner to this frame (corrupt stack?) > (kgdb) vprint 0xfffff8090c325938 > 0xfffff8090c325938: tag zfs, type VDIR > usecount 1, writecount 0, refcount 2 mountedhere 0x0 > flags () > lock type zfs: SHARED (count 2) with exclusive waiters pending > mount /poudriere/data/.m/exp-head-commit-test-devel/ref from zroot/= poudriere/jails/exp-head-commit-test-devel-ref > (kgdb) p *(struct vnode*)0xfffff8090c325938 > $22 =3D {v_tag =3D 0xffffffff81f0f8ff "zfs", v_op =3D 0xffffffff81f24b5= 0, v_data =3D 0xfffff809b8612a10, v_mount =3D 0xfffff8014c262660, v_nmntv= nodes =3D {tqe_next =3D 0xfffff8095e9aa760, > tqe_prev =3D 0xfffff80aa43d7780}, v_un =3D {vu_mount =3D 0x0, vu_so= cket =3D 0x0, vu_cdev =3D 0x0, vu_fifoinfo =3D 0x0}, v_hashlist =3D {le_n= ext =3D 0x0, le_prev =3D 0x0}, v_cache_src =3D { > lh_first =3D 0x0}, v_cache_dst =3D {tqh_first =3D 0xfffff80ac9c3531= 0, tqh_last =3D 0xfffff80ac9c35330}, v_cache_dd =3D 0xfffff80ac9c35310, v= _lock =3D {lock_object =3D { > lo_name =3D 0xffffffff81f0f8ff "zfs", lo_flags =3D 117112832, lo_= data =3D 0, lo_witness =3D 0xfffffe0000b28780}, lk_lock =3D 21, lk_exslpf= ail =3D 0, lk_timo =3D 51, lk_pri =3D 96}, v_interlock =3D { > lock_object =3D {lo_name =3D 0xffffffff80fc4bc6 "vnode interlock", = lo_flags =3D 16973824, lo_data =3D 0, lo_witness =3D 0xfffffe0000b1e500},= mtx_lock =3D 4}, v_vnlock =3D 0xfffff8090c3259a0, > v_actfreelist =3D {tqe_next =3D 0xfffff80a5421f1d8, tqe_prev =3D 0xff= fff8095e9aa820}, v_bufobj =3D {bo_lock =3D {lock_object =3D {lo_name =3D = 0xffffffff80fcd144 "bufobj interlock", > lo_flags =3D 86179840, lo_data =3D 0, lo_witness =3D 0xfffffe00= 00b26580}, rw_lock =3D 1}, bo_ops =3D 0xffffffff814ab900, bo_object =3D 0= x0, bo_synclist =3D {le_next =3D 0x0, le_prev =3D 0x0}, > bo_private =3D 0xfffff8090c325938, __bo_vnode =3D 0xfffff8090c32593= 8, bo_clean =3D {bv_hd =3D {tqh_first =3D 0x0, tqh_last =3D 0xfffff8090c3= 25a58}, bv_root =3D {pt_root =3D 0}, bv_cnt =3D 0}, > bo_dirty =3D {bv_hd =3D {tqh_first =3D 0x0, tqh_last =3D 0xfffff809= 0c325a78}, bv_root =3D {pt_root =3D 0}, bv_cnt =3D 0}, bo_numoutput =3D 0= , bo_flag =3D 0, bo_bsize =3D 131072}, v_pollinfo =3D 0x0, > v_label =3D 0x0, v_lockf =3D 0x0, v_rl =3D {rl_waiters =3D {tqh_first= =3D 0x0, tqh_last =3D 0xfffff8090c325ac0}, rl_currdep =3D 0x0}, v_cstart= =3D 0, v_lasta =3D 0, v_lastw =3D 0, v_clen =3D 0, > v_holdcnt =3D 2, v_usecount =3D 1, v_iflag =3D 512, v_vflag =3D 0, v_= writecount =3D 0, v_hash =3D 151794265, v_type =3D VDIR} --=20 Regards, Bryan Drewery --nvEVGjCo5XphTDeI2mjQ0QPTSD130KL7C Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (MingW32) iQEcBAEBAgAGBQJUHJOuAAoJEDXXcbtuRpfPwEMH/AuNhw92TtrDsKCjmzuuBv0s uH37y0Fk1e7LSB4dChTeyH00bO+yEK9YKD3bSNqlr7PJY/nEZwBbulwYHPpYi4PN /HE7ykx0ZHrLAetuifQCXaZWHz15T40F5xmKduFgXC0F4Q0B9T3WV/r/dgxBNlGg lIbnJh7Nw0diOBQHtbCL0imQQMvsnr3koz7dRn29FrXikWbVB/4iMbBqBWU9KXV/ osNpeAMBhX69tLdf3zHVL3o7iFAInjLKovQgGgo8HdjUB5bigitBQ4meTL0Y0JI2 cba5zKcB2JvCa2DeoE3yynmqS6G2U3QtXMFPcUPjJFpObdQ63nLeyTdXGtUiP1A= =McIQ -----END PGP SIGNATURE----- --nvEVGjCo5XphTDeI2mjQ0QPTSD130KL7C--