From: Andriy Gapon
Date: Sat, 22 Sep 2012 16:33:53 +0300
To: FreeBSD FS
Cc: Florian Smeets, Pawel Jakub Dawidek
Subject: Re: panic: _sx_xlock_hard: recursed on non-recursive sx zfsvfs->z_hold_mtx[i] @ ...cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1407
Message-ID: <505DBE41.20303@FreeBSD.org>
In-Reply-To: <505DB4E6.8030407@smeets.im>

on 22/09/2012 15:53 Florian Smeets said the following:
> Hi,
>
> I hit the above mentioned panic quite frequently on recent versions of head
> (r240806). This happens when building packages in the ports tinderbox which
> uses nullfs and zfs extensively.
> Kib had a look at it and suspects that his recent nullfs changes expose a
> bug in zfs.
>
> The backtrace is as follows:

Since getnewvnode() can call vnlru_free() the call flow can recurse back into
fs code. So it's dangerous in general to hold any fs locks around getnewvnode
call, as kib advises.
In this case it was a nullfs vnode that caused recursion into zfs, but it could
have been a zfs vnode. The only thing required for a panic is a hash collision
of zfs object id, so that the same z_hold_mtx is used.

But I imagine that it would be quite tough to drop z_hold_mtx in
zfs_znode_cache_constructor.

> #0  doadump (textdump=1)
>     at /usr/home/flo/dev/checkouts/svn-src/sys/kern/kern_shutdown.c:266
> #1  0xffffffff804c6a64 in kern_reboot (howto=260)
>     at /usr/home/flo/dev/checkouts/svn-src/sys/kern/kern_shutdown.c:449
> #2  0xffffffff804c648a in panic (fmt=0x0)
>     at /usr/home/flo/dev/checkouts/svn-src/sys/kern/kern_shutdown.c:637
> #3  0xffffffff804ce6e5 in _sx_xlock_hard (sx=Variable "sx" is not available.
> )
>     at /usr/home/flo/dev/checkouts/svn-src/sys/kern/kern_sx.c:523
> #4  0xffffffff804ce77e in _sx_xlock (sx=Variable "sx" is not available.
> )
>     at sx.h:152
> #5  0xffffffff80e17533 in zfs_zinactive (zp=0xfffffe011951ec80)
>     at /usr/home/flo/dev/checkouts/svn-src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1407
> #6  0xffffffff80e45366 in zfs_inactive (vp=0xfffffe019bdfad90,
>     cr=Variable "cr" is not available.
> )
>     at /usr/home/flo/dev/checkouts/svn-src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4590
> #7  0xffffffff80e4552a in zfs_freebsd_inactive (ap=Variable "ap" is not available.
> )
>     at /usr/home/flo/dev/checkouts/svn-src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:6102
> #8  0xffffffff8070aae7 in VOP_INACTIVE_APV (vop=0xffffffff80eb5fe0,
>     a=0xffffff89092d3d20)
>     at vnode_if.c:1863
> #9  0xffffffff8055e3b7 in vinactive (vp=0xfffffe019bdfad90,
>     td=0xfffffe0017bad900)
>     at vnode_if.h:807
> #10 0xffffffff80562526 in vputx (vp=0xfffffe019bdfad90, func=2)
>     at /usr/home/flo/dev/checkouts/svn-src/sys/kern/vfs_subr.c:2290
> #11 0xffffffff80d8a5f0 in null_reclaim (ap=Variable "ap" is not available.
> )
>     at /usr/home/flo/dev/checkouts/svn-src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:706
> #12 0xffffffff8070a9d7 in VOP_RECLAIM_APV (vop=0xffffffff80d8b180,
>     a=0xffffff89092d3e60)
>     at vnode_if.c:1926
> #13 0xffffffff8055f64d in vgonel (vp=0xfffffe019bdb73e0)
>     at vnode_if.h:830
> #14 0xffffffff80561815 in vnlru_free (count=1)
>     at /usr/home/flo/dev/checkouts/svn-src/sys/kern/vfs_subr.c:931
> #15 0xffffffff80561b1f in getnewvnode (tag=0xffffffff80eae0f3 "zfs",
>     mp=0xfffffe0010dc3cc0, vops=0xffffffff80eb5fe0, vpp=0xffffff89092d3f88)
>     at /usr/home/flo/dev/checkouts/svn-src/sys/kern/vfs_subr.c:953
> #16 0xffffffff80e168b5 in zfs_znode_cache_constructor (buf=0xfffffe019b437af0,
>     arg=Variable "arg" is not available.
> )
>     at /usr/home/flo/dev/checkouts/svn-src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:135
> #17 0xffffffff80e189cc in zfs_znode_alloc (zfsvfs=0xfffffe0010de4000,
>     db=0xfffffe048c138000, blksz=0, obj_type=DMU_OT_SA,
>     hdl=0xfffffe019b441cd0)
>     at /usr/home/flo/dev/checkouts/svn-src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:663
> #18 0xffffffff80e19b65 in zfs_mknode (dzp=0xfffffe00b84dd7d0,
>     vap=0xffffff89092d4740, tx=0xfffffe0303916600, cr=0xfffffe000c668e00,
>     flag=0, zpp=0xffffff89092d46a0, acl_ids=0xffffff89092d4670)
>     at /usr/home/flo/dev/checkouts/svn-src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1012
> #19 0xffffffff80e46d6f in zfs_freebsd_create (ap=Variable "ap" is not available.
> )
>     at /usr/home/flo/dev/checkouts/svn-src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1657
> #20 0xffffffff8070cef1 in VOP_CREATE_APV (vop=0xffffffff80eb5fe0,
>     a=0xffffff89092d47f0)
>     at vnode_if.c:250
> #21 0xffffffff8056f569 in vn_open_cred (ndp=0xffffff89092d4880,
>     flagp=0xffffff89092d487c, cmode=Variable "cmode" is not available.
> )
>     at vnode_if.h:109
> #22 0xffffffff80569236 in kern_openat (td=0xfffffe0017bad900, fd=-100,
>     path=0x801c2b300, pathseg=Variable "pathseg" is not available.
> )
>     at /usr/home/flo/dev/checkouts/svn-src/sys/kern/vfs_syscalls.c:1134
> #23 0xffffffff806b8329 in amd64_syscall (td=0xfffffe0017bad900, traced=0)
>     at subr_syscall.c:135
> #24 0xffffffff806a2eb7 in Xfast_syscall ()
>     at /usr/home/flo/dev/checkouts/svn-src/sys/amd64/amd64/exception.S:387
> #25 0x00000008017702ec in ?? ()
> Previous frame inner to this frame (corrupt stack?)
>
> I have the vmcore and kernel symbols, so if someone wants to know more I
> should be able to provide further data.
>
> Florian
>

-- 
Andriy Gapon
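
For illustration only, the collision scenario described in the reply can be modelled in a few lines of userspace C. This is a rough sketch, not the ZFS code: the table size of 64 and the mask-based hash are assumptions about how an object id is mapped to a z_hold_mtx slot, and hold_table, obj_hash(), hold_enter(), hold_exit() and create_with_recycle() are hypothetical names invented here. The "lock" is just a flag, so a failed hold_enter() stands for the point where the kernel would panic on the non-recursive sx lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed table size; the real constant in ZFS may differ. */
#define HOLD_MTX_SZ 64

/* Hypothetical stand-in for the array of non-recursive hold locks. */
struct hold_table {
    bool held[HOLD_MTX_SZ];
};

/* Map an object id to a lock slot, assuming a power-of-two mask. */
static unsigned obj_hash(uint64_t obj)
{
    return (unsigned)(obj & (HOLD_MTX_SZ - 1));
}

/*
 * Take the slot lock for obj. Returns false if the slot is already held,
 * which is where a non-recursive sx lock would panic in the kernel.
 */
static bool hold_enter(struct hold_table *t, uint64_t obj)
{
    unsigned slot = obj_hash(obj);

    if (t->held[slot])
        return false;
    t->held[slot] = true;
    return true;
}

/* Release the slot lock for obj. */
static void hold_exit(struct hold_table *t, uint64_t obj)
{
    t->held[obj_hash(obj)] = false;
}

/*
 * Model of the path in the backtrace: the slot for new_obj is held while a
 * vnode is allocated; the allocation recycles a vnode whose inactive path
 * takes the slot for victim_obj. Returns false if that second acquisition
 * would recurse on an already-held slot.
 */
static bool create_with_recycle(struct hold_table *t, uint64_t new_obj,
    uint64_t victim_obj)
{
    bool ok;

    if (!hold_enter(t, new_obj))
        return false;

    /* getnewvnode() -> vnlru_free() -> ... -> zfs_zinactive(victim) */
    ok = hold_enter(t, victim_obj);
    if (ok)
        hold_exit(t, victim_obj);

    hold_exit(t, new_obj);
    return ok;
}
```

Under these assumptions, object ids 5 and 69 land in the same slot, so create_with_recycle(&t, 5, 69) reports a recursion while create_with_recycle(&t, 5, 6) does not. The victim does not need to be the same object, only one whose id hashes to the same slot.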