Date: Tue, 23 Aug 2011 13:50:56 -0700 From: David P Discher <dpd@bitgravity.com> To: freebsd-fs@freebsd.org Subject: _sx_xlock_hard panic - is this zfs_zget panic deadlock ? (8.1-RELEASE) Message-ID: <C58DCBA1-30E0-488F-AAC8-44477C748241@bitgravity.com>
next in thread | raw e-mail | index | archive | help
Hey FreeBSD FS - I got a new one, well, new for us, possibly already = fixed. I got the following panic -=20 Fatal trap 9: general protection fault while in kernel mode cpuid =3D 13; apic id =3D 21 instruction pointer =3D 0x20:0xffffffff80514848 stack pointer =3D 0x28:0xffffff9be057f230 frame pointer =3D 0x28:0xffffff9be057f2b0 code segment =3D base 0x0, limit 0xfffff, type 0x1b =3D DPL 0, pres = 1, long 1, def32 0, gran 1 processor eflags =3D interrupt enabled, resume, IOPL =3D = 0 current process =3D 94019 (rsync) [thread pid 94019 tid 102556 ] Stopped at _sx_xlock_hard+0xd8: movl 0x290(%r12),%r8d More kgdb output below. It looks like the operation in kern_sx panic'ed = : #9 0xffffffff80514848 in _sx_xlock_hard (sx=3D0xffffff008040e0d8, = tid=3D18446742975725269984, opts=3DVariable "opts" is not available. ) at /usr/src/sys/kern/kern_sx.c:513 513 x =3D SX_OWNER(x); The calling chain looks like=20 zfs_zget -> dmu_bonus_hold -> dnode_hold_impl -> _sx_xlock -> = _sx_xlock_hard The best I can tell, is this looks like a race or deadlock issue, leaked = lock, etc, something along those lines. My searching has under covered a patch in = -head/-stable that may address this particular issue: = ------------------------------------------------------------------------ r209097 | mm | 2010-06-12 04:22:45 -0700 (Sat, 12 Jun 2010) | 8 = lines Changed paths: M = /head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c =09 Fix ZFS panic deadlock: cycle in blocking chain via zfs_zget =09 OpenSolaris onnv-revision: 9774:0bb234ab2287 =09 Obtained from: OpenSolaris (Bug ID 6788152) Approved by: pjd, delphij (mentor) MFC after: 3 days =09 = ------------------------------------------------------------------------ I'd like a little bit of confirmation or validation or otherwise if this = panic I'm looking at could in fact be fixed by r209097, or is this something = entirely different ? Thanks ! --- David P. Discher dpd@bitgravity.com * AIM: bgDavidDPD BITGRAVITY * http://www.bitgravity.com (kgdb) bt #0 doadump () at pcpu.h:223 #1 0xffffffff801f0e5c in db_fncall (dummy1=3DVariable "dummy1" is not = available. ) at /usr/src/sys/ddb/db_command.c:548 #2 0xffffffff801f1191 in db_command (last_cmdp=3D0xffffffff80b105e0, = cmd_table=3DVariable "cmd_table" is not available. ) at /usr/src/sys/ddb/db_command.c:445 #3 0xffffffff801f13e0 in db_command_loop () at = /usr/src/sys/ddb/db_command.c:498 #4 0xffffffff801f3339 in db_trap (type=3DVariable "type" is not = available. ) at /usr/src/sys/ddb/db_main.c:229 #5 0xffffffff8053aff5 in kdb_trap (type=3D9, code=3D0, = tf=3D0xffffff9be057f180) at /usr/src/sys/kern/subr_kdb.c:535 #6 0xffffffff8079519d in trap_fatal (frame=3D0xffffff9be057f180, = eva=3DVariable "eva" is not available. ) at /usr/src/sys/amd64/amd64/trap.c:772 #7 0xffffffff80795a9a in trap (frame=3D0xffffff9be057f180) at = /usr/src/sys/amd64/amd64/trap.c:588 #8 0xffffffff8077c827 in calltrap () at = /usr/src/sys/amd64/amd64/exception.S:223 #9 0xffffffff80514848 in _sx_xlock_hard (sx=3D0xffffff008040e0d8, = tid=3D18446742975725269984, opts=3DVariable "opts" is not available. ) at /usr/src/sys/kern/kern_sx.c:513 #10 0xffffffff80514c89 in _sx_xlock (sx=3D0xffffff008040e0d8, = opts=3DVariable "opts" is not available. ) at sx.h:148 #11 0xffffffff8104c82d in dnode_hold_impl (os=3D0xffffff002964f400, = object=3DVariable "object" is not available. ) at = /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/= dnode.c:607 #12 0xffffffff81042c7a in dmu_bonus_hold (os=3DVariable "os" is not = available. ) at = /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/= dmu.c:147 #13 0xffffffff81081e77 in zfs_zget (zfsvfs=3D0xffffff00295de000, = obj_num=3D10078547, zpp=3D0xffffff9be057f518) at = /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/= zfs_znode.c:869 #14 0xffffffff81092fb3 in zfs_dirent_lock (dlpp=3D0xffffff9be057f520, = dzp=3D0xffffff13942938d0, name=3D0xffffff9be057f5f0 "n", = zpp=3D0xffffff9be057f518,=20 flag=3DVariable "flag" is not available. ) at = /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/= zfs_dir.c:321 #15 0xffffffff81093209 in zfs_dirlook (dzp=3D0xffffff13942938d0, = name=3D0xffffff9be057f5f0 "n", vpp=3D0xffffff9be057f970, flags=3DVariable = "flags" is not available. ) at = /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/= zfs_dir.c:413 #16 0xffffffff810a2850 in zfs_lookup (dvp=3D0xffffff08daee6760, = nm=3D0xffffff9be057f5f0 "n", vpp=3D0xffffff9be057f970, = cnp=3D0xffffff9be057f998, nameiop=3D0,=20 cr=3D0xffffff028c6a5200, td=3D0xffffff005b0973e0, flags=3DVariable = "flags" is not available. ) at = /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/= zfs_vnops.c:1171 #17 0xffffffff810a3791 in zfs_freebsd_lookup (ap=3D0xffffff9be057f750) at = /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/= zfs_vnops.c:4059 #18 0xffffffff807e08f4 in VOP_CACHEDLOOKUP_APV (vop=3D0xffffffff81107b20, = a=3D0xffffff9be057f750) at vnode_if.c:187 #19 0xffffffff80580190 in vfs_cache_lookup (ap=3DVariable "ap" is not = available. ) at vnode_if.h:80 #20 0xffffffff807e33ec in VOP_LOOKUP_APV (vop=3D0xffffffff81107b20, = a=3D0xffffff9be057f810) at vnode_if.c:123 #21 0xffffffff80586864 in lookup (ndp=3D0xffffff9be057f940) at = vnode_if.h:54 #22 0xffffffff80587797 in namei (ndp=3D0xffffff9be057f940) at = /usr/src/sys/kern/vfs_lookup.c:269 #23 0xffffffff80594e02 in kern_statat_vnhook (td=3D0xffffff005b0973e0, = flag=3DVariable "flag" is not available. ) at /usr/src/sys/kern/vfs_syscalls.c:2346 #24 0xffffffff80595025 in kern_statat (td=3DVariable "td" is not = available. ) at /usr/src/sys/kern/vfs_syscalls.c:2327 #25 0xffffffff805950ea in lstat (td=3DVariable "td" is not available. ) at /usr/src/sys/kern/vfs_syscalls.c:2390 #26 0xffffffff8079564b in syscall (frame=3D0xffffff9be057fc80) at = /usr/src/sys/amd64/amd64/trap.c:945 #27 0xffffffff8077cb01 in Xfast_syscall () at = /usr/src/sys/amd64/amd64/exception.S:374 #28 0x00000008007810fc in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) up #1 0xffffffff801f0e5c in db_fncall (dummy1=3DVariable "dummy1" is not = available. ) at /usr/src/sys/ddb/db_command.c:548 548 *rv =3D (*f)(args[0], args[1], args[2], args[3], = args[4], args[5], (kgdb) up #2 0xffffffff801f1191 in db_command (last_cmdp=3D0xffffffff80b105e0, = cmd_table=3DVariable "cmd_table" is not available. ) at /usr/src/sys/ddb/db_command.c:445 445 (*cmd->fcn)(addr, have_addr, count, modif); (kgdb) up #3 0xffffffff801f13e0 in db_command_loop () at = /usr/src/sys/ddb/db_command.c:498 498 db_command(&db_last_command, &db_cmd_table, /* = dopager */ 1); (kgdb) up #4 0xffffffff801f3339 in db_trap (type=3DVariable "type" is not = available. ) at /usr/src/sys/ddb/db_main.c:229 229 db_command_loop(); (kgdb) up #5 0xffffffff8053aff5 in kdb_trap (type=3D9, code=3D0, = tf=3D0xffffff9be057f180) at /usr/src/sys/kern/subr_kdb.c:535 535 handled =3D kdb_dbbe->dbbe_trap(type, code); (kgdb) up #6 0xffffffff8079519d in trap_fatal (frame=3D0xffffff9be057f180, = eva=3DVariable "eva" is not available. ) at /usr/src/sys/amd64/amd64/trap.c:772 772 if (kdb_trap(type, 0, frame)) (kgdb) up #7 0xffffffff80795a9a in trap (frame=3D0xffffff9be057f180) at = /usr/src/sys/amd64/amd64/trap.c:588 588 trap_fatal(frame, 0); (kgdb) up #8 0xffffffff8077c827 in calltrap () at = /usr/src/sys/amd64/amd64/exception.S:223 223 call trap Current language: auto; currently asm (kgdb) up #9 0xffffffff80514848 in _sx_xlock_hard (sx=3D0xffffff008040e0d8, = tid=3D18446742975725269984, opts=3DVariable "opts" is not available. ) at /usr/src/sys/kern/kern_sx.c:513 513 x =3D SX_OWNER(x); Current language: auto; currently c (kgdb) p x $1 =3D 281466386776064 (kgdb) p sx $2 =3D (struct sx *) 0xffffff008040e0d8 (kgdb) p *sx $3 =3D {lock_object =3D {lo_name =3D 0x2901fc0044005e <Address = 0x2901fc0044005e out of bounds>, lo_flags =3D 2104063083, lo_data =3D = 1953056627,=20 lo_witness =3D 0x154407f4763bc41}, sx_lock =3D 281466386776064} (kgdb) list 508 * running or the state of the lock changes. 509 */ 510 x =3D sx->sx_lock; 511 if ((sx->lock_object.lo_flags & SX_NOADAPTIVE) = =3D=3D 0) { 512 if ((x & SX_LOCK_SHARED) =3D=3D 0) { 513 x =3D SX_OWNER(x); 514 owner =3D (struct thread *)x; 515 if (TD_IS_RUNNING(owner)) { 516 if = (LOCK_LOG_TEST(&sx->lock_object, 0)) 517 CTR3(KTR_LOCK, (kgdb) up=20 #10 0xffffffff80514c89 in _sx_xlock (sx=3D0xffffff008040e0d8, = opts=3DVariable "opts" is not available. ) at sx.h:148 148 error =3D _sx_xlock_hard(sx, tid, opts, file, = line); (kgdb) up #11 0xffffffff8104c82d in dnode_hold_impl (os=3D0xffffff002964f400, = object=3DVariable "object" is not available. ) at = /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/= dnode.c:607 607 mutex_enter(&dn->dn_mtx); (kgdb) list 602 dnode_destroy(dn); 603 dn =3D winner; 604 } 605 } 606 607 mutex_enter(&dn->dn_mtx); 608 type =3D dn->dn_type; 609 if (dn->dn_free_txg || 610 ((flag & DNODE_MUST_BE_ALLOCATED) && type =3D=3D = DMU_OT_NONE) || 611 ((flag & DNODE_MUST_BE_FREE) && type !=3D = DMU_OT_NONE)) { (kgdb) print dn $4 =3D (dnode_t *) 0xffffff008040e000 (kgdb) print *dn $5 =3D {dn_struct_rwlock =3D {lock_object =3D {lo_name =3D = 0xffffffff81100b7c "dn->dn_struct_rwlock", lo_flags =3D 40960000, = lo_data =3D 0, lo_witness =3D 0x0},=20 sx_lock =3D 1}, dn_link =3D {list_next =3D 0xffffff14dcffd630, = list_prev =3D 0xffffff14e5f64c40}, dn_objset =3D 0xffffff002964f400,=20 dn_object =3D 4611409328054323539, dn_dbuf =3D 0x10c837, dn_phys =3D = 0x3f, dn_type =3D 538976288, dn_bonuslen =3D 8224, dn_bonustype =3D 75 = 'K',=20 dn_nblkptr =3D 77 'M', dn_checksum =3D 51 '3', dn_compress =3D 48 '0', = dn_nlevels =3D 49 '1', dn_indblkshift =3D 51 '3', dn_datablkshift =3D 72 = 'H',=20 dn_datablkszsec =3D 18247, dn_datablksz =3D 1145130323, dn_maxblkid =3D = 4698733595085963320, dn_next_nblkptr =3D "08iH", dn_next_nlevels =3D = "athc",=20 dn_next_indblkshift =3D " iUH", dn_next_bonuslen =3D {16695, 12851, = 12339, 12353}, dn_next_blksz =3D {909397057, 538980384, 538976288, = 538976288},=20 dn_dirty_link =3D {{list_next =3D 0x2f00400080102020, list_prev =3D = 0x7020002004000}, {list_next =3D 0xfc10003f00103fff, list_prev =3D = 0xfffffff010000fb}, { list_next =3D 0x78000300070000, list_prev =3D 0x7800780078}, = {list_next =3D 0x0, list_prev =3D 0x6170e001f0000}}, dn_mtx =3D = {lock_object =3D { lo_name =3D 0x2901fc0044005e <Address 0x2901fc0044005e out of = bounds>, lo_flags =3D 2104063083, lo_data =3D 1953056627,=20 lo_witness =3D 0x154407f4763bc41}, sx_lock =3D 281466386776064}, = dn_dirty_records =3D {{list_size =3D 70088136784871424,=20 list_offset =3D 6724054219973732112, list_head =3D {list_next =3D = 0xca00000001, list_prev =3D 0xcca250005a874000}}, {list_size =3D = 2807702982,=20 list_offset =3D 0, list_head =3D {list_next =3D 0x401c409c0000, = list_prev =3D 0x0}}, {list_size =3D 3096229038784512,=20 list_offset =3D 1004619926008233984, list_head =3D {list_next =3D = 0x200440000001fa20, list_prev =3D 0x7000000000230}}, {list_size =3D = 2258422671148295,=20 list_offset =3D 74596396452349191, list_head =3D {list_next =3D = 0x4235413700060308, list_prev =3D 0x5db90000180a0000}}}, dn_ranges =3D = {{ avl_root =3D 0x8000ffff, avl_compar =3D 0, avl_offset =3D = 8589934592, avl_numnodes =3D 0, avl_size =3D 0}, {avl_root =3D 0x0, = avl_compar =3D 0,=20 avl_offset =3D 0, avl_numnodes =3D 0, avl_size =3D 0}, {avl_root =3D= 0x0, avl_compar =3D 0, avl_offset =3D 61, avl_numnodes =3D 0,=20 avl_size =3D 2026619832316723200}, {avl_root =3D 0x0, avl_compar =3D= 0x21101f, avl_offset =3D 0, avl_numnodes =3D 0, avl_size =3D = 65011713}},=20 dn_allocated_txg =3D 0, dn_free_txg =3D 0, dn_assigned_txg =3D 0, = dn_notxholds =3D {cv_description =3D 0x0, cv_waiters =3D 1503985664},=20 dn_dirtyctx =3D DN_UNDIRTIED, dn_dirtyctx_firstset =3D 0x0, = dn_tx_holds =3D {rc_count =3D 0}, dn_holds =3D {rc_count =3D 1}, = dn_dbufs_mtx =3D {lock_object =3D { lo_name =3D 0xffffffff81100b9e "dn->dn_dbufs_mtx", lo_flags =3D = 40960000, lo_data =3D 0, lo_witness =3D 0x0}, sx_lock =3D 1}, dn_dbufs =3D= { list_size =3D 224, list_offset =3D 176, list_head =3D {list_next =3D = 0xffffff07072075f0, list_prev =3D 0xffffff07072075f0}},=20 dn_bonus =3D 0xffffff0abe353700, dn_zio =3D 0x0, dn_zfetch =3D = {zf_rwlock =3D {lock_object =3D {lo_name =3D 0xffffffff81100ddf = "zf->zf_rwlock",=20 lo_flags =3D 40960000, lo_data =3D 0, lo_witness =3D 0x0}, = sx_lock =3D 1}, zf_stream =3D {list_size =3D 112, list_offset =3D 88, = list_head =3D { list_next =3D 0xffffff159f7c5dd8, list_prev =3D = 0xffffff159f7c5dd8}}, zf_dnode =3D 0xffffff008040e000, zf_stream_cnt =3D = 1, zf_alloc_fail =3D 1}} (kgdb)=20=
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?C58DCBA1-30E0-488F-AAC8-44477C748241>
