Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 23 Aug 2011 13:50:56 -0700
From:      David P Discher <dpd@bitgravity.com>
To:        freebsd-fs@freebsd.org
Subject:   _sx_xlock_hard panic - is this zfs_zget panic deadlock ? (8.1-RELEASE)
Message-ID:  <C58DCBA1-30E0-488F-AAC8-44477C748241@bitgravity.com>

next in thread | raw e-mail | index | archive | help

Hey FreeBSD FS - I got a new one, well, new for us, possibly already =
fixed.

I got the following panic -=20

	Fatal trap 9: general protection fault while in kernel mode
	cpuid =3D 13; apic id =3D 21
	instruction pointer     =3D 0x20:0xffffffff80514848
	stack pointer           =3D 0x28:0xffffff9be057f230
	frame pointer           =3D 0x28:0xffffff9be057f2b0
	code segment            =3D base 0x0, limit 0xfffff, type 0x1b
							=3D DPL 0, pres =
1, long 1, def32 0, gran 1
	processor eflags        =3D interrupt enabled, resume, IOPL =3D =
0
	current process         =3D 94019 (rsync)
	[thread pid 94019 tid 102556 ]
	Stopped at      _sx_xlock_hard+0xd8:    movl    0x290(%r12),%r8d

More kgdb output below.  It looks like the operation in kern_sx panic'ed =
:

#9  0xffffffff80514848 in _sx_xlock_hard (sx=3D0xffffff008040e0d8, =
tid=3D18446742975725269984, opts=3DVariable "opts" is not available.
) at /usr/src/sys/kern/kern_sx.c:513
513                                     x =3D SX_OWNER(x);


The calling chain looks like=20

  zfs_zget -> dmu_bonus_hold -> dnode_hold_impl -> _sx_xlock -> =
_sx_xlock_hard

The best I can tell, is this looks like a race or deadlock issue, leaked =
lock, etc,
something along those lines. My searching has under covered a patch in =
-head/-stable
that may address this particular issue:

	=
------------------------------------------------------------------------
	r209097 | mm | 2010-06-12 04:22:45 -0700 (Sat, 12 Jun 2010) | 8 =
lines
	Changed paths:
	   M =
/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
=09
	Fix ZFS panic deadlock: cycle in blocking chain via zfs_zget
=09
	OpenSolaris onnv-revision:      9774:0bb234ab2287
=09
	Obtained from:  OpenSolaris (Bug ID 6788152)
	Approved by:    pjd, delphij (mentor)
	MFC after:      3 days
=09
	=
------------------------------------------------------------------------


I'd like a little bit of confirmation or validation or otherwise if this =
panic I'm
looking at could in fact be fixed by r209097, or is this something =
entirely different ?

Thanks !

---
David P. Discher
dpd@bitgravity.com * AIM: bgDavidDPD
BITGRAVITY * http://www.bitgravity.com


(kgdb) bt
#0  doadump () at pcpu.h:223
#1  0xffffffff801f0e5c in db_fncall (dummy1=3DVariable "dummy1" is not =
available.
) at /usr/src/sys/ddb/db_command.c:548
#2  0xffffffff801f1191 in db_command (last_cmdp=3D0xffffffff80b105e0, =
cmd_table=3DVariable "cmd_table" is not available.
) at /usr/src/sys/ddb/db_command.c:445
#3  0xffffffff801f13e0 in db_command_loop () at =
/usr/src/sys/ddb/db_command.c:498
#4  0xffffffff801f3339 in db_trap (type=3DVariable "type" is not =
available.
) at /usr/src/sys/ddb/db_main.c:229
#5  0xffffffff8053aff5 in kdb_trap (type=3D9, code=3D0, =
tf=3D0xffffff9be057f180) at /usr/src/sys/kern/subr_kdb.c:535
#6  0xffffffff8079519d in trap_fatal (frame=3D0xffffff9be057f180, =
eva=3DVariable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:772
#7  0xffffffff80795a9a in trap (frame=3D0xffffff9be057f180) at =
/usr/src/sys/amd64/amd64/trap.c:588
#8  0xffffffff8077c827 in calltrap () at =
/usr/src/sys/amd64/amd64/exception.S:223
#9  0xffffffff80514848 in _sx_xlock_hard (sx=3D0xffffff008040e0d8, =
tid=3D18446742975725269984, opts=3DVariable "opts" is not available.
) at /usr/src/sys/kern/kern_sx.c:513
#10 0xffffffff80514c89 in _sx_xlock (sx=3D0xffffff008040e0d8, =
opts=3DVariable "opts" is not available.
) at sx.h:148
#11 0xffffffff8104c82d in dnode_hold_impl (os=3D0xffffff002964f400, =
object=3DVariable "object" is not available.
)
    at =
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/=
dnode.c:607
#12 0xffffffff81042c7a in dmu_bonus_hold (os=3DVariable "os" is not =
available.
) at =
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/=
dmu.c:147
#13 0xffffffff81081e77 in zfs_zget (zfsvfs=3D0xffffff00295de000, =
obj_num=3D10078547, zpp=3D0xffffff9be057f518)
    at =
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/=
zfs_znode.c:869
#14 0xffffffff81092fb3 in zfs_dirent_lock (dlpp=3D0xffffff9be057f520, =
dzp=3D0xffffff13942938d0, name=3D0xffffff9be057f5f0 "n", =
zpp=3D0xffffff9be057f518,=20
    flag=3DVariable "flag" is not available.
) at =
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/=
zfs_dir.c:321
#15 0xffffffff81093209 in zfs_dirlook (dzp=3D0xffffff13942938d0, =
name=3D0xffffff9be057f5f0 "n", vpp=3D0xffffff9be057f970, flags=3DVariable =
"flags" is not available.
)
    at =
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/=
zfs_dir.c:413
#16 0xffffffff810a2850 in zfs_lookup (dvp=3D0xffffff08daee6760, =
nm=3D0xffffff9be057f5f0 "n", vpp=3D0xffffff9be057f970, =
cnp=3D0xffffff9be057f998, nameiop=3D0,=20
    cr=3D0xffffff028c6a5200, td=3D0xffffff005b0973e0, flags=3DVariable =
"flags" is not available.
)
    at =
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/=
zfs_vnops.c:1171
#17 0xffffffff810a3791 in zfs_freebsd_lookup (ap=3D0xffffff9be057f750)
    at =
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/=
zfs_vnops.c:4059
#18 0xffffffff807e08f4 in VOP_CACHEDLOOKUP_APV (vop=3D0xffffffff81107b20, =
a=3D0xffffff9be057f750) at vnode_if.c:187
#19 0xffffffff80580190 in vfs_cache_lookup (ap=3DVariable "ap" is not =
available.
) at vnode_if.h:80
#20 0xffffffff807e33ec in VOP_LOOKUP_APV (vop=3D0xffffffff81107b20, =
a=3D0xffffff9be057f810) at vnode_if.c:123
#21 0xffffffff80586864 in lookup (ndp=3D0xffffff9be057f940) at =
vnode_if.h:54
#22 0xffffffff80587797 in namei (ndp=3D0xffffff9be057f940) at =
/usr/src/sys/kern/vfs_lookup.c:269
#23 0xffffffff80594e02 in kern_statat_vnhook (td=3D0xffffff005b0973e0, =
flag=3DVariable "flag" is not available.
) at /usr/src/sys/kern/vfs_syscalls.c:2346
#24 0xffffffff80595025 in kern_statat (td=3DVariable "td" is not =
available.
) at /usr/src/sys/kern/vfs_syscalls.c:2327
#25 0xffffffff805950ea in lstat (td=3DVariable "td" is not available.
) at /usr/src/sys/kern/vfs_syscalls.c:2390
#26 0xffffffff8079564b in syscall (frame=3D0xffffff9be057fc80) at =
/usr/src/sys/amd64/amd64/trap.c:945
#27 0xffffffff8077cb01 in Xfast_syscall () at =
/usr/src/sys/amd64/amd64/exception.S:374
#28 0x00000008007810fc in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) up
#1  0xffffffff801f0e5c in db_fncall (dummy1=3DVariable "dummy1" is not =
available.
) at /usr/src/sys/ddb/db_command.c:548
548             *rv =3D (*f)(args[0], args[1], args[2], args[3], =
args[4], args[5],
(kgdb) up
#2  0xffffffff801f1191 in db_command (last_cmdp=3D0xffffffff80b105e0, =
cmd_table=3DVariable "cmd_table" is not available.
) at /usr/src/sys/ddb/db_command.c:445
445                 (*cmd->fcn)(addr, have_addr, count, modif);
(kgdb) up
#3  0xffffffff801f13e0 in db_command_loop () at =
/usr/src/sys/ddb/db_command.c:498
498                 db_command(&db_last_command, &db_cmd_table, /* =
dopager */ 1);
(kgdb) up
#4  0xffffffff801f3339 in db_trap (type=3DVariable "type" is not =
available.
) at /usr/src/sys/ddb/db_main.c:229
229                     db_command_loop();
(kgdb) up
#5  0xffffffff8053aff5 in kdb_trap (type=3D9, code=3D0, =
tf=3D0xffffff9be057f180) at /usr/src/sys/kern/subr_kdb.c:535
535             handled =3D kdb_dbbe->dbbe_trap(type, code);
(kgdb) up
#6  0xffffffff8079519d in trap_fatal (frame=3D0xffffff9be057f180, =
eva=3DVariable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:772
772                     if (kdb_trap(type, 0, frame))
(kgdb) up
#7  0xffffffff80795a9a in trap (frame=3D0xffffff9be057f180) at =
/usr/src/sys/amd64/amd64/trap.c:588
588                     trap_fatal(frame, 0);
(kgdb) up
#8  0xffffffff8077c827 in calltrap () at =
/usr/src/sys/amd64/amd64/exception.S:223
223             call    trap
Current language:  auto; currently asm
(kgdb) up
#9  0xffffffff80514848 in _sx_xlock_hard (sx=3D0xffffff008040e0d8, =
tid=3D18446742975725269984, opts=3DVariable "opts" is not available.
) at /usr/src/sys/kern/kern_sx.c:513
513                                     x =3D SX_OWNER(x);
Current language:  auto; currently c
(kgdb) p x
$1 =3D 281466386776064
(kgdb) p sx
$2 =3D (struct sx *) 0xffffff008040e0d8
(kgdb) p *sx
$3 =3D {lock_object =3D {lo_name =3D 0x2901fc0044005e <Address =
0x2901fc0044005e out of bounds>, lo_flags =3D 2104063083, lo_data =3D =
1953056627,=20
    lo_witness =3D 0x154407f4763bc41}, sx_lock =3D 281466386776064}
(kgdb) list
508                      * running or the state of the lock changes.
509                      */
510                     x =3D sx->sx_lock;
511                     if ((sx->lock_object.lo_flags & SX_NOADAPTIVE) =
=3D=3D 0) {
512                             if ((x & SX_LOCK_SHARED) =3D=3D 0) {
513                                     x =3D SX_OWNER(x);
514                                     owner =3D (struct thread *)x;
515                                     if (TD_IS_RUNNING(owner)) {
516                                             if =
(LOCK_LOG_TEST(&sx->lock_object, 0))
517                                                     CTR3(KTR_LOCK,
(kgdb) up=20
#10 0xffffffff80514c89 in _sx_xlock (sx=3D0xffffff008040e0d8, =
opts=3DVariable "opts" is not available.
) at sx.h:148
148                     error =3D _sx_xlock_hard(sx, tid, opts, file, =
line);
(kgdb) up
#11 0xffffffff8104c82d in dnode_hold_impl (os=3D0xffffff002964f400, =
object=3DVariable "object" is not available.
)
    at =
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/=
dnode.c:607
607             mutex_enter(&dn->dn_mtx);
(kgdb) list
602                             dnode_destroy(dn);
603                             dn =3D winner;
604                     }
605             }
606
607             mutex_enter(&dn->dn_mtx);
608             type =3D dn->dn_type;
609             if (dn->dn_free_txg ||
610                 ((flag & DNODE_MUST_BE_ALLOCATED) && type =3D=3D =
DMU_OT_NONE) ||
611                 ((flag & DNODE_MUST_BE_FREE) && type !=3D =
DMU_OT_NONE)) {
(kgdb) print dn
$4 =3D (dnode_t *) 0xffffff008040e000
(kgdb) print *dn
$5 =3D {dn_struct_rwlock =3D {lock_object =3D {lo_name =3D =
0xffffffff81100b7c "dn->dn_struct_rwlock", lo_flags =3D 40960000, =
lo_data =3D 0, lo_witness =3D 0x0},=20
    sx_lock =3D 1}, dn_link =3D {list_next =3D 0xffffff14dcffd630, =
list_prev =3D 0xffffff14e5f64c40}, dn_objset =3D 0xffffff002964f400,=20
  dn_object =3D 4611409328054323539, dn_dbuf =3D 0x10c837, dn_phys =3D =
0x3f, dn_type =3D 538976288, dn_bonuslen =3D 8224, dn_bonustype =3D 75 =
'K',=20
  dn_nblkptr =3D 77 'M', dn_checksum =3D 51 '3', dn_compress =3D 48 '0', =
dn_nlevels =3D 49 '1', dn_indblkshift =3D 51 '3', dn_datablkshift =3D 72 =
'H',=20
  dn_datablkszsec =3D 18247, dn_datablksz =3D 1145130323, dn_maxblkid =3D =
4698733595085963320, dn_next_nblkptr =3D "08iH", dn_next_nlevels =3D =
"athc",=20
  dn_next_indblkshift =3D " iUH", dn_next_bonuslen =3D {16695, 12851, =
12339, 12353}, dn_next_blksz =3D {909397057, 538980384, 538976288, =
538976288},=20
  dn_dirty_link =3D {{list_next =3D 0x2f00400080102020, list_prev =3D =
0x7020002004000}, {list_next =3D 0xfc10003f00103fff, list_prev =3D =
0xfffffff010000fb}, {
      list_next =3D 0x78000300070000, list_prev =3D 0x7800780078}, =
{list_next =3D 0x0, list_prev =3D 0x6170e001f0000}}, dn_mtx =3D =
{lock_object =3D {
      lo_name =3D 0x2901fc0044005e <Address 0x2901fc0044005e out of =
bounds>, lo_flags =3D 2104063083, lo_data =3D 1953056627,=20
      lo_witness =3D 0x154407f4763bc41}, sx_lock =3D 281466386776064}, =
dn_dirty_records =3D {{list_size =3D 70088136784871424,=20
      list_offset =3D 6724054219973732112, list_head =3D {list_next =3D =
0xca00000001, list_prev =3D 0xcca250005a874000}}, {list_size =3D =
2807702982,=20
      list_offset =3D 0, list_head =3D {list_next =3D 0x401c409c0000, =
list_prev =3D 0x0}}, {list_size =3D 3096229038784512,=20
      list_offset =3D 1004619926008233984, list_head =3D {list_next =3D =
0x200440000001fa20, list_prev =3D 0x7000000000230}}, {list_size =3D =
2258422671148295,=20
      list_offset =3D 74596396452349191, list_head =3D {list_next =3D =
0x4235413700060308, list_prev =3D 0x5db90000180a0000}}}, dn_ranges =3D =
{{
      avl_root =3D 0x8000ffff, avl_compar =3D 0, avl_offset =3D =
8589934592, avl_numnodes =3D 0, avl_size =3D 0}, {avl_root =3D 0x0, =
avl_compar =3D 0,=20
      avl_offset =3D 0, avl_numnodes =3D 0, avl_size =3D 0}, {avl_root =3D=
 0x0, avl_compar =3D 0, avl_offset =3D 61, avl_numnodes =3D 0,=20
      avl_size =3D 2026619832316723200}, {avl_root =3D 0x0, avl_compar =3D=
 0x21101f, avl_offset =3D 0, avl_numnodes =3D 0, avl_size =3D =
65011713}},=20
  dn_allocated_txg =3D 0, dn_free_txg =3D 0, dn_assigned_txg =3D 0, =
dn_notxholds =3D {cv_description =3D 0x0, cv_waiters =3D 1503985664},=20
  dn_dirtyctx =3D DN_UNDIRTIED, dn_dirtyctx_firstset =3D 0x0, =
dn_tx_holds =3D {rc_count =3D 0}, dn_holds =3D {rc_count =3D 1}, =
dn_dbufs_mtx =3D {lock_object =3D {
      lo_name =3D 0xffffffff81100b9e "dn->dn_dbufs_mtx", lo_flags =3D =
40960000, lo_data =3D 0, lo_witness =3D 0x0}, sx_lock =3D 1}, dn_dbufs =3D=
 {
    list_size =3D 224, list_offset =3D 176, list_head =3D {list_next =3D =
0xffffff07072075f0, list_prev =3D 0xffffff07072075f0}},=20
  dn_bonus =3D 0xffffff0abe353700, dn_zio =3D 0x0, dn_zfetch =3D =
{zf_rwlock =3D {lock_object =3D {lo_name =3D 0xffffffff81100ddf =
"zf->zf_rwlock",=20
        lo_flags =3D 40960000, lo_data =3D 0, lo_witness =3D 0x0}, =
sx_lock =3D 1}, zf_stream =3D {list_size =3D 112, list_offset =3D 88, =
list_head =3D {
        list_next =3D 0xffffff159f7c5dd8, list_prev =3D =
0xffffff159f7c5dd8}}, zf_dnode =3D 0xffffff008040e000, zf_stream_cnt =3D =
1, zf_alloc_fail =3D 1}}
(kgdb)=20=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?C58DCBA1-30E0-488F-AAC8-44477C748241>