Date:      Thu, 31 Dec 2009 13:00:27 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        Marcel Moolenaar <xcllnt@mac.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: panic: mutex Giant owned at /tank/usr/src/sys/kern/kern_thread.c:357
Message-ID:  <200912311300.27475.jhb@freebsd.org>
In-Reply-To: <8B63E862-5A0D-42D6-80F4-FDA25CFDC837@mac.com>
References:  <4C83129A-00FE-4E93-8F65-BFAE4B6F6BC7@mac.com> <200912310849.07829.jhb@freebsd.org> <8B63E862-5A0D-42D6-80F4-FDA25CFDC837@mac.com>

On Thursday 31 December 2009 12:36:10 pm Marcel Moolenaar wrote:
> 
> On Dec 31, 2009, at 5:49 AM, John Baldwin wrote:
> 
> > On Wednesday 30 December 2009 4:55:44 pm Marcel Moolenaar wrote:
> >> All,
> >> 
> >> We still have a ZFS-triggerable panic. The conditions under which the panic
> >> happens are "simple":
> >> 
> >> 1.  Create a mount-point /dos, and mount a MS-DOS file system
> >>    there.
> >> 2.  Create directory /dos/zfs
> >> 3.  Make /boot/zfs a symlink to /dos/zfs
> >> 4.  Create or import a pool, like "zpool import tank"
> >> 
> >> ZFS will create/update the zpool cache (/boot/zfs/zpool.cache)
> >> and when done exits the zfskern/solthread thread, at which time
> >> the panic happens:
> >> 
> >> panic: mutex Giant owned at /tank/usr/src/sys/kern/kern_thread.c:357
> >> cpuid = 0
> >> KDB: enter: panic
> >> [thread pid 8 tid 100147 ]
> >> Stopped at      kdb_enter+0x92: [I2]    addl r14=0xffffffffffe1f3f0,gp ;;
> >> db> show alllocks
> >> Process 8 (zfskern) thread 0xe000000010df4a20 (100147)
> >> exclusive sleep mutex process lock (process lock) r = 0 (0xe000000010407660) locked @ /tank/usr/src/sys/kern/kern_kthread.c:326
> >> exclusive sleep mutex Giant (Giant) r = 1 (0xe0000000048f8da8) locked @ /tank/usr/src/sys/kern/vfs_lookup.c:755
> >> 
> >> It looks to me that this is a bug in vfs_lookup.c, but I'm not
> >> savvy enough to know this for sure or fix it fast myself. Help
> >> is welcome, because this particular bug hits ia64 hard: /boot
> >> is a symlink to /efi/boot, where /efi is a msdosfs mount point.
> > 
> > Can you get a stack trace?  The bug is probably that ZFS isn't properly 
> > honoring NDHASGIANT() someplace.  Hmm, it certainly doesn't honor it
> > in lookupnameat().  You could maybe have it unlock Giant there, but I
> > believe that will result in ZFS not acquiring Giant for any vnode
> > operations on a returned vnode from a !MPSAFE filesystem.
> 
> The backtrace is rather useless:
> 
> # zpool import tank
> panic: mutex Giant owned at /tank/usr/src/sys/kern/kern_thread.c:357
> cpuid = 1
> KDB: enter: panic
> [thread pid 8 tid 100105 ]
> Stopped at      kdb_enter+0x92: [I2]    addl r14=0xffffffffffe1fab8,gp ;;
> db> bt
> Tracing pid 8 tid 100105 td 0xe0000000109e1560
> kdb_enter(0xe0000000047984c0, 0xe0000000047984c0, 0xe00000000439bb70, 0x793) at kdb_enter+0x92
> panic(0xe000000004796058, 0xe000000004796728, 0xe000000004799b48, 0x165) at panic+0x2f0
> _mtx_assert(0xe000000004911828, 0x0, 0xe000000004799b48, 0x165) at _mtx_assert+0x200
> thread_exit(0xe000000004799b48, 0x0, 0xe000000004793480, 0xe0000000109e1560) at thread_exit+0x70
> kthread_exit(0xe000000004793480, 0xe000000010407568, 0xe000000004c7bb80, 0x58f) at kthread_exit+0xd0
> spa_async_thread(0xe000000010e59000, 0x1, 0xe000000004791aa8, 0x343) at spa_async_thread+0x1a0

I mostly cared to see what the main routine for this kthread was, but the
call down to namei() isn't obvious in this routine.

> I traced the locks (with a tweak to get Giant included) and it
> looks like vfs_lookup.c is fine: there are as many unlocks of
> Giant as there are locks.

Well, namei() will purposefully return with Giant held if it returns a locked
vnode from a !MPSAFE filesystem.  The caller is supposed to use NDHASGIANT()
to detect that case and later drop Giant after it unlocks the vnode.
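
The contract described above can be illustrated with a small user-space model. This is only a sketch, not kernel code: every model_* name is hypothetical, a plain counter stands in for Giant, and a boolean stands in for what NDHASGIANT() reports. It shows a caller that honors the flag next to one that ignores it.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only; every model_* name is hypothetical. */
int model_giant_depth;              /* stands in for Giant's hold count */

struct model_nameidata {
    bool has_giant;                 /* what NDHASGIANT() would report */
};

/* Models namei(): for a !MPSAFE filesystem it returns with Giant held. */
void model_namei(struct model_nameidata *nd, bool fs_mpsafe)
{
    nd->has_giant = false;
    if (!fs_mpsafe) {
        model_giant_depth++;
        nd->has_giant = true;
    }
}

/* Models VFS_UNLOCK_GIANT(): drops Giant only if the lookup took it. */
void model_unlock_giant(bool has_giant)
{
    if (has_giant) {
        assert(model_giant_depth > 0);
        model_giant_depth--;
    }
}

/* A caller that honors the flag; returns the Giant depth afterwards. */
int model_lookup_honoring(bool fs_mpsafe)
{
    struct model_nameidata nd;

    model_giant_depth = 0;
    model_namei(&nd, fs_mpsafe);
    /* ... use the vnode here, then unlock it ... */
    model_unlock_giant(nd.has_giant);
    return model_giant_depth;
}

/* A caller that ignores the flag; returns the Giant depth afterwards. */
int model_lookup_ignoring(bool fs_mpsafe)
{
    struct model_nameidata nd;

    model_giant_depth = 0;
    model_namei(&nd, fs_mpsafe);
    return model_giant_depth;
}
```

In this model, model_lookup_ignoring(false) leaves the depth at 1, which corresponds to the leaked Giant that the assertion in thread_exit() trips over.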

If you look at the ZFS lookupnameat() routine (which is probably where namei()
is getting invoked from this thread), it doesn't check NDHASGIANT() at all.
You could perhaps add a VFS_UNLOCK_GIANT(NDHASGIANT()) there to fix the leak.
However, I suspect that ZFS will not lock Giant when doing operations on the
returned vnode.
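
The trade-off in that suggestion can be sketched in user space as well. Again this is a model under stated assumptions, not the ZFS or VFS code: every sketch_* name is hypothetical, and the "safe" predicate simply encodes the rule that operations on a vnode from a !MPSAFE filesystem need Giant held. Dropping Giant inside the lookup lets the thread exit cleanly, but a later operation on the returned vnode then runs without Giant.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only; every sketch_* name is hypothetical. */
struct sketch_thread {
    int giant_depth;            /* how many times this thread holds Giant */
};

struct sketch_vnode {
    bool fs_mpsafe;             /* false for a filesystem such as msdosfs */
};

/* Models a lookupnameat() that drops Giant before returning the vnode. */
void sketch_lookup_and_drop(struct sketch_thread *td,
    struct sketch_vnode *vp, bool fs_mpsafe)
{
    vp->fs_mpsafe = fs_mpsafe;
    if (!fs_mpsafe) {
        td->giant_depth++;      /* namei() would return with Giant held */
        assert(td->giant_depth > 0);
        td->giant_depth--;      /* the added unlock of Giant */
    }
}

/* A vnode operation is safe if the fs is MPSAFE or the caller holds Giant. */
bool sketch_vop_is_safe(const struct sketch_thread *td,
    const struct sketch_vnode *vp)
{
    return vp->fs_mpsafe || td->giant_depth > 0;
}

/* Models the thread_exit() check: a thread must not exit holding Giant. */
bool sketch_can_exit(const struct sketch_thread *td)
{
    return td->giant_depth == 0;
}
```

After sketch_lookup_and_drop() on a !MPSAFE filesystem, sketch_can_exit() is true (the leak is gone) while sketch_vop_is_safe() is false (the remaining problem).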
 
> BTW: Xin LI gave me a patch with 2 missing unlocks of Giant in zfs_dir.c
> It seems ZFS is rather sloppy WRT Giant :-/

It doesn't handle Giant at all and doesn't safely interact with !MPSAFE
filesystems like msdosfs as a result.

-- 
John Baldwin


