Date:      Wed, 7 Jul 2010 16:42:28 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Nathaniel W Filardo <nwf@cs.jhu.edu>
Cc:        alc@freebsd.org, freebsd-fs@freebsd.org
Subject:   Re: [sparc64] [ZFS] panic: mutex vnode interlock not owned
Message-ID:  <201007071642.28847.jhb@freebsd.org>
In-Reply-To: <20100703085516.GH21929@gradx.cs.jhu.edu>
References:  <20100609212747.GF21929@gradx.cs.jhu.edu> <AANLkTim71FSw51tyzFE6EVwnPCT_b4JnMAdLdF_IkSWT@mail.gmail.com> <20100703085516.GH21929@gradx.cs.jhu.edu>

On Saturday, July 03, 2010 4:55:16 am Nathaniel W Filardo wrote:
> (hello freebsd-fs@; I'm cc:ing you since the latest part of my story
> involves a ZFS-related panic and I hear you're the right place to go with
> those.  It began attempting to debug a VM locking panic and has moved a
> little...)
> 
> On Thu, Jun 10, 2010 at 12:23:24PM -0500, Alan Cox wrote:
> > On Thu, Jun 10, 2010 at 7:16 AM, John Baldwin <jhb@freebsd.org> wrote:
> > 
> > > On Wednesday 09 June 2010 5:27:47 pm Nathaniel W Filardo wrote:
> > > > Attempting to boot on (2-way SMP; SUN Fire V240) sparc64 a 9.0-CURRENT
> > > > kernel built on Jun 9 at 14:41, and fully csup'd before building (I don't
> > > > have the SVN revision number, sorry) yields, surprisingly late in the boot
> > > > process, this panic:
> > > >
> > > > panic: mutex vm object not owned at /systank/src/sys/vm/vm_object.c:1692
> > > > cpuid = 0
> > > > KDB: stack backtrace:
> > > > panic() at panic+0x1c8
> > > > _mtx_assert() at _mtx_assert+0xb0
> > > > vm_object_collapse() at vm_object_collapse+0x28
> > > > vm_object_deallocate() at vm_object_deallocate+0x538
> > > > _vm_map_unlock() at _vm_map_unlock+0x64
> > > > vm_map_remove() at vm_map_remove+0x64
> > > > vmspace_exit() at vmspace_exit+0x100
> > > > exit1() at exit1+0x788
> > > > sys_exit() at sys_exit+0x10
> > > > syscallenter() at syscallenter+0x268
> > > > syscall() at syscall+0x74
> > > > -- syscall (1, FreeBSD ELF64, sys_exit) %o7=0x11980c --
> > > > userland() at 0x406fe8c8
> > > > user trace: trap %o7=0x11980c
> > > > pc 0x406fe8c8, sp 0x7fdffff7611
> > > > done
> > > > Uptime: 4m7s
> > > >
> > > > The system was, at the time, attempting to bring up its jails.
> > > >
> > > > Anything else that would be helpful to know?
> > >
> > > Can you get a crashdump?  If so, it would be good to pull up gdb and check
> > > the values of 'object' and 'robject' in the vm_object_deallocate() frame.
> > >
> > >
> > That would be useful.  None of the locking changes of the last few weeks
> > have altered the vm object locking, so this assertion failure and stack
> > trace come as something of a surprise.
> > 
> > Alan
> 
> Well, I thought that no longer delegating ZFS (with "zfs jail") to the jail
> whose startup was causing the above panic might solve the problem and indeed
> the system made it slightly further.  A few minutes after reaching the
> login: prompt, though, it produced
> 
> panic: mutex vnode interlock not owned at /systank/src/sys/kern/kern_mutex.c:223
> cpuid = 0
> KDB: stack backtrace:
> panic() at panic+0x1c8
> _mtx_assert() at _mtx_assert+0xb0
> _mtx_unlock_flags() at _mtx_unlock_flags+0x144
> vnlru_free() at vnlru_free+0x500
> getnewvnode() at getnewvnode+0x7c
> zfs_znode_cache_constructor() at zfs_znode_cache_constructor+0x4c
> zfs_znode_alloc() at zfs_znode_alloc+0x34
> zfs_zget() at zfs_zget+0x2b8
> zfs_dirent_lock() at zfs_dirent_lock+0x508
> zfs_dirlook() at zfs_dirlook+0x50
> zfs_lookup() at zfs_lookup+0x1bc
> zfs_freebsd_lookup() at zfs_freebsd_lookup+0x6c
> VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x108
> vfs_cache_lookup() at vfs_cache_lookup+0xfc
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x110
> lookup() at lookup+0x7d0
> namei() at namei+0x69c
> kern_statat_vnhook() at kern_statat_vnhook+0x48
> kern_statat() at kern_statat+0x1c
> kern_lstat() at kern_lstat+0x18
> lstat() at lstat+0x14
> syscallenter() at syscallenter+0x27c
> syscall() at syscall+0x74
> -- syscall (190, FreeBSD ELF64, lstat) %o7=0x12b830 --
> ...
> 
> which at least is consistent with my hunch that the original panic had
> something to do with ZFS.  The system is as of svn 209653 (git c65b199...)
> with http://people.freebsd.org/~marius/sparc64_pin_ipis.diff applied.  The
> old kernel has uname
>   FreeBSD hydra.priv.oc.ietfng.org 9.0-CURRENT FreeBSD 9.0-CURRENT #20: Sun
>   Apr  4 20:31:58 EDT 2010
>   root@hydra.priv.oc.ietfng.org:/systank/obj/systank/src/sys/NWFKERN  sparc64
> which is probably too old to be of use to anybody, but just in case, there
> it is.  I don't suspect the machine of having bad hardware since this old
> kernel runs apparently fine on it and zpool scrubs haven't found anything
> yet.
> 
> I can't easily get a crash dump on the system (if somebody could tell me how
> to get one from a ddb(4) prompt, I could try that, but otherwise the system
> just ceases to do anything after panic; I have swap and dump set, so I'm not
> sure what's not happening there...).
> 
> Anything more I should do?

I really think you might have some sort of hardware issue as all of your 
reported panics have been weird "can't happen" cases.

-- 
John Baldwin


