Date: Wed, 7 Jul 2010 16:42:28 -0400
From: John Baldwin <jhb@freebsd.org>
To: Nathaniel W Filardo <nwf@cs.jhu.edu>
Cc: alc@freebsd.org, freebsd-fs@freebsd.org
Subject: Re: [sparc64] [ZFS] panic: mutex vnode interlock not owned
Message-ID: <201007071642.28847.jhb@freebsd.org>
In-Reply-To: <20100703085516.GH21929@gradx.cs.jhu.edu>
References: <20100609212747.GF21929@gradx.cs.jhu.edu> <AANLkTim71FSw51tyzFE6EVwnPCT_b4JnMAdLdF_IkSWT@mail.gmail.com> <20100703085516.GH21929@gradx.cs.jhu.edu>

On Saturday, July 03, 2010 4:55:16 am Nathaniel W Filardo wrote:
> (hello freebsd-fs@; I'm cc:ing you since the latest part of my story
> involves a ZFS-related panic and I hear you're the right place to go with
> those. It began attempting to debug a VM locking panic and has moved a
> little...)
>
> On Thu, Jun 10, 2010 at 12:23:24PM -0500, Alan Cox wrote:
> > On Thu, Jun 10, 2010 at 7:16 AM, John Baldwin <jhb@freebsd.org> wrote:
> >
> > > On Wednesday 09 June 2010 5:27:47 pm Nathaniel W Filardo wrote:
> > > > Attempting to boot on (2-way SMP; SUN Fire V240) sparc64 a 9.0-CURRENT
> > > > kernel built on Jun 9 at 14:41, and fully csup'd before building (I don't
> > > > have the SVN revision number, sorry) yields, surprisingly late in the boot
> > > > process, this panic:
> > > >
> > > > panic: mutex vm object not owned at /systank/src/sys/vm/vm_object.c:1692
> > > > cpuid = 0
> > > > KDB: stack backtrace:
> > > > panic() at panic+0x1c8
> > > > _mtx_assert() at _mtx_assert+0xb0
> > > > vm_object_collapse() at vm_object_collapse+0x28
> > > > vm_object_deallocate() at vm_object_deallocate+0x538
> > > > _vm_map_unlock() at _vm_map_unlock+0x64
> > > > vm_map_remove() at vm_map_remove+0x64
> > > > vmspace_exit() at vmspace_exit+0x100
> > > > exit1() at exit1+0x788
> > > > sys_exit() at sys_exit+0x10
> > > > syscallenter() at syscallenter+0x268
> > > > syscall() at syscall+0x74
> > > > -- syscall (1, FreeBSD ELF64, sys_exit) %o7=0x11980c --
> > > > userland() at 0x406fe8c8
> > > > user trace: trap %o7=0x11980c
> > > > pc 0x406fe8c8, sp 0x7fdffff7611
> > > > done
> > > > Uptime: 4m7s
> > > >
> > > > The system was, at the time, attempting to bring up its jails.
> > > >
> > > > Anything else that would be helpful to know?
> > >
> > > Can you get a crashdump? If so, it would be good to pull up gdb and check the
> > > values of 'object' and 'robject' in the vm_object_deallocate() frame.
> > >
> >
> > That would be useful. None of the locking changes of the last few weeks
> > have altered the vm object locking, so this assertion failure and stack
> > trace come as something of a surprise.
> >
> > Alan
>
> Well, I thought that no longer delegating ZFS (with "zfs jail") to the jail
> whose startup was causing the above panic might solve the problem and indeed
> the system made it slightly further. A few minutes after reaching the
> login: prompt, though, it produced
>
> panic: mutex vnode interlock not owned at /systank/src/sys/kern/kern_mutex.c:223
> cpuid = 0
> KDB: stack backtrace:
> panic() at panic+0x1c8
> _mtx_assert() at _mtx_assert+0xb0
> _mtx_unlock_flags() at _mtx_unlock_flags+0x144
> vnlru_free() at vnlru_free+0x500
> getnewvnode() at getnewvnode+0x7c
> zfs_znode_cache_constructor() at zfs_znode_cache_constructor+0x4c
> zfs_znode_alloc() at zfs_znode_alloc+0x34
> zfs_zget() at zfs_zget+0x2b8
> zfs_dirent_lock() at zfs_dirent_lock+0x508
> zfs_dirlook() at zfs_dirlook+0x50
> zfs_lookup() at zfs_lookup+0x1bc
> zfs_freebsd_lookup() at zfs_freebsd_lookup+0x6c
> VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x108
> vfs_cache_lookup() at vfs_cache_lookup+0xfc
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x110
> lookup() at lookup+0x7d0
> namei() at namei+0x69c
> kern_statat_vnhook() at kern_statat_vnhook+0x48
> kern_statat() at kern_statat+0x1c
> kern_lstat() at kern_lstat+0x18
> lstat() at lstat+0x14
> syscallenter() at syscallenter+0x27c
> syscall() at syscall+0x74
> -- syscall (190, FreeBSD ELF64, lstat) %o7=0x12b830 --
> ...
>
> which at least is consistent with my hunch that the original panic had
> something to do with ZFS. The system is as of svn 209653 (git c65b199...)
> with http://people.freebsd.org/~marius/sparc64_pin_ipis.diff applied. The
> old kernel has uname
> FreeBSD hydra.priv.oc.ietfng.org 9.0-CURRENT FreeBSD 9.0-CURRENT #20: Sun
> Apr 4 20:31:58 EDT 2010
> root@hydra.priv.oc.ietfng.org:/systank/obj/systank/src/sys/NWFKERN sparc64
> which is probably too old to be of use to anybody, but just in case, there
> it is. I don't suspect the machine of having bad hardware since this old
> kernel runs apparently fine on it and zpool scrubs haven't found anything
> yet.
>
> I can't easily get a crash dump on the system (if somebody could tell me how
> to get one from a ddb(4) prompt, I could try that, but otherwise the system
> just ceases to do anything after panic; I have swap and dump set, so I'm not
> sure what's not happening there...).
>
> Anything more I should do?

I really think you might have some sort of hardware issue as all of your
reported panics have been weird "can't happen" cases.

-- 
John Baldwin