Date: Mon, 4 Feb 2013 14:49:04 +0300
From: Sergey Kandaurov <pluknet@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: Konstantin Belousov <kostikbel@gmail.com>, FreeBSD Current <freebsd-current@freebsd.org>, Andriy Gapon <avg@freebsd.org>
Subject: Re: panic: LK_RETRY set with incompatible flags
Message-ID: <CAE-mSO%2BJMk=SuYr7=g6MdT_=44c7%2BB00FF2YBaiUkJJaTgVi3Q@mail.gmail.com>
In-Reply-To: <1515954355.2640466.1359940065810.JavaMail.root@erie.cs.uoguelph.ca>
References: <510E9877.5000701@FreeBSD.org> <1515954355.2640466.1359940065810.JavaMail.root@erie.cs.uoguelph.ca>

On 4 February 2013 05:07, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> Andriy Gapon wrote:
>> on 03/02/2013 18:36 Rick Macklem said the following:
>> > I can think of two possibilities:
>> > 1 - ZFS isn't setting VV_ROOT on the root vnode under some
>> > condition.
>> > or
>> > 2 - The vnode was left locked from some previous operation that
>> > happened
>> > to be done by this thread. Doesn't seem likely, but???
>> >
>> > Maybe Sergey could try the change to line#1451 and see if the panic
>> > still happens. If not, that would suggest possibility #1, I think.
>>
>> If the kernel is configured with witness, then it should be easy to
>> check where
>> the exclusive lock was taken (file and line number).
>>
> Yep. If Sergey can reproduce this using a kernel with witness,
> doing "show witness" to see where the lock on the directory vnode
> was acquired, could be helpful.

Hi, Rick!

Here is the requested info regarding witness, and a bit more.
The triggered KASSERT is now different though.
Full witness is at
http://people.freebsd.org/~pluknet/witness-zfs-20130204.txt

shared lock of (lockmgr) zfs @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1452
while exclusively locked from /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1747
panic: share->excl
cpuid = 2
KDB: enter: panic
[ thread pid 812 tid 100884 ]
Stopped at      kdb_enter+0x3e: movq    $0,kdb_why

The 1st line is at zfs_lookup():

        if (error == 0 && (nm[0] != '.' || nm[1] != '\0')) {
                int ltype = 0;

                if (cnp->cn_flags & ISDOTDOT) {
                        ltype = VOP_ISLOCKED(dvp);
                        VOP_UNLOCK(dvp, 0);
                }
                ZFS_EXIT(zfsvfs);
                error = zfs_vnode_lock(*vpp, cnp->cn_lkflags);
                if (cnp->cn_flags & ISDOTDOT)
==>                     vn_lock(dvp, ltype | LK_RETRY);
                if (error != 0) {
                        VN_RELE(*vpp);
                        *vpp = NULL;
                        return (error);
                }
        } else {
                ZFS_EXIT(zfsvfs);
        }

The 2nd line is at zfs_vnode_lock():

int
zfs_vnode_lock(vnode_t *vp, int flags)
{
        int error;

        ASSERT(vp != NULL);

        error = vn_lock(vp, flags);
        return (error);
}

db> show locks
exclusive lockmgr zfs (zfs) r = 0 (0xfffffe00a1b44240) locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1747

db> show alllocks
Process 812 (nfsd) thread 0xfffffe00a1198000 (100884)
exclusive lockmgr zfs (zfs) r = 0 (0xfffffe00a1b44240) locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1747
Process 750 (syslogd) thread 0xfffffe0015a4c480 (100706)
exclusive lockmgr ufs (ufs) r = 0 (0xfffffe00a1962d50) locked @ /usr/src/sys/kern/vfs_syscalls.c:3433
Process 12 (intr) thread 0xfffffe0006813000 (100033)
exclusive sleep mutex AAC I/O lock (AAC I/O lock) r = 0 (0xffffff8001bfb210) locked @ /usr/src/sys/dev/aac/aac.c:827

db> show lock 0xfffffe00a1b44240
 class: lockmgr
 name: zfs
 state: XLOCK: 0xfffffe00a1198000 (tid 100884, pid 812, "nfsd")
 waiters: none
 spinners: none

As KASSERT is different:

db> bt
Tracing pid 812 tid 100884 td 0xfffffe00a1198000
kdb_enter() at kdb_enter+0x3e/frame 0xffffff848e6bfd60
vpanic() at vpanic+0x147/frame 0xffffff848e6bfda0
kassert_panic() at kassert_panic+0x136/frame 0xffffff848e6bfe10
witness_checkorder() at witness_checkorder+0x289/frame 0xffffff848e6bfe90
__lockmgr_args() at __lockmgr_args+0x43e/frame 0xffffff848e6bffc0
vop_stdlock() at vop_stdlock+0x3c/frame 0xffffff848e6bffe0
VOP_LOCK1_APV() at VOP_LOCK1_APV+0xd0/frame 0xffffff848e6c0000
_vn_lock() at _vn_lock+0xab/frame 0xffffff848e6c0070
zfs_lookup() at zfs_lookup+0x392/frame 0xffffff848e6c0100
zfs_freebsd_lookup() at zfs_freebsd_lookup+0x6d/frame 0xffffff848e6c0240
VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0xc2/frame 0xffffff848e6c0260
vfs_cache_lookup() at vfs_cache_lookup+0xcf/frame 0xffffff848e6c02b0
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xc2/frame 0xffffff848e6c02d0
lookup() at lookup+0x548/frame 0xffffff848e6c0350
nfsvno_namei() at nfsvno_namei+0x1a5/frame 0xffffff848e6c0400
nfsrvd_lookup() at nfsrvd_lookup+0x13a/frame 0xffffff848e6c06b0
nfsrvd_dorpc() at nfsrvd_dorpc+0xca5/frame 0xffffff848e6c08a0
nfssvc_program() at nfssvc_program+0x482/frame 0xffffff848e6c0a00
svc_run_internal() at svc_run_internal+0x1e9/frame 0xffffff848e6c0ba0
svc_thread_start() at svc_thread_start+0xb/frame 0xffffff848e6c0bb0
fork_exit() at fork_exit+0x84/frame 0xffffff848e6c0bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xffffff848e6c0bf0
--- trap 0xc, rip = 0x800883b7a, rsp = 0x7fffffffd6c8, rbp = 0x7fffffffd970 ---

--
wbr,
pluknet