Date: Tue, 18 Feb 2014 14:18:15 +0100
From: Jeremie Le Hen
To: Andriy Gapon
Cc: freebsd-current@FreeBSD.org
Subject: Re: panic: LK_RETRY set with incompatible flags (0x200400) or an error occured (11)
Message-ID: <20140218131815.GF3783@caravan.chchile.org>
In-Reply-To: <52FF59B8.1080206@FreeBSD.org>

On Sat, Feb 15, 2014 at 02:12:40PM +0200, Andriy Gapon wrote:
> on 14/02/2014 21:18 Jeremie Le Hen said the following:
> > I've just got another occurrence of the exact same panic.  Any clue how
> > to debug this?
>
> Could you please obtain *vp from frame 12 ?

Sure:

$1 = {v_tag = 0xffffffff815019a5 "zfs", v_op = 0xffffffff815164a0,
  v_data = 0xfffff80010dcb2e0, v_mount = 0xfffff80010dcd660,
  v_nmntvnodes = {tqe_next = 0xfffff80010dc7ce8, tqe_prev = 0xfffff80010dcd6c0},
  v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0},
  v_hashlist = {le_next = 0x0, le_prev = 0x0},
  v_cache_src = {lh_first = 0xfffff8005aeefcb0},
  v_cache_dst = {tqh_first = 0x0, tqh_last = 0xfffff80010dc8050},
  v_cache_dd = 0x0,
  v_lock = {lock_object = {lo_name = 0xffffffff815019a5 "zfs",
      lo_flags = 117112832, lo_data = 0, lo_witness = 0x0},
    lk_lock = 18446735277920538624, lk_exslpfail = 0, lk_timo = 51,
    lk_pri = 96},
  v_interlock = {lock_object = {lo_name = 0xffffffff80b46085 "vnode interlock",
      lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4},
  v_vnlock = 0xfffff80010dc8068,
  v_actfreelist = {tqe_next = 0x0, tqe_prev = 0xfffff80010dc7da8},
  v_bufobj = {bo_lock = {lock_object = {
        lo_name = 0xffffffff80b4e613 "bufobj interlock", lo_flags = 86179840,
        lo_data = 0, lo_witness = 0x0}, rw_lock = 1},
    bo_ops = 0xffffffff80e2d440, bo_object = 0xfffff800a30bbd00,
    bo_synclist = {le_next = 0x0, le_prev = 0x0},
    bo_private = 0xfffff80010dc8000, __bo_vnode = 0xfffff80010dc8000,
    bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff80010dc8120},
      bv_root = {pt_root = 0}, bv_cnt = 0},
    bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff80010dc8140},
      bv_root = {pt_root = 0}, bv_cnt = 0},
    bo_numoutput = 0, bo_flag = 0, bo_bsize = 131072},
  v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0,
  v_rl = {rl_waiters = {tqh_first = 0x0, tqh_last = 0xfffff80010dc8188},
    rl_currdep = 0x0},
  v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 7,
  v_usecount = 6, v_iflag = 512, v_vflag = 1, v_writecount = 0, v_hash = 3,
  v_type = VDIR}

> The problem seems to be happening in this piece of ZFS code:
>	if (cnp->cn_flags & ISDOTDOT) {
>		ltype = VOP_ISLOCKED(dvp);
>		VOP_UNLOCK(dvp, 0);
>	}
>	ZFS_EXIT(zfsvfs);
>	error = vn_lock(*vpp, cnp->cn_lkflags);
>	if (cnp->cn_flags & ISDOTDOT)
>		vn_lock(dvp, ltype | LK_RETRY);
>
> ltype is apparently LK_SHARED and the assertion is apparently triggered by
> EDEADLK error.  The error can occur only if a thread tries to obtain a lock in a
> shared mode when it already has the lock exclusively.
> My only explanation of how this could happen is that dvp == *vpp and cn_lkflags
> is LK_EXCLUSIVE.  In other words, this is a dot-dot lookup that results in the
> same vnode.  I think that this is only possible if dvp is the root vnode.
> I am not sure if my theory is correct though.
> Also, I am not sure if zfs_lookup() should be prepared to handle such a lookup
> or if this kind of lookup should be handled by upper/other layers.  In this case
> these would be VFS lookup code and nullfs code.

-- 
Jeremie Le Hen

Scientists say the world is made up of Protons, Neutrons and Electrons.
They forgot to mention Morons.
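
The theory quoted above (dvp == *vpp with an exclusive lookup request) can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: toy_vnode, toy_lock(), toy_unlock() and toy_dotdot_relock() are hypothetical names, not FreeBSD APIs. The model tracks a single thread's hold on each lock and ignores recursion, LK_RETRY and contention. It assumes, as the quoted reasoning states, that a shared request from the thread already holding the lock exclusively fails with EDEADLK.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy lock states; these are NOT the real lockmgr(9) flag values. */
enum toy_lktype { TOY_UNLOCKED = 0, TOY_SHARED, TOY_EXCLUSIVE };

/* Hypothetical stand-in for a vnode: only records how the one
 * modeled thread currently holds its lock. */
struct toy_vnode {
    enum toy_lktype held;
};

/* Assumed rule from the thread: asking for a shared lock while the
 * same thread holds it exclusively yields EDEADLK. Everything else
 * simply succeeds in this single-thread model. */
int toy_lock(struct toy_vnode *vp, enum toy_lktype want)
{
    if (vp->held == TOY_EXCLUSIVE && want == TOY_SHARED)
        return EDEADLK;
    vp->held = want;
    return 0;
}

void toy_unlock(struct toy_vnode *vp)
{
    vp->held = TOY_UNLOCKED;
}

/* Follows the order of operations in the quoted snippet for the
 * ISDOTDOT case. Unlike the kernel code, which ignores the return
 * value of the second vn_lock() and relies on its assertion, this
 * toy reports the relock error to the caller. */
int toy_dotdot_relock(struct toy_vnode *dvp, struct toy_vnode *vp,
                      enum toy_lktype cn_lkflags)
{
    assert(dvp != NULL && vp != NULL);

    enum toy_lktype ltype = dvp->held;    /* ltype = VOP_ISLOCKED(dvp) */
    toy_unlock(dvp);                      /* VOP_UNLOCK(dvp, 0) */
    int error = toy_lock(vp, cn_lkflags); /* vn_lock(*vpp, cn_lkflags) */
    int relock = toy_lock(dvp, ltype);    /* vn_lock(dvp, ltype | LK_RETRY) */

    return relock != 0 ? relock : error;
}
```

When dvp and vp point at the same toy vnode, held shared, and the lookup asks for an exclusive lock, toy_dotdot_relock() returns EDEADLK. With two distinct vnodes it returns 0, leaving dvp shared and vp exclusive.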