From owner-cvs-all@FreeBSD.ORG Mon Jan 22 11:25:22 2007 Return-Path: X-Original-To: cvs-all@FreeBSD.org Delivered-To: cvs-all@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BF45216A404; Mon, 22 Jan 2007 11:25:22 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repoman.freebsd.org (repoman.freebsd.org [69.147.83.41]) by mx1.freebsd.org (Postfix) with ESMTP id AEEDA13C45B; Mon, 22 Jan 2007 11:25:22 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repoman.freebsd.org (localhost [127.0.0.1]) by repoman.freebsd.org (8.13.6/8.13.6) with ESMTP id l0MBPMMo041568; Mon, 22 Jan 2007 11:25:22 GMT (envelope-from kib@repoman.freebsd.org) Received: (from kib@localhost) by repoman.freebsd.org (8.13.6/8.13.4/Submit) id l0MBPMS4041567; Mon, 22 Jan 2007 11:25:22 GMT (envelope-from kib) Message-Id: <200701221125.l0MBPMS4041567@repoman.freebsd.org> From: Konstantin Belousov Date: Mon, 22 Jan 2007 11:25:22 +0000 (UTC) To: src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org X-FreeBSD-CVS-Branch: HEAD Cc: Subject: cvs commit: src/sys/fs/deadfs dead_vnops.c src/sys/kern vfs_lookup.c X-BeenThere: cvs-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: CVS commit messages for the entire tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Jan 2007 11:25:22 -0000 kib 2007-01-22 11:25:22 UTC FreeBSD src repository Modified files: sys/fs/deadfs dead_vnops.c sys/kern vfs_lookup.c Log: Below is slightly edited description of the LOR by Tor Egge: -------------------------- [Deadlock] is caused by a lock order reversal in vfs_lookup(), where [some] process is trying to lock a directory vnode, that is the parent directory of covered vnode) while holding an exclusive vnode lock on covering vnode. A simplified scenario: root fs var fs / A / (/var) D /var B /log (/var/log) E vfs lock C vfs lock F Within each file system, the lock order is clear: C->A->B and F->D->E When traversing across mounts, the system can choose between two lock orders, but everything must then follow that lock order: L1: C->A->B | +->F->D->E L2: F->D->E | +->C->A->B The lookup() process for namei("/var") mixes those two lock orders: VOP_LOOKUP() obtains B while A is held vfs_busy() obtains a shared lock on F while A and B are held (follows L1, violates L2) vput() releases lock on B VOP_UNLOCK() releases lock on A VFS_ROOT() obtains lock on D while shared lock on F is held vfs_unbusy() releases shared lock on F vn_lock() obtains lock on A while D is held (violates L1, follows L2) dounmount() follows L1 (B is locked while F is drained). Without unmount activity, vfs_busy() will always succeed without blocking and the deadlock isn't triggered (the system behaves as if L2 is followed). With unmount, you can get 4 processes in a deadlock: p1: holds D, want A (in lookup()) p2: holds shared lock on F, want D (in VFS_ROOT()) p3: holds B, want drain lock on F (in dounmount()) p4: holds A, want B (in VOP_LOOKUP()) You can have more than one instance of p2. The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode. - Tor Egge To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp is actually not used by the callers of namei. Thus, placeholder deadfs vnode vp_crossmp is introduced that is filled into ni_dvp. Idea by: ups Reviewed by: tegge, ups, jeff, rwatson (mac interaction) Tested by: Peter Holm MFC after: 2 weeks Revision Changes Path 1.50 +24 -1 src/sys/fs/deadfs/dead_vnops.c 1.97 +17 -4 src/sys/kern/vfs_lookup.c