From owner-svn-src-all@FreeBSD.ORG Fri Dec 28 23:08:31 2012 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BE9CFD08; Fri, 28 Dec 2012 23:08:31 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) by mx1.freebsd.org (Postfix) with ESMTP id A1FB98FC0A; Fri, 28 Dec 2012 23:08:31 +0000 (UTC) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.5/8.14.5) with ESMTP id qBSN8VjO022991; Fri, 28 Dec 2012 23:08:31 GMT (envelope-from kib@svn.freebsd.org) Received: (from kib@localhost) by svn.freebsd.org (8.14.5/8.14.5/Submit) id qBSN8Vk5022988; Fri, 28 Dec 2012 23:08:31 GMT (envelope-from kib@svn.freebsd.org) Message-Id: <201212282308.qBSN8Vk5022988@svn.freebsd.org> From: Konstantin Belousov Date: Fri, 28 Dec 2012 23:08:31 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r244795 - in head/sys: kern sys ufs/ffs X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Dec 2012 23:08:31 -0000 Author: kib Date: Fri Dec 28 23:08:30 2012 New Revision: 244795 URL: http://svnweb.freebsd.org/changeset/base/244795 Log: Make it possible to atomically resume writes on the mount and account the write start, by adding a variation of the vfs_write_resume(9) which accepts flags. Use the new function to prevent a deadlock between parallel suspension and snapshotting a UFS mount. The ffs_snapshot() code performed vfs_write_resume() followed by vn_start_write() while owning the snaplock. If the suspension intervene between resume and vn_start_write(), the deadlock occured after the suspending thread tried to lock the snaplock, most typically during the write in the ffs_copyonwrite(). Reported and tested by: Andreas Longwitz Reviewed by: mckusick MFC after: 2 weeks X-MFC-note: make the vfs_write_resume(9) function a macro after the MFC, in HEAD Modified: head/sys/kern/vfs_vnops.c head/sys/sys/vnode.h head/sys/ufs/ffs/ffs_snapshot.c Modified: head/sys/kern/vfs_vnops.c ============================================================================== --- head/sys/kern/vfs_vnops.c Fri Dec 28 22:21:25 2012 (r244794) +++ head/sys/kern/vfs_vnops.c Fri Dec 28 23:08:30 2012 (r244795) @@ -1434,6 +1434,40 @@ vn_closefile(fp, td) * proceed. If a suspend request is in progress, we wait until the * suspension is over, and then proceed. */ +static int +vn_start_write_locked(struct mount *mp, int flags) +{ + int error; + + mtx_assert(MNT_MTX(mp), MA_OWNED); + error = 0; + + /* + * Check on status of suspension. + */ + if ((curthread->td_pflags & TDP_IGNSUSP) == 0 || + mp->mnt_susp_owner != curthread) { + while ((mp->mnt_kern_flag & MNTK_SUSPEND) != 0) { + if (flags & V_NOWAIT) { + error = EWOULDBLOCK; + goto unlock; + } + error = msleep(&mp->mnt_flag, MNT_MTX(mp), + (PUSER - 1) | (flags & PCATCH), "suspfs", 0); + if (error) + goto unlock; + } + } + if (flags & V_XSLEEP) + goto unlock; + mp->mnt_writeopcount++; +unlock: + if (error != 0 || (flags & V_XSLEEP) != 0) + MNT_REL(mp); + MNT_IUNLOCK(mp); + return (error); +} + int vn_start_write(vp, mpp, flags) struct vnode *vp; @@ -1470,30 +1504,7 @@ vn_start_write(vp, mpp, flags) if (vp == NULL) MNT_REF(mp); - /* - * Check on status of suspension. - */ - if ((curthread->td_pflags & TDP_IGNSUSP) == 0 || - mp->mnt_susp_owner != curthread) { - while ((mp->mnt_kern_flag & MNTK_SUSPEND) != 0) { - if (flags & V_NOWAIT) { - error = EWOULDBLOCK; - goto unlock; - } - error = msleep(&mp->mnt_flag, MNT_MTX(mp), - (PUSER - 1) | (flags & PCATCH), "suspfs", 0); - if (error) - goto unlock; - } - } - if (flags & V_XSLEEP) - goto unlock; - mp->mnt_writeopcount++; -unlock: - if (error != 0 || (flags & V_XSLEEP) != 0) - MNT_REL(mp); - MNT_IUNLOCK(mp); - return (error); + return (vn_start_write_locked(mp, flags)); } /* @@ -1639,8 +1650,7 @@ vfs_write_suspend(mp) * Request a filesystem to resume write operations. */ void -vfs_write_resume(mp) - struct mount *mp; +vfs_write_resume_flags(struct mount *mp, int flags) { MNT_ILOCK(mp); @@ -1652,10 +1662,25 @@ vfs_write_resume(mp) wakeup(&mp->mnt_writeopcount); wakeup(&mp->mnt_flag); curthread->td_pflags &= ~TDP_IGNSUSP; + if ((flags & VR_START_WRITE) != 0) { + MNT_REF(mp); + mp->mnt_writeopcount++; + } MNT_IUNLOCK(mp); VFS_SUSP_CLEAN(mp); - } else + } else if ((flags & VR_START_WRITE) != 0) { + MNT_REF(mp); + vn_start_write_locked(mp, 0); + } else { MNT_IUNLOCK(mp); + } +} + +void +vfs_write_resume(struct mount *mp) +{ + + vfs_write_resume_flags(mp, 0); } /* Modified: head/sys/sys/vnode.h ============================================================================== --- head/sys/sys/vnode.h Fri Dec 28 22:21:25 2012 (r244794) +++ head/sys/sys/vnode.h Fri Dec 28 23:08:30 2012 (r244795) @@ -392,6 +392,8 @@ extern int vttoif_tab[]; #define V_NOWAIT 0x0002 /* vn_start_write: don't sleep for suspend */ #define V_XSLEEP 0x0004 /* vn_start_write: just return after sleep */ +#define VR_START_WRITE 0x0001 /* vfs_write_resume: start write atomically */ + #define VREF(vp) vref(vp) #ifdef DIAGNOSTIC @@ -701,6 +703,7 @@ int vn_io_fault_uiomove(char *data, int int vfs_cache_lookup(struct vop_lookup_args *ap); void vfs_timestamp(struct timespec *); void vfs_write_resume(struct mount *mp); +void vfs_write_resume_flags(struct mount *mp, int flags); int vfs_write_suspend(struct mount *mp); int vop_stdbmap(struct vop_bmap_args *); int vop_stdfsync(struct vop_fsync_args *); Modified: head/sys/ufs/ffs/ffs_snapshot.c ============================================================================== --- head/sys/ufs/ffs/ffs_snapshot.c Fri Dec 28 22:21:25 2012 (r244794) +++ head/sys/ufs/ffs/ffs_snapshot.c Fri Dec 28 23:08:30 2012 (r244795) @@ -687,8 +687,7 @@ out1: /* * Resume operation on filesystem. */ - vfs_write_resume(vp->v_mount); - vn_start_write(NULL, &wrtmp, V_WAIT); + vfs_write_resume_flags(vp->v_mount, VR_START_WRITE); if (collectsnapstats && starttime.tv_sec > 0) { nanotime(&endtime); timespecsub(&endtime, &starttime);