From owner-svn-src-head@FreeBSD.ORG Fri May 31 00:43:45 2013 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B3906F13; Fri, 31 May 2013 00:43:45 +0000 (UTC) (envelope-from jeff@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) by mx1.freebsd.org (Postfix) with ESMTP id A2EA78DE; Fri, 31 May 2013 00:43:45 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.6/8.14.6) with ESMTP id r4V0hj0Y033056; Fri, 31 May 2013 00:43:45 GMT (envelope-from jeff@svn.freebsd.org) Received: (from jeff@localhost) by svn.freebsd.org (8.14.6/8.14.5/Submit) id r4V0hgZn033028; Fri, 31 May 2013 00:43:42 GMT (envelope-from jeff@svn.freebsd.org) Message-Id: <201305310043.r4V0hgZn033028@svn.freebsd.org> From: Jeff Roberson Date: Fri, 31 May 2013 00:43:42 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r251171 - in head/sys: fs/ext2fs fs/nandfs fs/nfsclient fs/nfsserver kern nfsclient nfsserver sys ufs/ffs X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 May 2013 00:43:45 -0000 Author: jeff Date: Fri May 31 00:43:41 2013 New Revision: 251171 URL: http://svnweb.freebsd.org/changeset/base/251171 Log: - Convert the bufobj lock to rwlock. - Use a shared bufobj lock in getblk() and inmem(). - Convert softdep's lk to rwlock to match the bufobj lock. - Move INFREECNT to b_flags and protect it with the buf lock. - Remove unnecessary locking around bremfree() and BKGRDINPROG. Sponsored by: EMC / Isilon Storage Division Discussed with: mckusick, kib, mdf Modified: head/sys/fs/ext2fs/ext2_inode.c head/sys/fs/nandfs/nandfs_segment.c head/sys/fs/nandfs/nandfs_vnops.c head/sys/fs/nfsclient/nfs_clvnops.c head/sys/fs/nfsserver/nfs_nfsdport.c head/sys/kern/vfs_bio.c head/sys/kern/vfs_cluster.c head/sys/kern/vfs_default.c head/sys/kern/vfs_subr.c head/sys/nfsclient/nfs_subs.c head/sys/nfsclient/nfs_vnops.c head/sys/nfsserver/nfs_serv.c head/sys/sys/buf.h head/sys/sys/bufobj.h head/sys/ufs/ffs/ffs_inode.c head/sys/ufs/ffs/ffs_snapshot.c head/sys/ufs/ffs/ffs_softdep.c head/sys/ufs/ffs/ffs_vfsops.c Modified: head/sys/fs/ext2fs/ext2_inode.c ============================================================================== --- head/sys/fs/ext2fs/ext2_inode.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/fs/ext2fs/ext2_inode.c Fri May 31 00:43:41 2013 (r251171) @@ -43,6 +43,7 @@ #include #include #include +#include #include #include Modified: head/sys/fs/nandfs/nandfs_segment.c ============================================================================== --- head/sys/fs/nandfs/nandfs_segment.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/fs/nandfs/nandfs_segment.c Fri May 31 00:43:41 2013 (r251171) @@ -38,6 +38,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include @@ -702,7 +703,7 @@ nandfs_save_buf(struct buf *bp, uint64_t if (bp->b_bufobj != bo) { BO_LOCK(bp->b_bufobj); BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, - BO_MTX(bp->b_bufobj)); + BO_LOCKPTR(bp->b_bufobj)); KASSERT(BUF_ISLOCKED(bp), ("Problem with locking buffer")); } Modified: head/sys/fs/nandfs/nandfs_vnops.c ============================================================================== --- head/sys/fs/nandfs/nandfs_vnops.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/fs/nandfs/nandfs_vnops.c Fri May 31 00:43:41 2013 (r251171) @@ -46,6 +46,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include @@ -556,7 +557,7 @@ restart_locked: continue; if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, - BO_MTX(bo)) == ENOLCK) + BO_LOCKPTR(bo)) == ENOLCK) goto restart; bp->b_flags |= (B_INVAL | B_RELBUF); bp->b_flags &= ~(B_ASYNC | B_MANAGED); Modified: head/sys/fs/nfsclient/nfs_clvnops.c ============================================================================== --- head/sys/fs/nfsclient/nfs_clvnops.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/fs/nfsclient/nfs_clvnops.c Fri May 31 00:43:41 2013 (r251171) @@ -2852,7 +2852,7 @@ loop: error = BUF_TIMELOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, - BO_MTX(bo), "nfsfsync", slpflag, slptimeo); + BO_LOCKPTR(bo), "nfsfsync", slpflag, slptimeo); if (error == 0) { BUF_UNLOCK(bp); goto loop; Modified: head/sys/fs/nfsserver/nfs_nfsdport.c ============================================================================== --- head/sys/fs/nfsserver/nfs_nfsdport.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/fs/nfsserver/nfs_nfsdport.c Fri May 31 00:43:41 2013 (r251171) @@ -1321,7 +1321,7 @@ nfsvno_fsync(struct vnode *vp, u_int64_t */ if ((bp = gbincore(&vp->v_bufobj, lblkno)) != NULL) { if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | - LK_INTERLOCK, BO_MTX(bo)) == ENOLCK) { + LK_INTERLOCK, BO_LOCKPTR(bo)) == ENOLCK) { BO_LOCK(bo); continue; /* retry */ } Modified: head/sys/kern/vfs_bio.c ============================================================================== --- head/sys/kern/vfs_bio.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/kern/vfs_bio.c Fri May 31 00:43:41 2013 (r251171) @@ -418,11 +418,9 @@ bufcountwakeup(struct buf *bp) { int old; - KASSERT((bp->b_vflags & BV_INFREECNT) == 0, + KASSERT((bp->b_flags & B_INFREECNT) == 0, ("buf %p already counted as free", bp)); - if (bp->b_bufobj != NULL) - mtx_assert(BO_MTX(bp->b_bufobj), MA_OWNED); - bp->b_vflags |= BV_INFREECNT; + bp->b_flags |= B_INFREECNT; old = atomic_fetchadd_int(&numfreebuffers, 1); KASSERT(old >= 0 && old < nbuf, ("numfreebuffers climbed to %d", old + 1)); @@ -670,11 +668,10 @@ bufinit(void) for (i = 0; i < nbuf; i++) { bp = &buf[i]; bzero(bp, sizeof *bp); - bp->b_flags = B_INVAL; /* we're just an empty header */ + bp->b_flags = B_INVAL | B_INFREECNT; bp->b_rcred = NOCRED; bp->b_wcred = NOCRED; bp->b_qindex = QUEUE_EMPTY; - bp->b_vflags = BV_INFREECNT; /* buf is counted as free */ bp->b_xflags = 0; LIST_INIT(&bp->b_dep); BUF_LOCKINIT(bp); @@ -848,16 +845,14 @@ bremfree(struct buf *bp) ("bremfree: buffer %p already marked for delayed removal.", bp)); KASSERT(bp->b_qindex != QUEUE_NONE, ("bremfree: buffer %p not on a queue.", bp)); - BUF_ASSERT_HELD(bp); + BUF_ASSERT_XLOCKED(bp); bp->b_flags |= B_REMFREE; /* Fixup numfreebuffers count. */ if ((bp->b_flags & B_INVAL) || (bp->b_flags & B_DELWRI) == 0) { - KASSERT((bp->b_vflags & BV_INFREECNT) != 0, + KASSERT((bp->b_flags & B_INFREECNT) != 0, ("buf %p not counted in numfreebuffers", bp)); - if (bp->b_bufobj != NULL) - mtx_assert(BO_MTX(bp->b_bufobj), MA_OWNED); - bp->b_vflags &= ~BV_INFREECNT; + bp->b_flags &= ~B_INFREECNT; old = atomic_fetchadd_int(&numfreebuffers, -1); KASSERT(old > 0, ("numfreebuffers dropped to %d", old - 1)); } @@ -892,7 +887,7 @@ bremfreel(struct buf *bp) bp, bp->b_vp, bp->b_flags); KASSERT(bp->b_qindex != QUEUE_NONE, ("bremfreel: buffer %p not on a queue.", bp)); - BUF_ASSERT_HELD(bp); + BUF_ASSERT_XLOCKED(bp); mtx_assert(&bqlock, MA_OWNED); TAILQ_REMOVE(&bufqueues[bp->b_qindex], bp, b_freelist); @@ -916,11 +911,9 @@ bremfreel(struct buf *bp) * numfreebuffers. */ if ((bp->b_flags & B_INVAL) || (bp->b_flags & B_DELWRI) == 0) { - KASSERT((bp->b_vflags & BV_INFREECNT) != 0, + KASSERT((bp->b_flags & B_INFREECNT) != 0, ("buf %p not counted in numfreebuffers", bp)); - if (bp->b_bufobj != NULL) - mtx_assert(BO_MTX(bp->b_bufobj), MA_OWNED); - bp->b_vflags &= ~BV_INFREECNT; + bp->b_flags &= ~B_INFREECNT; old = atomic_fetchadd_int(&numfreebuffers, -1); KASSERT(old > 0, ("numfreebuffers dropped to %d", old - 1)); } @@ -1476,15 +1469,10 @@ brelse(struct buf *bp) bp->b_flags &= ~B_RELBUF; else if (buf_vm_page_count_severe()) { /* - * The locking of the BO_LOCK is not necessary since - * BKGRDINPROG cannot be set while we hold the buf - * lock, it can only be cleared if it is already - * pending. - */ - if (bp->b_vp) { - if (!(bp->b_vflags & BV_BKGRDINPROG)) - bp->b_flags |= B_RELBUF; - } else + * BKGRDINPROG can only be set with the buf and bufobj + * locks both held. We tolerate a race to clear it here. + */ + if (!(bp->b_vflags & BV_BKGRDINPROG)) bp->b_flags |= B_RELBUF; } @@ -1603,16 +1591,9 @@ brelse(struct buf *bp) /* enqueue */ mtx_lock(&bqlock); /* Handle delayed bremfree() processing. */ - if (bp->b_flags & B_REMFREE) { - struct bufobj *bo; - - bo = bp->b_bufobj; - if (bo != NULL) - BO_LOCK(bo); + if (bp->b_flags & B_REMFREE) bremfreel(bp); - if (bo != NULL) - BO_UNLOCK(bo); - } + if (bp->b_qindex != QUEUE_NONE) panic("brelse: free buffer onto another queue???"); @@ -1676,16 +1657,8 @@ brelse(struct buf *bp) * if B_INVAL is set ). */ - if (!(bp->b_flags & B_DELWRI)) { - struct bufobj *bo; - - bo = bp->b_bufobj; - if (bo != NULL) - BO_LOCK(bo); + if (!(bp->b_flags & B_DELWRI)) bufcountwakeup(bp); - if (bo != NULL) - BO_UNLOCK(bo); - } /* * Something we can maybe free or reuse @@ -1730,11 +1703,7 @@ bqrelse(struct buf *bp) if (bp->b_flags & B_MANAGED) { if (bp->b_flags & B_REMFREE) { mtx_lock(&bqlock); - if (bo != NULL) - BO_LOCK(bo); bremfreel(bp); - if (bo != NULL) - BO_UNLOCK(bo); mtx_unlock(&bqlock); } bp->b_flags &= ~(B_ASYNC | B_NOCACHE | B_AGE | B_RELBUF); @@ -1744,13 +1713,9 @@ bqrelse(struct buf *bp) mtx_lock(&bqlock); /* Handle delayed bremfree() processing. */ - if (bp->b_flags & B_REMFREE) { - if (bo != NULL) - BO_LOCK(bo); + if (bp->b_flags & B_REMFREE) bremfreel(bp); - if (bo != NULL) - BO_UNLOCK(bo); - } + if (bp->b_qindex != QUEUE_NONE) panic("bqrelse: free buffer onto another queue???"); /* buffers with stale but valid contents */ @@ -1762,13 +1727,11 @@ bqrelse(struct buf *bp) #endif } else { /* - * The locking of the BO_LOCK for checking of the - * BV_BKGRDINPROG is not necessary since the - * BV_BKGRDINPROG cannot be set while we hold the buf - * lock, it can only be cleared if it is already - * pending. + * BKGRDINPROG can only be set with the buf and bufobj + * locks both held. We tolerate a race to clear it here. */ - if (!buf_vm_page_count_severe() || (bp->b_vflags & BV_BKGRDINPROG)) { + if (!buf_vm_page_count_severe() || + (bp->b_vflags & BV_BKGRDINPROG)) { bp->b_qindex = QUEUE_CLEAN; TAILQ_INSERT_TAIL(&bufqueues[QUEUE_CLEAN], bp, b_freelist); @@ -1788,13 +1751,8 @@ bqrelse(struct buf *bp) } mtx_unlock(&bqlock); - if ((bp->b_flags & B_INVAL) || !(bp->b_flags & B_DELWRI)) { - if (bo != NULL) - BO_LOCK(bo); + if ((bp->b_flags & B_INVAL) || !(bp->b_flags & B_DELWRI)) bufcountwakeup(bp); - if (bo != NULL) - BO_UNLOCK(bo); - } /* * Something we can maybe free or reuse. @@ -1940,7 +1898,7 @@ vfs_bio_awrite(struct buf *bp) size = vp->v_mount->mnt_stat.f_iosize; maxcl = MAXPHYS / size; - BO_LOCK(bo); + BO_RLOCK(bo); for (i = 1; i < maxcl; i++) if (vfs_bio_clcheck(vp, size, lblkno + i, bp->b_blkno + ((i * size) >> DEV_BSHIFT)) == 0) @@ -1950,7 +1908,7 @@ vfs_bio_awrite(struct buf *bp) if (vfs_bio_clcheck(vp, size, lblkno - j, bp->b_blkno - ((j * size) >> DEV_BSHIFT)) == 0) break; - BO_UNLOCK(bo); + BO_RUNLOCK(bo); --j; ncl = i + j; /* @@ -2145,7 +2103,7 @@ getnewbuf_reuse_bp(struct buf *bp, int q bp->b_flags &= B_UNMAPPED | B_KVAALLOC; bp->b_ioflags = 0; bp->b_xflags = 0; - KASSERT((bp->b_vflags & BV_INFREECNT) == 0, + KASSERT((bp->b_flags & B_INFREECNT) == 0, ("buf %p still counted as free?", bp)); bp->b_vflags = 0; bp->b_vp = NULL; @@ -2293,24 +2251,19 @@ restart: */ if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) != 0) continue; - if (bp->b_vp) { - BO_LOCK(bp->b_bufobj); - if (bp->b_vflags & BV_BKGRDINPROG) { - BO_UNLOCK(bp->b_bufobj); - BUF_UNLOCK(bp); - continue; - } - BO_UNLOCK(bp->b_bufobj); + /* + * BKGRDINPROG can only be set with the buf and bufobj + * locks both held. We tolerate a race to clear it here. + */ + if (bp->b_vflags & BV_BKGRDINPROG) { + BUF_UNLOCK(bp); + continue; } KASSERT(bp->b_qindex == qindex, ("getnewbuf: inconsistent queue %d bp %p", qindex, bp)); - if (bp->b_bufobj != NULL) - BO_LOCK(bp->b_bufobj); bremfreel(bp); - if (bp->b_bufobj != NULL) - BO_UNLOCK(bp->b_bufobj); mtx_unlock(&bqlock); /* * NOTE: nbp is now entirely invalid. We can only restart @@ -2653,14 +2606,15 @@ flushbufqueues(struct vnode *lvp, int qu BUF_UNLOCK(bp); continue; } - BO_LOCK(bp->b_bufobj); + /* + * BKGRDINPROG can only be set with the buf and bufobj + * locks both held. We tolerate a race to clear it here. + */ if ((bp->b_vflags & BV_BKGRDINPROG) != 0 || (bp->b_flags & B_DELWRI) == 0) { - BO_UNLOCK(bp->b_bufobj); BUF_UNLOCK(bp); continue; } - BO_UNLOCK(bp->b_bufobj); if (bp->b_flags & B_INVAL) { bremfreel(bp); mtx_unlock(&bqlock); @@ -2737,9 +2691,9 @@ incore(struct bufobj *bo, daddr_t blkno) { struct buf *bp; - BO_LOCK(bo); + BO_RLOCK(bo); bp = gbincore(bo, blkno); - BO_UNLOCK(bo); + BO_RUNLOCK(bo); return (bp); } @@ -3053,7 +3007,7 @@ loop: mtx_unlock(&nblock); } - BO_LOCK(bo); + BO_RLOCK(bo); bp = gbincore(bo, blkno); if (bp != NULL) { int lockflags; @@ -3067,7 +3021,7 @@ loop: lockflags |= LK_NOWAIT; error = BUF_TIMELOCK(bp, lockflags, - BO_MTX(bo), "getblk", slpflag, slptimeo); + BO_LOCKPTR(bo), "getblk", slpflag, slptimeo); /* * If we slept and got the lock we have to restart in case @@ -3094,11 +3048,8 @@ loop: bp->b_flags |= B_CACHE; if (bp->b_flags & B_MANAGED) MPASS(bp->b_qindex == QUEUE_NONE); - else { - BO_LOCK(bo); + else bremfree(bp); - BO_UNLOCK(bo); - } /* * check for size inconsistencies for non-VMIO case. @@ -3193,7 +3144,7 @@ loop: * returned by getnewbuf() is locked. Note that the returned * buffer is also considered valid (not marked B_INVAL). */ - BO_UNLOCK(bo); + BO_RUNLOCK(bo); /* * If the user does not want us to create the buffer, bail out * here. @@ -4400,7 +4351,7 @@ bufobj_wrefl(struct bufobj *bo) { KASSERT(bo != NULL, ("NULL bo in bufobj_wref")); - ASSERT_BO_LOCKED(bo); + ASSERT_BO_WLOCKED(bo); bo->bo_numoutput++; } @@ -4434,11 +4385,11 @@ bufobj_wwait(struct bufobj *bo, int slpf int error; KASSERT(bo != NULL, ("NULL bo in bufobj_wwait")); - ASSERT_BO_LOCKED(bo); + ASSERT_BO_WLOCKED(bo); error = 0; while (bo->bo_numoutput) { bo->bo_flag |= BO_WWAIT; - error = msleep(&bo->bo_numoutput, BO_MTX(bo), + error = msleep(&bo->bo_numoutput, BO_LOCKPTR(bo), slpflag | (PRIBIO + 1), "bo_wwait", timeo); if (error) break; @@ -4596,7 +4547,7 @@ DB_COMMAND(countfreebufs, db_coundfreebu for (i = 0; i < nbuf; i++) { bp = &buf[i]; - if ((bp->b_vflags & BV_INFREECNT) != 0) + if ((bp->b_flags & B_INFREECNT) != 0) nfree++; else used++; Modified: head/sys/kern/vfs_cluster.c ============================================================================== --- head/sys/kern/vfs_cluster.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/kern/vfs_cluster.c Fri May 31 00:43:41 2013 (r251171) @@ -133,7 +133,7 @@ cluster_read(struct vnode *vp, u_quad_t return 0; } else { bp->b_flags &= ~B_RAM; - BO_LOCK(bo); + BO_RLOCK(bo); for (i = 1; i < maxra; i++) { /* * Stop if the buffer does not exist or it @@ -156,7 +156,7 @@ cluster_read(struct vnode *vp, u_quad_t BUF_UNLOCK(rbp); } } - BO_UNLOCK(bo); + BO_RUNLOCK(bo); if (i >= maxra) { return 0; } @@ -396,17 +396,16 @@ cluster_rbuild(struct vnode *vp, u_quad_ * (marked B_CACHE), or locked (may be doing a * background write), or if the buffer is not * VMIO backed. The clustering code can only deal - * with VMIO-backed buffers. + * with VMIO-backed buffers. The bo lock is not + * required for the BKGRDINPROG check since it + * can not be set without the buf lock. */ - BO_LOCK(bo); if ((tbp->b_vflags & BV_BKGRDINPROG) || (tbp->b_flags & B_CACHE) || (tbp->b_flags & B_VMIO) == 0) { - BO_UNLOCK(bo); bqrelse(tbp); break; } - BO_UNLOCK(bo); /* * The buffer must be completely invalid in order to @@ -790,7 +789,7 @@ cluster_wbuild(struct vnode *vp, long si continue; } if (BUF_LOCK(tbp, - LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, BO_MTX(bo))) { + LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, BO_LOCKPTR(bo))) { ++start_lbn; --len; continue; @@ -891,7 +890,7 @@ cluster_wbuild(struct vnode *vp, long si */ if (BUF_LOCK(tbp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, - BO_MTX(bo))) + BO_LOCKPTR(bo))) break; if ((tbp->b_flags & (B_VMIO | B_CLUSTEROK | Modified: head/sys/kern/vfs_default.c ============================================================================== --- head/sys/kern/vfs_default.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/kern/vfs_default.c Fri May 31 00:43:41 2013 (r251171) @@ -662,7 +662,7 @@ loop2: continue; if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_INTERLOCK | LK_SLEEPFAIL, - BO_MTX(bo)) != 0) { + BO_LOCKPTR(bo)) != 0) { BO_LOCK(bo); goto loop1; } Modified: head/sys/kern/vfs_subr.c ============================================================================== --- head/sys/kern/vfs_subr.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/kern/vfs_subr.c Fri May 31 00:43:41 2013 (r251171) @@ -1073,7 +1073,7 @@ alloc: */ bo = &vp->v_bufobj; bo->__bo_vnode = vp; - mtx_init(BO_MTX(bo), "bufobj interlock", NULL, MTX_DEF); + rw_init(BO_LOCKPTR(bo), "bufobj interlock"); bo->bo_ops = &buf_ops_bio; bo->bo_private = vp; TAILQ_INIT(&bo->bo_clean.bv_hd); @@ -1331,7 +1331,7 @@ flushbuflist(struct bufv *bufv, int flag daddr_t lblkno; b_xflags_t xflags; - ASSERT_BO_LOCKED(bo); + ASSERT_BO_WLOCKED(bo); retval = 0; TAILQ_FOREACH_SAFE(bp, &bufv->bv_hd, b_bobufs, nbp) { @@ -1347,7 +1347,7 @@ flushbuflist(struct bufv *bufv, int flag } retval = EAGAIN; error = BUF_TIMELOCK(bp, - LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, BO_MTX(bo), + LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, BO_LOCKPTR(bo), "flushbuf", slpflag, slptimeo); if (error) { BO_LOCK(bo); @@ -1369,17 +1369,13 @@ flushbuflist(struct bufv *bufv, int flag */ if (((bp->b_flags & (B_DELWRI | B_INVAL)) == B_DELWRI) && (flags & V_SAVE)) { - BO_LOCK(bo); bremfree(bp); - BO_UNLOCK(bo); bp->b_flags |= B_ASYNC; bwrite(bp); BO_LOCK(bo); return (EAGAIN); /* XXX: why not loop ? */ } - BO_LOCK(bo); bremfree(bp); - BO_UNLOCK(bo); bp->b_flags |= (B_INVAL | B_RELBUF); bp->b_flags &= ~B_ASYNC; brelse(bp); @@ -1426,12 +1422,10 @@ restart: continue; if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, - BO_MTX(bo)) == ENOLCK) + BO_LOCKPTR(bo)) == ENOLCK) goto restart; - BO_LOCK(bo); bremfree(bp); - BO_UNLOCK(bo); bp->b_flags |= (B_INVAL | B_RELBUF); bp->b_flags &= ~B_ASYNC; brelse(bp); @@ -1452,11 +1446,9 @@ restart: continue; if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, - BO_MTX(bo)) == ENOLCK) + BO_LOCKPTR(bo)) == ENOLCK) goto restart; - BO_LOCK(bo); bremfree(bp); - BO_UNLOCK(bo); bp->b_flags |= (B_INVAL | B_RELBUF); bp->b_flags &= ~B_ASYNC; brelse(bp); @@ -1484,15 +1476,13 @@ restartsync: */ if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, - BO_MTX(bo)) == ENOLCK) { + BO_LOCKPTR(bo)) == ENOLCK) { goto restart; } VNASSERT((bp->b_flags & B_DELWRI), vp, ("buf(%p) on dirty queue without DELWRI", bp)); - BO_LOCK(bo); bremfree(bp); - BO_UNLOCK(bo); bawrite(bp); BO_LOCK(bo); goto restartsync; @@ -1512,7 +1502,7 @@ buf_vlist_remove(struct buf *bp) struct bufv *bv; KASSERT(bp->b_bufobj != NULL, ("No b_bufobj %p", bp)); - ASSERT_BO_LOCKED(bp->b_bufobj); + ASSERT_BO_WLOCKED(bp->b_bufobj); KASSERT((bp->b_xflags & (BX_VNDIRTY|BX_VNCLEAN)) != (BX_VNDIRTY|BX_VNCLEAN), ("buf_vlist_remove: Buf %p is on two lists", bp)); @@ -1538,7 +1528,7 @@ buf_vlist_add(struct buf *bp, struct buf struct buf *n; int error; - ASSERT_BO_LOCKED(bo); + ASSERT_BO_WLOCKED(bo); KASSERT((bp->b_xflags & (BX_VNDIRTY|BX_VNCLEAN)) == 0, ("buf_vlist_add: Buf %p has existing xflags %d", bp, bp->b_xflags)); bp->b_xflags |= xflags; @@ -1598,7 +1588,7 @@ bgetvp(struct vnode *vp, struct buf *bp) struct bufobj *bo; bo = &vp->v_bufobj; - ASSERT_BO_LOCKED(bo); + ASSERT_BO_WLOCKED(bo); VNASSERT(bp->b_vp == NULL, bp->b_vp, ("bgetvp: not free")); CTR3(KTR_BUF, "bgetvp(%p) vp %p flags %X", bp, vp, bp->b_flags); @@ -1657,7 +1647,7 @@ vn_syncer_add_to_worklist(struct bufobj { int slot; - ASSERT_BO_LOCKED(bo); + ASSERT_BO_WLOCKED(bo); mtx_lock(&sync_mtx); if (bo->bo_flag & BO_ONWORKLST) @@ -2422,7 +2412,7 @@ vdropl(struct vnode *vp) rangelock_destroy(&vp->v_rl); lockdestroy(vp->v_vnlock); mtx_destroy(&vp->v_interlock); - mtx_destroy(BO_MTX(bo)); + rw_destroy(BO_LOCKPTR(bo)); uma_zfree(vnode_zone, vp); } Modified: head/sys/nfsclient/nfs_subs.c ============================================================================== --- head/sys/nfsclient/nfs_subs.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/nfsclient/nfs_subs.c Fri May 31 00:43:41 2013 (r251171) @@ -56,6 +56,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include Modified: head/sys/nfsclient/nfs_vnops.c ============================================================================== --- head/sys/nfsclient/nfs_vnops.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/nfsclient/nfs_vnops.c Fri May 31 00:43:41 2013 (r251171) @@ -3177,7 +3177,7 @@ loop: error = BUF_TIMELOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, - BO_MTX(bo), "nfsfsync", slpflag, slptimeo); + BO_LOCKPTR(bo), "nfsfsync", slpflag, slptimeo); if (error == 0) { BUF_UNLOCK(bp); goto loop; Modified: head/sys/nfsserver/nfs_serv.c ============================================================================== --- head/sys/nfsserver/nfs_serv.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/nfsserver/nfs_serv.c Fri May 31 00:43:41 2013 (r251171) @@ -3387,7 +3387,7 @@ nfsrv_commit(struct nfsrv_descript *nfsd */ if ((bp = gbincore(&vp->v_bufobj, lblkno)) != NULL) { if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | - LK_INTERLOCK, BO_MTX(bo)) == ENOLCK) { + LK_INTERLOCK, BO_LOCKPTR(bo)) == ENOLCK) { BO_LOCK(bo); continue; /* retry */ } Modified: head/sys/sys/buf.h ============================================================================== --- head/sys/sys/buf.h Fri May 31 00:31:45 2013 (r251170) +++ head/sys/sys/buf.h Fri May 31 00:43:41 2013 (r251171) @@ -215,7 +215,7 @@ struct buf { #define B_RELBUF 0x00400000 /* Release VMIO buffer. */ #define B_00800000 0x00800000 /* Available flag. */ #define B_NOCOPY 0x01000000 /* Don't copy-on-write this buf. */ -#define B_02000000 0x02000000 /* Available flag. */ +#define B_INFREECNT 0x02000000 /* buf is counted in numfreebufs */ #define B_PAGING 0x04000000 /* volatile paging I/O -- bypass VMIO */ #define B_MANAGED 0x08000000 /* Managed by FS. */ #define B_RAM 0x10000000 /* Read ahead mark (flag) */ @@ -224,7 +224,7 @@ struct buf { #define B_REMFREE 0x80000000 /* Delayed bremfree */ #define PRINT_BUF_FLAGS "\20\40remfree\37cluster\36vmio\35ram\34managed" \ - "\33paging\32needsgiant\31nocopy\30b23\27relbuf\26dirty\25b20" \ + "\33paging\32infreecnt\31nocopy\30b23\27relbuf\26dirty\25b20" \ "\24b19\23b18\22clusterok\21malloc\20nocache\17b14\16inval" \ "\15b12\14b11\13eintr\12done\11persist\10delwri\7validsuspwrt" \ "\6cache\5deferred\4direct\3async\2needcommit\1age" @@ -248,9 +248,8 @@ struct buf { #define BV_SCANNED 0x00000001 /* VOP_FSYNC funcs mark written bufs */ #define BV_BKGRDINPROG 0x00000002 /* Background write in progress */ #define BV_BKGRDWAIT 0x00000004 /* Background write waiting */ -#define BV_INFREECNT 0x80000000 /* buf is counted in numfreebufs */ -#define PRINT_BUF_VFLAGS "\20\40infreecnt\3bkgrdwait\2bkgrdinprog\1scanned" +#define PRINT_BUF_VFLAGS "\20\3bkgrdwait\2bkgrdinprog\1scanned" #ifdef _KERNEL /* @@ -271,7 +270,7 @@ extern const char *buf_wmesg; /* Defaul * Get a lock sleeping non-interruptably until it becomes available. */ #define BUF_LOCK(bp, locktype, interlock) \ - _lockmgr_args(&(bp)->b_lock, (locktype), (interlock), \ + _lockmgr_args_rw(&(bp)->b_lock, (locktype), (interlock), \ LK_WMESG_DEFAULT, LK_PRIO_DEFAULT, LK_TIMO_DEFAULT, \ LOCK_FILE, LOCK_LINE) @@ -279,7 +278,7 @@ extern const char *buf_wmesg; /* Defaul * Get a lock sleeping with specified interruptably and timeout. */ #define BUF_TIMELOCK(bp, locktype, interlock, wmesg, catch, timo) \ - _lockmgr_args(&(bp)->b_lock, (locktype) | LK_TIMELOCK, \ + _lockmgr_args_rw(&(bp)->b_lock, (locktype) | LK_TIMELOCK, \ (interlock), (wmesg), (PRIBIO + 4) | (catch), (timo), \ LOCK_FILE, LOCK_LINE) Modified: head/sys/sys/bufobj.h ============================================================================== --- head/sys/sys/bufobj.h Fri May 31 00:31:45 2013 (r251170) +++ head/sys/sys/bufobj.h Fri May 31 00:43:41 2013 (r251171) @@ -53,7 +53,7 @@ #include #include -#include +#include #include struct bufobj; @@ -89,7 +89,7 @@ struct buf_ops { #define BO_BDFLUSH(bo, bp) ((bo)->bo_ops->bop_bdflush((bo), (bp))) struct bufobj { - struct mtx bo_mtx; /* Mutex which protects "i" things */ + struct rwlock bo_lock; /* Lock which protects "i" things */ struct buf_ops *bo_ops; /* - Buffer operations */ struct vm_object *bo_object; /* v Place to store VM object */ LIST_ENTRY(bufobj) bo_synclist; /* S dirty vnode list */ @@ -113,11 +113,14 @@ struct bufobj { #define BO_ONWORKLST (1 << 0) /* On syncer work-list */ #define BO_WWAIT (1 << 1) /* Wait for output to complete */ -#define BO_MTX(bo) (&(bo)->bo_mtx) -#define BO_LOCK(bo) mtx_lock(BO_MTX((bo))) -#define BO_UNLOCK(bo) mtx_unlock(BO_MTX((bo))) -#define ASSERT_BO_LOCKED(bo) mtx_assert(BO_MTX((bo)), MA_OWNED) -#define ASSERT_BO_UNLOCKED(bo) mtx_assert(BO_MTX((bo)), MA_NOTOWNED) +#define BO_LOCKPTR(bo) (&(bo)->bo_lock) +#define BO_LOCK(bo) rw_wlock(BO_LOCKPTR((bo))) +#define BO_UNLOCK(bo) rw_wunlock(BO_LOCKPTR((bo))) +#define BO_RLOCK(bo) rw_rlock(BO_LOCKPTR((bo))) +#define BO_RUNLOCK(bo) rw_runlock(BO_LOCKPTR((bo))) +#define ASSERT_BO_WLOCKED(bo) rw_assert(BO_LOCKPTR((bo)), RA_WLOCKED) +#define ASSERT_BO_LOCKED(bo) rw_assert(BO_LOCKPTR((bo)), RA_LOCKED) +#define ASSERT_BO_UNLOCKED(bo) rw_assert(BO_LOCKPTR((bo)), RA_UNLOCKED) void bufobj_wdrop(struct bufobj *bo); void bufobj_wref(struct bufobj *bo); Modified: head/sys/ufs/ffs/ffs_inode.c ============================================================================== --- head/sys/ufs/ffs/ffs_inode.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/ufs/ffs/ffs_inode.c Fri May 31 00:43:41 2013 (r251171) @@ -43,6 +43,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include Modified: head/sys/ufs/ffs/ffs_snapshot.c ============================================================================== --- head/sys/ufs/ffs/ffs_snapshot.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/ufs/ffs/ffs_snapshot.c Fri May 31 00:43:41 2013 (r251171) @@ -53,6 +53,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include @@ -2204,10 +2205,8 @@ ffs_bdflush(bo, bp) if (bp_bdskip) { VI_LOCK(devvp); if (!ffs_bp_snapblk(vp, nbp)) { - if (BO_MTX(bo) != VI_MTX(vp)) { - VI_UNLOCK(devvp); - BO_LOCK(bo); - } + VI_UNLOCK(devvp); + BO_LOCK(bo); BUF_UNLOCK(nbp); continue; } Modified: head/sys/ufs/ffs/ffs_softdep.c ============================================================================== --- head/sys/ufs/ffs/ffs_softdep.c Fri May 31 00:31:45 2013 (r251170) +++ head/sys/ufs/ffs/ffs_softdep.c Fri May 31 00:43:41 2013 (r251171) @@ -69,6 +69,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include @@ -554,7 +555,7 @@ softdep_check_suspend(struct mount *mp, (void) softdep_accdeps; bo = &devvp->v_bufobj; - ASSERT_BO_LOCKED(bo); + ASSERT_BO_WLOCKED(bo); MNT_ILOCK(mp); while (mp->mnt_secondary_writes != 0) { @@ -808,7 +809,7 @@ struct jextent { */ static void softdep_error(char *, int); static void drain_output(struct vnode *); -static struct buf *getdirtybuf(struct buf *, struct mtx *, int); +static struct buf *getdirtybuf(struct buf *, struct rwlock *, int); static void clear_remove(void); static void clear_inodedeps(void); static void unlinked_inodedep(struct mount *, struct inodedep *); @@ -1030,12 +1031,12 @@ static void softdep_disk_write_complete( static void softdep_deallocate_dependencies(struct buf *); static int softdep_count_dependencies(struct buf *bp, int); -static struct mtx lk; -MTX_SYSINIT(softdep_lock, &lk, "Softdep Lock", MTX_DEF); +static struct rwlock lk; +RW_SYSINIT(softdep_lock, &lk, "Softdep Lock"); -#define TRY_ACQUIRE_LOCK(lk) mtx_trylock(lk) -#define ACQUIRE_LOCK(lk) mtx_lock(lk) -#define FREE_LOCK(lk) mtx_unlock(lk) +#define TRY_ACQUIRE_LOCK(lk) rw_try_wlock(lk) +#define ACQUIRE_LOCK(lk) rw_wlock(lk) +#define FREE_LOCK(lk) rw_wunlock(lk) #define BUF_AREC(bp) lockallowrecurse(&(bp)->b_lock) #define BUF_NOREC(bp) lockdisablerecurse(&(bp)->b_lock) @@ -1073,7 +1074,7 @@ worklist_insert(head, item, locked) { if (locked) - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); if (item->wk_state & ONWORKLIST) panic("worklist_insert: %p %s(0x%X) already on list", item, TYPENAME(item->wk_type), item->wk_state); @@ -1088,7 +1089,7 @@ worklist_remove(item, locked) { if (locked) - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); if ((item->wk_state & ONWORKLIST) == 0) panic("worklist_remove: %p %s(0x%X) not on list", item, TYPENAME(item->wk_type), item->wk_state); @@ -1161,7 +1162,7 @@ jwork_move(dst, src) freedep = freedep_merge(WK_FREEDEP(wk), freedep); } - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); while ((wk = LIST_FIRST(src)) != NULL) { WORKLIST_REMOVE(wk); WORKLIST_INSERT(dst, wk); @@ -1212,7 +1213,7 @@ workitem_free(item, type) int type; { struct ufsmount *ump; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); #ifdef DEBUG if (item->wk_state & ONWORKLIST) @@ -1428,7 +1429,7 @@ softdep_flush(void) static void worklist_speedup(void) { - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); if (req_pending == 0) { req_pending = 1; wakeup(&req_pending); @@ -1462,7 +1463,7 @@ add_to_worklist(wk, flags) { struct ufsmount *ump; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); ump = VFSTOUFS(wk->wk_mp); if (wk->wk_state & ONWORKLIST) panic("add_to_worklist: %s(0x%X) already on list", @@ -1604,7 +1605,7 @@ process_removes(vp) struct mount *mp; ino_t inum; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); mp = vp->v_mount; inum = VTOI(vp)->i_number; @@ -1654,7 +1655,7 @@ process_truncates(vp) ino_t inum; int cgwait; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); mp = vp->v_mount; inum = VTOI(vp)->i_number; @@ -1727,7 +1728,7 @@ process_worklist_item(mp, target, flags) int matchcnt; int error; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); KASSERT(mp != NULL, ("process_worklist_item: NULL mp")); /* * If we are being called because of a process doing a @@ -2061,7 +2062,7 @@ pagedep_lookup(mp, bp, ino, lbn, flags, int ret; int i; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); if (bp) { LIST_FOREACH(wk, &bp->b_dep, wk_list) { if (wk->wk_type == D_PAGEDEP) { @@ -2150,7 +2151,7 @@ inodedep_lookup(mp, inum, flags, inodede struct inodedep_hashhead *inodedephd; struct fs *fs; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); fs = VFSTOUFS(mp)->um_fs; inodedephd = INODEDEP_HASH(fs, inum); @@ -2704,7 +2705,7 @@ add_to_journal(wk) { struct ufsmount *ump; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); ump = VFSTOUFS(wk->wk_mp); if (wk->wk_state & ONWORKLIST) panic("add_to_journal: %s(0x%X) already on list", @@ -2730,7 +2731,7 @@ remove_from_journal(wk) { struct ufsmount *ump; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); ump = VFSTOUFS(wk->wk_mp); #ifdef SUJ_DEBUG { @@ -2898,7 +2899,7 @@ softdep_prelink(dvp, vp) struct ufsmount *ump; ump = VFSTOUFS(dvp->v_mount); - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); /* * Nothing to do if we have sufficient journal space. * If we currently hold the snapshot lock, we must avoid @@ -4986,7 +4987,7 @@ bmsafemap_lookup(mp, bp, cg, newbmsafema struct worklist *wk; struct fs *fs; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); KASSERT(bp != NULL, ("bmsafemap_lookup: missing buffer")); LIST_FOREACH(wk, &bp->b_dep, wk_list) { if (wk->wk_type == D_BMSAFEMAP) { @@ -5257,7 +5258,7 @@ allocdirect_merge(adphead, newadp, oldad struct freefrag *freefrag; freefrag = NULL; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); if (newadp->ad_oldblkno != oldadp->ad_newblkno || newadp->ad_oldsize != oldadp->ad_newsize || newadp->ad_offset >= NDADDR) @@ -5718,7 +5719,7 @@ indirdep_lookup(mp, ip, bp) struct fs *fs; ufs2_daddr_t blkno; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); indirdep = NULL; newindirdep = NULL; fs = ip->i_fs; @@ -5797,7 +5798,7 @@ setup_allocindir_phase2(bp, ip, inodedep struct freefrag *freefrag; struct mount *mp; - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); mp = UFSTOVFS(ip->i_ump); fs = ip->i_fs; if (bp->b_lblkno >= 0) @@ -6130,7 +6131,7 @@ complete_trunc_indir(freework) BUF_UNLOCK(bp); ACQUIRE_LOCK(&lk); } - mtx_assert(&lk, MA_OWNED); + rw_assert(&lk, RA_WLOCKED); freework->fw_state |= DEPCOMPLETE; TAILQ_REMOVE(&indirdep->ir_trunc, freework, fw_next); /* @@ -6874,7 +6875,7 @@ restart: *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***