From owner-svn-src-stable@FreeBSD.ORG Fri Mar 18 17:15:10 2011 Return-Path: Delivered-To: svn-src-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 797941065677; Fri, 18 Mar 2011 17:15:10 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c]) by mx1.freebsd.org (Postfix) with ESMTP id 6B5228FC22; Fri, 18 Mar 2011 17:15:10 +0000 (UTC) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.3/8.14.3) with ESMTP id p2IHFAeZ074878; Fri, 18 Mar 2011 17:15:10 GMT (envelope-from jhb@svn.freebsd.org) Received: (from jhb@localhost) by svn.freebsd.org (8.14.3/8.14.3/Submit) id p2IHFATc074875; Fri, 18 Mar 2011 17:15:10 GMT (envelope-from jhb@svn.freebsd.org) Message-Id: <201103181715.p2IHFATc074875@svn.freebsd.org> From: John Baldwin Date: Fri, 18 Mar 2011 17:15:10 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: Subject: svn commit: r219744 - stable/8/sys/ufs/ufs X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Mar 2011 17:15:10 -0000 Author: jhb Date: Fri Mar 18 17:15:10 2011 New Revision: 219744 URL: http://svn.freebsd.org/changeset/base/219744 Log: MFC 219384: The UFS dirhash code was attempting to update shared state in the dirhash from multiple threads while holding a shared lock during a lookup operation. This could result in incorrect ENOENT failures which could then be permanently stored in the name cache. Specifically, the dirhash code optimizes the case that a single thread is walking a directory sequentially opening (or stat'ing) each file. It uses state in the dirhash structure to determine if a given lookup is using the optimization. If the optimization fails, it disables it and restarts the lookup. The problem arises when two threads both attempt the optimization and fail. The first thread will restart the loop, but the second thread will incorrectly think that it did not try the optimization and will only examine a subset of the directory entires in its hash chain. As a result, it may fail to find its directory entry and incorrectly fail with ENOENT. To make this safe for use with shared locks, simplify the state stored in the dirhash and move some of the state (the part that determines if the current thread is trying the optimization) into a local variable. One result is that we will now try the optimization more often. We still update the value under the shared lock, but it is a single atomic store similar to i_diroff that is stored in UFS directory i-nodes for the non-dirhash lookup. Modified: stable/8/sys/ufs/ufs/dirhash.h stable/8/sys/ufs/ufs/ufs_dirhash.c Directory Properties: stable/8/sys/ (props changed) stable/8/sys/amd64/include/xen/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) stable/8/sys/contrib/dev/acpica/ (props changed) stable/8/sys/contrib/pf/ (props changed) Modified: stable/8/sys/ufs/ufs/dirhash.h ============================================================================== --- stable/8/sys/ufs/ufs/dirhash.h Fri Mar 18 16:13:08 2011 (r219743) +++ stable/8/sys/ufs/ufs/dirhash.h Fri Mar 18 17:15:10 2011 (r219744) @@ -98,7 +98,6 @@ struct dirhash { int dh_dirblks; /* number of DIRBLKSIZ blocks in dir */ int dh_firstfree[DH_NFSTATS + 1]; /* first blk with N words free */ - int dh_seqopt; /* sequential access optimisation enabled */ doff_t dh_seqoff; /* sequential access optimisation offset */ int dh_score; /* access count for this dirhash */ Modified: stable/8/sys/ufs/ufs/ufs_dirhash.c ============================================================================== --- stable/8/sys/ufs/ufs/ufs_dirhash.c Fri Mar 18 16:13:08 2011 (r219743) +++ stable/8/sys/ufs/ufs/ufs_dirhash.c Fri Mar 18 17:15:10 2011 (r219744) @@ -403,8 +403,7 @@ ufsdirhash_build(struct inode *ip) dh->dh_firstfree[i] = -1; dh->dh_firstfree[DH_NFSTATS] = 0; dh->dh_hused = 0; - dh->dh_seqopt = 0; - dh->dh_seqoff = 0; + dh->dh_seqoff = -1; dh->dh_score = DH_SCOREINIT; dh->dh_lastused = time_second; @@ -552,7 +551,7 @@ ufsdirhash_lookup(struct inode *ip, char struct direct *dp; struct vnode *vp; struct buf *bp; - doff_t blkoff, bmask, offset, prevoff; + doff_t blkoff, bmask, offset, prevoff, seqoff; int i, slot; int error; @@ -592,29 +591,30 @@ ufsdirhash_lookup(struct inode *ip, char bmask = vp->v_mount->mnt_stat.f_iosize - 1; blkoff = -1; bp = NULL; + seqoff = dh->dh_seqoff; restart: slot = ufsdirhash_hash(dh, name, namelen); - if (dh->dh_seqopt) { + if (seqoff != -1) { /* - * Sequential access optimisation. dh_seqoff contains the + * Sequential access optimisation. seqoff contains the * offset of the directory entry immediately following * the last entry that was looked up. Check if this offset * appears in the hash chain for the name we are looking for. */ for (i = slot; (offset = DH_ENTRY(dh, i)) != DIRHASH_EMPTY; i = WRAPINCR(i, dh->dh_hlen)) - if (offset == dh->dh_seqoff) + if (offset == seqoff) break; - if (offset == dh->dh_seqoff) { + if (offset == seqoff) { /* * We found an entry with the expected offset. This * is probably the entry we want, but if not, the - * code below will turn off seqopt and retry. + * code below will retry. */ slot = i; - } else - dh->dh_seqopt = 0; + } else + seqoff = -1; } for (; (offset = DH_ENTRY(dh, slot)) != DIRHASH_EMPTY; @@ -655,9 +655,7 @@ restart: *prevoffp = prevoff; } - /* Check for sequential access, and update offset. */ - if (dh->dh_seqopt == 0 && dh->dh_seqoff == offset) - dh->dh_seqopt = 1; + /* Update offset. */ dh->dh_seqoff = offset + DIRSIZ(0, dp); *bpp = bp; *offp = offset; @@ -666,11 +664,11 @@ restart: } /* - * When the name doesn't match in the seqopt case, go back - * and search normally. + * When the name doesn't match in the sequential + * optimization case, go back and search normally. */ - if (dh->dh_seqopt) { - dh->dh_seqopt = 0; + if (seqoff != -1) { + seqoff = -1; goto restart; } }