From owner-svn-src-all@FreeBSD.ORG Fri Oct 23 19:53:05 2009 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 12C6B1065670; Fri, 23 Oct 2009 19:53:05 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c]) by mx1.freebsd.org (Postfix) with ESMTP id 008F78FC18; Fri, 23 Oct 2009 19:53:05 +0000 (UTC) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.3/8.14.3) with ESMTP id n9NJr4pA070676; Fri, 23 Oct 2009 19:53:04 GMT (envelope-from jhb@svn.freebsd.org) Received: (from jhb@localhost) by svn.freebsd.org (8.14.3/8.14.3/Submit) id n9NJr4gW070674; Fri, 23 Oct 2009 19:53:04 GMT (envelope-from jhb@svn.freebsd.org) Message-Id: <200910231953.n9NJr4gW070674@svn.freebsd.org> From: John Baldwin Date: Fri, 23 Oct 2009 19:53:04 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-7@freebsd.org X-SVN-Group: stable-7 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: Subject: svn commit: r198425 - in stable/7/sys: . contrib/pf nfsclient X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Oct 2009 19:53:05 -0000 Author: jhb Date: Fri Oct 23 19:53:04 2009 New Revision: 198425 URL: http://svn.freebsd.org/changeset/base/198425 Log: MFC 198174: Close a race with caching of -ve name lookups in the NFS client. Specifically, clients only trust -ve cache entries while the directory remains unchanged and discard any -ve cache entries for a directory when they notice that the modification time of a directory entry changes. The race involves two concurrent lookups as follows: - Thread A does a lookup for file 'foo' which sends a lookup RPC to the server. The lookup fails and the server replies. - The 'foo' file is created (either by the same client or a different client) updating the modification time on the parent directory of 'foo'. - Thread B does a lookup for a different file 'bar' which updates the cached attributes of the parent directory of 'foo' to reflect the new modification time after 'foo' was created. - Thread A finally resumes execution to parse the reply from the NFS server. It adds a -ve cache entry and sets the cached value of the directory's modification time that is used for invalidating -ve cached lookups to the new modification time set by thread B. At this point, future lookups of 'foo' will honor the -ve cached entry until the cached entry is pushed out of the name cache's LRU or the modification time of the parent directory is changed again by some other change. The fix is to read the directory's modification time before sending the lookup RPC and use that cached modification time when setting the directory's cached modification time. Also, we do not add a -ve cache entry if another thread has added -ve cache entry that set the directory's cached modification time to a newer value than the value we read before sending the lookup RPC. Modified: stable/7/sys/ (props changed) stable/7/sys/contrib/pf/ (props changed) stable/7/sys/nfsclient/nfs_vnops.c Modified: stable/7/sys/nfsclient/nfs_vnops.c ============================================================================== --- stable/7/sys/nfsclient/nfs_vnops.c Fri Oct 23 19:52:29 2009 (r198424) +++ stable/7/sys/nfsclient/nfs_vnops.c Fri Oct 23 19:53:04 2009 (r198425) @@ -863,6 +863,7 @@ nfs_lookup(struct vop_lookup_args *ap) struct vnode *dvp = ap->a_dvp; struct vnode **vpp = ap->a_vpp; struct vattr vattr; + time_t dmtime; int flags = cnp->cn_flags; struct vnode *newvp; struct nfsmount *nmp; @@ -874,7 +875,7 @@ nfs_lookup(struct vop_lookup_args *ap) int error = 0, attrflag, fhsize; int v3 = NFS_ISV3(dvp); struct thread *td = cnp->cn_thread; - + *vpp = NULLVP; if ((flags & ISLASTCN) && (dvp->v_mount->mnt_flag & MNT_RDONLY) && (cnp->cn_nameiop == DELETE || cnp->cn_nameiop == RENAME)) @@ -929,6 +930,19 @@ nfs_lookup(struct vop_lookup_args *ap) np->n_dmtime = 0; mtx_unlock(&np->n_mtx); } + + /* + * Cache the modification time of the parent directory in case + * the lookup fails and results in adding the first negative + * name cache entry for the directory. Since this is reading + * a single time_t, don't bother with locking. The + * modification time may be a bit stale, but it must be read + * before performing the lookup RPC to prevent a race where + * another lookup updates the timestamp on the directory after + * the lookup RPC has been performed on the server but before + * n_dmtime is set at the end of this function. + */ + dmtime = np->n_vattr.va_mtime.tv_sec; error = 0; newvp = NULLVP; nfsstats.lookupcache_misses++; @@ -1036,13 +1050,25 @@ nfsmout: * Maintain n_dmtime as the modification time * of the parent directory when the oldest -ve * name cache entry for this directory was - * added. + * added. If a -ve cache entry has already + * been added with a newer modification time + * by a concurrent lookup, then don't bother + * adding a cache entry. The modification + * time of the directory might have changed + * due to the file this lookup failed to find + * being created. In that case a subsequent + * lookup would incorrectly use the entry + * added here instead of doing an extra + * lookup. */ mtx_lock(&np->n_mtx); - if (np->n_dmtime == 0) - np->n_dmtime = np->n_vattr.va_mtime.tv_sec; - mtx_unlock(&np->n_mtx); - cache_enter(dvp, NULL, cnp); + if (np->n_dmtime <= dmtime) { + if (np->n_dmtime == 0) + np->n_dmtime = dmtime; + mtx_unlock(&np->n_mtx); + cache_enter(dvp, NULL, cnp); + } else + mtx_unlock(&np->n_mtx); } return (ENOENT); }