From owner-freebsd-fs@FreeBSD.ORG Sat Mar 10 09:50:07 2007 Return-Path: X-Original-To: fs@freebsd.org Delivered-To: freebsd-fs@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3D3A316A401; Sat, 10 Mar 2007 09:50:07 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailout1.pacific.net.au (mailout1-3.pacific.net.au [61.8.2.210]) by mx1.freebsd.org (Postfix) with ESMTP id D050C13C442; Sat, 10 Mar 2007 09:50:06 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.2.162]) by mailout1.pacific.net.au (Postfix) with ESMTP id 739705A0D8C; Sat, 10 Mar 2007 20:50:05 +1100 (EST) Received: from besplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailproxy1.pacific.net.au (Postfix) with ESMTP id 0BDA38C0E; Sat, 10 Mar 2007 20:50:03 +1100 (EST) Date: Sat, 10 Mar 2007 20:50:02 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Mohan Srinivasan In-Reply-To: <200703090402.l2942cow003127@repoman.freebsd.org> Message-ID: <20070310184944.H9950@besplex.bde.org> References: <200703090402.l2942cow003127@repoman.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: src-committers@FreeBSD.org, fs@freebsd.org Subject: Re: cvs commit: src/sys/amd64/amd64 trap.c src/sys/i386/i386 trap.c src/sys/ia64/ia64 trap.c src/sys/kern kern_thread.c src/sys/nfsclient nfs_socket.c nfs_subs.c nfs_vnops.c nfsnode.h src/sys/powerpc/powerpc trap.c src/sys/sparc64/sparc64 trap.c ... X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 10 Mar 2007 09:50:07 -0000 On Fri, 9 Mar 2007, Mohan Srinivasan wrote: > mohans 2007-03-09 04:02:38 UTC > > FreeBSD src repository > > Modified files: > sys/amd64/amd64 trap.c > sys/i386/i386 trap.c > sys/ia64/ia64 trap.c > sys/kern kern_thread.c > sys/nfsclient nfs_socket.c nfs_subs.c nfs_vnops.c > nfsnode.h > sys/powerpc/powerpc trap.c > sys/sparc64/sparc64 trap.c > sys/sys proc.h > Log: > Over NFS, an open() call could result in multiple over-the-wire > GETATTRs being generated - one from lookup()/namei() and the other > from nfs_open() (for cto consistency). This change eliminates the > GETATTR in nfs_open() if an otw GETATTR was done from the namei() > path. Instead of extending the vop interface, we timestamp each attr > load, and use this to detect whether a GETATTR was done from namei() > for this syscall. Introduces a thread-local variable that counts the > syscalls made by the thread and uses as > the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on > thread state that could be used as the timestamp with minimal overhead. > > Revision Changes Path > 1.314 +2 -0 src/sys/amd64/amd64/trap.c > 1.299 +2 -0 src/sys/i386/i386/trap.c > 1.126 +2 -0 src/sys/ia64/ia64/trap.c > 1.241 +1 -0 src/sys/kern/kern_thread.c > 1.151 +1 -1 src/sys/nfsclient/nfs_socket.c > 1.145 +9 -0 src/sys/nfsclient/nfs_subs.c > 1.274 +10 -1 src/sys/nfsclient/nfs_vnops.c > 1.60 +11 -0 src/sys/nfsclient/nfsnode.h > 1.64 +2 -0 src/sys/powerpc/powerpc/trap.c > 1.87 +3 -0 src/sys/sparc64/sparc64/trap.c > 1.473 +1 -0 src/sys/sys/proc.h Er, I posted beter fixes for this to freebsd-fs last October after I first reported the problem. Preliminary testing indicates that the above changes result in the same attribute cache clearings in nfs_open() as in my version, but this may be because attribute cache clearings in nfs_open() should be, and now are, so rare that they haven't happened yet. My version clears the attribute cache in nfs_lookup() for open(). Clearing it in nfs_close() is too early and clearing it in nfs_open() is too late. However, clearing it in nfs_open() is necessary in some rare cases that don't go through nfs_lookup() for open(). After nfs_lookup() or something else fetches the attributes and sets the attribute flag(s) for open(), there is a race completing the open(). There may be remote activity on the file, or, if the open() is non-exclusive, then there may be local activity on the file. I don't see the point of using a fancy timestamp just to _detect_ that the attributes have changed after nfs_lookup() fetched them and then waste time fetching them again: - If the open() is exclusive, then the external attributes may have changed, but we won't see such changes unless the open() takes several seconds to complete. We will complete the open() using consistent attributes. - If the open() is non-exclusive, then the attributes might change underneath us. (Shared locking should prevent all local changes to the file except atimes, so except for atimes the attributes can only change if another thread fetches attributes that have changed remotely.) Then refetching the attributes just wastes time and increases any problems caused by this, by giving a newer set of attributes that may have changed more. In particular, the attributes may have changed so that the permissions checks passed in lookup() are no longer valid. If we care about this, then we should restart the whole open(), and my version needs a generation count to detect it. However, we can't reasonably detect late changes to attributes of all components of the file's pathname, so I think this doesn't matter. - The case of non-exclusive probably hasn't been tested much and has lots of bugs in it, since it is not the default (options LOOKUP_SHARED). Bruce