From: Kostik Belousov
To: John Baldwin
Cc: Rick Macklem, fs@freebsd.org, Peter Wemm
Date: Thu, 19 Jan 2012 16:06:13 +0200
Message-ID: <20120119140613.GD31224@deviant.kiev.zoral.com.ua>
In-Reply-To: <201201181707.21293.jhb@freebsd.org>
Subject: Re:
Race in NFS lookup can result in stale namecache entries

On Wed, Jan 18, 2012 at 05:07:21PM -0500, John Baldwin wrote:
...
> What I concluded is that it would really be far simpler and more
> obvious if the cached timestamps were stored in the namecache entry
> directly rather than having multiple name cache entries validated by
> shared state in the nfsnode. This does mean allowing the name cache
> to hold some filesystem-specific state. However, I felt this was much
> cleaner than adding a lot more complexity to nfs_lookup(). Also, this
> turns out to be fairly non-invasive to implement since nfs_lookup()
> calls cache_lookup() directly, but other filesystems only call it
> indirectly via vfs_cache_lookup(). I considered letting filesystems
> store a void * cookie in the name cache entry and having them provide
> a destructor, etc. However, that would require extra allocations for
> NFS lookups. Instead, I just adjusted the name cache API to
> explicitly allow the filesystem to store a single timestamp in a name
> cache entry by adding a new 'cache_enter_time()' that accepts a struct
> timespec that is copied into the entry. 'cache_enter_time()' also
> saves the current value of 'ticks' in the entry. 'cache_lookup()' is
> modified to add two new arguments used to return the timespec and
> ticks value used for a namecache entry when a hit in the cache occurs.
>
> One wrinkle with this is that the name cache does not create actual
> entries for ".", and thus it would not store any timestamps for those
> lookups. To fix this I changed the NFS client to explicitly fast-path
> lookups of "."
> by always returning the current directory as setup by
> cache_lookup() and never bothering to do a LOOKUP or check for stale
> attributes in that case.
>
> The current patch against 8 is at
> http://www.FreeBSD.org/~jhb/patches/nfs_lookup.patch
...

So now you add 8*2+4 bytes to each namecache entry on amd64
unconditionally. The invariant part of struct namecache is currently
72 bytes on amd64, so adding 20 bytes looks slightly excessive. I am
not sure about the typical distribution of nc_name lengths, so it is
not obvious whether the change affects memory usage significantly.

A flag could be added to nc_flags to indicate the presence of a
timestamp. The timestamps would then be placed conditionally after
nc_nlen; we could probably use a union to ease the access. Direct
dereferences of nc_name would then need to be converted to an inline
function.

I can do this after your patch is committed, if you consider the
memory savings worth it.
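
To make the layout idea concrete, here is a rough userland sketch of a
flag-selected optional timestamp block with an inline accessor for the
name. Everything in it is an assumption made for illustration: the
names (NCF_TS_SKETCH, struct nc_sketch, nc_get_name(), nc_get_time(),
nc_alloc()) are hypothetical, the struct is a two-field stand-in and
not the real struct namecache, and it uses offset arithmetic with
memcpy() instead of a union to sidestep alignment questions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define NCF_TS_SKETCH	0x01	/* hypothetical flag: timestamp block present */

/* Bytes taken by the optional block: timespec + ticks (8*2+4 on amd64). */
#define NC_TS_BYTES	(sizeof(struct timespec) + sizeof(int))

/* Simplified stand-in for the invariant part of struct namecache. */
struct nc_sketch {
	unsigned char	nc_flags;
	unsigned char	nc_nlen;
	char		nc_data[];	/* [timespec, ticks] then the name */
};

/* Total allocation size for an entry with the given name length. */
static inline size_t
nc_entry_size(size_t nlen, int has_ts)
{
	return (sizeof(struct nc_sketch) + (has_ts ? NC_TS_BYTES : 0) +
	    nlen + 1);
}

/* Replaces direct dereferences of nc_name: skip the block if present. */
static inline char *
nc_get_name(struct nc_sketch *ncp)
{
	return (ncp->nc_data +
	    ((ncp->nc_flags & NCF_TS_SKETCH) ? NC_TS_BYTES : 0));
}

/* Copy out the stored timestamp and ticks; returns 0 if there is none. */
static inline int
nc_get_time(const struct nc_sketch *ncp, struct timespec *tsp, int *ticksp)
{
	if ((ncp->nc_flags & NCF_TS_SKETCH) == 0)
		return (0);
	memcpy(tsp, ncp->nc_data, sizeof(*tsp));
	memcpy(ticksp, ncp->nc_data + sizeof(*tsp), sizeof(*ticksp));
	return (1);
}

/* Allocate an entry; pass tsp == NULL for an entry without a timestamp. */
static struct nc_sketch *
nc_alloc(const char *name, const struct timespec *tsp, int ticks)
{
	size_t nlen = strlen(name);
	struct nc_sketch *ncp;

	assert(nlen <= 255);	/* nc_nlen is a single byte in this sketch */
	ncp = malloc(nc_entry_size(nlen, tsp != NULL));
	if (ncp == NULL)
		return (NULL);
	ncp->nc_flags = (tsp != NULL) ? NCF_TS_SKETCH : 0;
	ncp->nc_nlen = (unsigned char)nlen;
	if (tsp != NULL) {
		memcpy(ncp->nc_data, tsp, sizeof(*tsp));
		memcpy(ncp->nc_data + sizeof(*tsp), &ticks, sizeof(ticks));
	}
	memcpy(nc_get_name(ncp), name, nlen + 1);
	return (ncp);
}
```

With this shape only entries created with a timestamp pay the extra
NC_TS_BYTES; entries from other filesystems keep the current size, at
the cost of one flag test on every name access.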