Date: Sun, 6 Nov 2011 19:47:16 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Josh Paetzel <josh@tcbug.org>
Cc: Josh Paetzel <jpaetzel@freebsd.org>, freebsd-fs@freebsd.org, zkirsch@freebsd.org, Ronald Klop <ronald-freebsd8@klop.yi.org>
Subject: Re: [RFC] Should vfs.nfsrv.async be implemented for new NFS server?
Message-ID: <1093662212.1257099.1320626836299.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <4EB6B4E9.1000804@tcbug.org>
Josh Paetzel wrote:
> On 11/06/11 07:34, Rick Macklem wrote:
> > Ronald Klop wrote:
> >> On Sun, 06 Nov 2011 02:18:05 +0100, Rick Macklem
> >> <rmacklem@uoguelph.ca> wrote:
> >>
> >>> Hi,
> >>>
> >>> Josh Paetzel pointed out that vfs.nfsrv.async doesn't exist
> >>> for the new NFS server.
> >>>
> >>> I don't think I had spotted this before, but when I looked I
> >>> saw that, when vfs.nfsrv.async is set non-zero in the old server,
> >>> it returns FILESYNC (which means the write has been committed to
> >>> non-volatile storage) even when it hasn't actually done that.
> >>>
> >>> This can improve performance, but has some negative implications:
> >>> - If the server crashes before the write is committed to
> >>>   non-volatile storage, the file modification will be lost.
> >>>   (When a server replies UNSTABLE to a write, the client holds
> >>>   onto the data in its cache and does the write again if the
> >>>   server crashes/reboots before the client does a Commit RPC
> >>>   for the file. However, a reply of FILESYNC tells the client
> >>>   it can forget about the write, because it is done.)
> >>> - Because of the above, replying FILESYNC when the data is not
> >>>   yet committed to non-volatile (also referred to as stable)
> >>>   storage, this is a violation of RFC1813.
> >>
> >> Just out of curiosity. Why would lying about FILESYNC improve
> >> performance over UNSTABLE? The server does the same work. Only the
> >> client holds data longer in memory. I only see impact if the client
> >> has just a little bit of memory.
> >>
> >> Ronald.
> > Well, I'm not sure I have an answer. Josh noted that it makes a big
> > difference for them. Maybe he can elaborate?
> >
> > I'll test it out and report back in the next week or so.
>
> In 8.x, setting the async sysctl was the difference between
> 80-100MB/sec and 800 MB/sec (Yes, MegaBytes!) using a variety of
> different clients, including the VMWare ESXi 4.x client, Xen 5.6
> client, various linux clients and the FreeBSD client. I'll note that
> 800MB/sec is getting close to the underlying filesystem performance,
> so it's likely that the gate to performance is in the filesystem in
> that case. 80-100MB/sec is basically gigE performance.
>
Just wondering...are these tests writing a file larger than the
buffer cache can hold?

rick

> I can make hardware available if anyone is curious at poking at this,
> we have the ability to set up tests with quad gigE LACP, 10 gigE, and
> numerous clients.
>
> > One additional effect is that the client in head must do a
> > synchronous write (with FILESYNC and waiting for the RPC reply)
> > before it can modify a non-continuous region of the same buffer
> > with respect to the old dirty byte region. (This happens
> > frequently during builds, done mostly by the loader, I think?)
> > If the server replies FILESYNC, then the old dirty byte region is
> > done (ie. no longer a dirty byte region) so the client doesn't
> > have to do the synchronous write described above.
> > I hope that the experimental patch I made available a few days ago,
> > along with work jhb@ is doing will eventually fix this for the
> > FreeBSD client, but it won't be in head anytime soon (and who knows
> > what other clients do?).
> >
> > rick
> >
>
> --
> Thanks,
>
> Josh Paetzel
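
[Editorial illustration, not part of the original message.] The crash scenario described in the quoted text can be sketched as a toy model. This is only a simulation under stated assumptions, not FreeBSD code: ToyServer, ToyClient, lie_async and run_scenario are hypothetical names, lie_async stands in for a non-zero vfs.nfsrv.async, and a reboot is detected by comparing a write verifier that changes on each boot, roughly as RFC 1813 describes.

```python
from dataclasses import dataclass, field

UNSTABLE, FILESYNC = "UNSTABLE", "FILESYNC"


@dataclass
class ToyServer:
    """Hypothetical server model: a volatile cache in front of stable storage."""
    lie_async: bool = False                    # stands in for vfs.nfsrv.async != 0
    verifier: int = 1                          # changes on every reboot
    cache: dict = field(default_factory=dict)  # lost on crash
    disk: dict = field(default_factory=dict)   # survives a crash

    def write(self, offset, data):
        # Data only reaches the volatile cache in both modes.
        self.cache[offset] = data
        reply = FILESYNC if self.lie_async else UNSTABLE
        return reply, self.verifier

    def commit(self):
        # Flush everything cached to stable storage.
        self.disk.update(self.cache)
        self.cache.clear()
        return self.verifier

    def crash_and_reboot(self):
        self.cache.clear()
        self.verifier += 1


@dataclass
class ToyClient:
    server: ToyServer
    pending: dict = field(default_factory=dict)  # offset -> (data, verifier)

    def write(self, offset, data):
        reply, verf = self.server.write(offset, data)
        if reply == UNSTABLE:
            # Keep the data until a later commit confirms it.
            self.pending[offset] = (data, verf)
        # On FILESYNC the client forgets the data immediately.

    def commit(self):
        while self.pending:
            verf = self.server.commit()
            # A changed verifier means the server rebooted: resend those writes.
            stale = {o: d for o, (d, v) in self.pending.items() if v != verf}
            self.pending.clear()
            for offset, data in stale.items():
                self.write(offset, data)


def run_scenario(lie_async):
    """Write, crash the server before any commit, then commit."""
    server = ToyServer(lie_async=lie_async)
    client = ToyClient(server)
    client.write(0, b"hello")
    server.crash_and_reboot()
    client.commit()
    return server.disk
```

In this model, run_scenario(False) ends with the data on disk because the client resends it after noticing the new verifier, while run_scenario(True) ends with an empty disk because the client discarded the data on the premature FILESYNC reply. The model says nothing about the throughput difference discussed in the thread.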