Date: Sun, 06 Nov 2011 08:25:13 -0800
From: Josh Paetzel <josh@tcbug.org>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: Josh Paetzel <jpaetzel@freebsd.org>, freebsd-fs@freebsd.org, zkirsch@freebsd.org, Ronald Klop <ronald-freebsd8@klop.yi.org>
Subject: Re: [RFC] Should vfs.nfsrv.async be implemented for new NFS server?
Message-ID: <4EB6B4E9.1000804@tcbug.org>
In-Reply-To: <1391798614.1239830.1320593648931.JavaMail.root@erie.cs.uoguelph.ca>
References: <1391798614.1239830.1320593648931.JavaMail.root@erie.cs.uoguelph.ca>

On 11/06/11 07:34, Rick Macklem wrote:
> Ronald Klop wrote:
>> On Sun, 06 Nov 2011 02:18:05 +0100, Rick Macklem
>> <rmacklem@uoguelph.ca> wrote:
>>
>>> Hi,
>>>
>>> Josh Paetzel pointed out that vfs.nfsrv.async doesn't exist
>>> for the new NFS server.
>>>
>>> I don't think I had spotted this before, but when I looked I
>>> saw that, when vfs.nfsrv.async is set non-zero in the old server,
>>> it returns FILESYNC (which means the write has been committed to
>>> non-volatile storage) even when it hasn't actually done that.
>>>
>>> This can improve performance, but has some negative implications:
>>> - If the server crashes before the write is committed to
>>>   non-volatile storage, the file modification will be lost.
>>>   (When a server replies UNSTABLE to a write, the client holds
>>>   onto the data in its cache and does the write again if the
>>>   server crashes/reboots before the client does a Commit RPC
>>>   for the file. However, a reply of FILESYNC tells the client
>>>   it can forget about the write, because it is done.)
>>> - Because of the above, replying FILESYNC when the data is not
>>>   yet committed to non-volatile (also referred to as stable)
>>>   storage, this is a violation of RFC1813.
>>
>> Just out of curiosity. Why would lying about FILESYNC improve
>> performance over UNSTABLE? The server does the same work. Only the
>> client holds data longer in memory. I only see impact if the client
>> has just a little bit of memory.
>>
>> Ronald.
> Well, I'm not sure I have an answer. Josh noted that it makes a big
> difference for them. Maybe he can elaborate?
>

I'll test it out and report back in the next week or so. In 8.x,
setting the async sysctl made the difference between 80-100 MB/sec and
800 MB/sec (yes, megabytes!) using a variety of different clients,
including the VMware ESXi 4.x client, the Xen 5.6 client, various Linux
clients and the FreeBSD client.

I'll note that 800 MB/sec is getting close to the underlying filesystem
performance, so the bottleneck in that case is likely the filesystem.
80-100 MB/sec is basically gigE performance.

I can make hardware available if anyone is interested in poking at
this. We have the ability to set up tests with quad gigE LACP, 10 gigE,
and numerous clients.

> One additional effect is that the client in head must do a synchronous
> write (with FILESYNC and waiting for the RPC reply) before it can
> modify a non-continuous region of the same buffer with respect to
> the old dirty byte region. (This happens
> frequently during builds, done mostly by the loader, I think?)
> If the server replies FILESYNC, then the old dirty byte region is done
> (ie. no longer a dirty byte region) so the client doesn't
> have to do the synchronous write described above.
> I hope that the experimental patch I made available a few days ago,
> along with work jhb@ is doing will eventually fix this for the FreeBSD
> client, but it won't be in head anytime soon (and who knows what
> other clients do?).
>
> rick
>

-- 
Thanks,

Josh Paetzel
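
For anyone following the UNSTABLE vs. FILESYNC discussion quoted above, here is a rough toy model of the crash case Rick describes. It is only a sketch, not the FreeBSD code: ToyServer, ToyClient, lie_filesync and run are made-up names, and the only protocol behaviour assumed is what RFC 1813 specifies (an UNSTABLE write is kept by the client until a Commit succeeds, and a changed write verifier after a reboot makes the client resend it).

```python
from dataclasses import dataclass, field
from enum import Enum


class Stable(Enum):
    UNSTABLE = 0
    FILE_SYNC = 2


@dataclass
class ToyServer:
    """Hypothetical server; lie_filesync mimics vfs.nfsrv.async != 0."""
    lie_filesync: bool = False
    verifier: int = 1                           # changes on every reboot
    volatile: dict = field(default_factory=dict)
    stable: dict = field(default_factory=dict)

    def write(self, offset, data):
        # The data only reaches volatile memory in both modes.
        self.volatile[offset] = data
        how = Stable.FILE_SYNC if self.lie_filesync else Stable.UNSTABLE
        return how, self.verifier

    def commit(self):
        # Flush everything to stable storage and return the verifier.
        self.stable.update(self.volatile)
        self.volatile.clear()
        return self.verifier

    def crash(self):
        # Uncommitted data is lost and the verifier changes.
        self.volatile.clear()
        self.verifier += 1


@dataclass
class ToyClient:
    server: ToyServer
    pending: dict = field(default_factory=dict)  # offset -> (data, verifier)

    def write(self, offset, data):
        how, verf = self.server.write(offset, data)
        if how is Stable.UNSTABLE:
            # Keep the data until a Commit confirms it is on stable storage.
            self.pending[offset] = (data, verf)
        # On FILE_SYNC the client is allowed to forget the data.

    def commit(self):
        while self.pending:
            verf = self.server.commit()
            # A verifier mismatch means the server rebooted: resend.
            stale = {o: d for o, (d, v) in self.pending.items() if v != verf}
            self.pending.clear()
            for off, data in stale.items():
                self.write(off, data)


def run(lie_filesync):
    """Write, crash the server before any commit, then commit."""
    server = ToyServer(lie_filesync=lie_filesync)
    client = ToyClient(server)
    client.write(0, b"abc")
    server.crash()
    client.commit()
    return server.stable
```

With an honest server, run(False) ends with the data on stable storage because the client resends it after noticing the verifier change. With run(True) the client has already discarded the data, so the write is silently lost.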