Message-ID: <4EB6B4E9.1000804@tcbug.org>
Date: Sun, 06 Nov 2011 08:25:13 -0800
From: Josh Paetzel
To: Rick Macklem
Cc: Josh Paetzel, freebsd-fs@freebsd.org, zkirsch@freebsd.org, Ronald Klop
References: <1391798614.1239830.1320593648931.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <1391798614.1239830.1320593648931.JavaMail.root@erie.cs.uoguelph.ca>
Subject: Re: [RFC] Should vfs.nfsrv.async be implemented for new NFS server?

On 11/06/11 07:34, Rick Macklem wrote:
> Ronald Klop wrote:
>> On Sun, 06 Nov 2011 02:18:05 +0100, Rick Macklem wrote:
>>
>>> Hi,
>>>
>>> Josh Paetzel pointed out that vfs.nfsrv.async doesn't exist
>>> for the new NFS server.
>>>
>>> I don't think I had spotted this before, but when I looked I
>>> saw that, when vfs.nfsrv.async is set non-zero in the old server,
>>> it returns FILESYNC (which means the write has been committed to
>>> non-volatile storage) even when it hasn't actually done that.
>>>
>>> This can improve performance, but has some negative implications:
>>> - If the server crashes before the write is committed to
>>>   non-volatile storage, the file modification will be lost.
>>>   (When a server replies UNSTABLE to a write, the client holds
>>>   onto the data in its cache and does the write again if the
>>>   server crashes/reboots before the client does a Commit RPC
>>>   for the file. However, a reply of FILESYNC tells the client
>>>   it can forget about the write, because it is done.)
>>> - Because of the above, replying FILESYNC when the data is not
>>>   yet committed to non-volatile (also referred to as stable)
>>>   storage, this is a violation of RFC1813.
>>
>> Just out of curiosity. Why would lying about FILESYNC improve
>> performance over UNSTABLE? The server does the same work. Only
>> the client holds data longer in memory. I only see impact if the
>> client has just a little bit of memory.
>>
>> Ronald.
> Well, I'm not sure I have an answer. Josh noted that it makes a big
> difference for them. Maybe he can elaborate?
> I'll test it out and report back in the next week or so.

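For anyone following along, the UNSTABLE versus FILESYNC distinction quoted above can be sketched as a toy client-side cache. This is only an illustration under stated assumptions, not the FreeBSD client code: `ToyClient`, `on_write_reply` and `on_commit_reply` are hypothetical names, and the sketch tracks a single write verifier for simplicity. The `stable_how` values follow RFC 1813, which spells the level FILE_SYNC.

```python
from enum import IntEnum


class StableHow(IntEnum):
    # Values of the stable_how enum in RFC 1813.
    UNSTABLE = 0
    DATA_SYNC = 1
    FILE_SYNC = 2


class ToyClient:
    """Hypothetical client cache; illustrative only."""

    def __init__(self):
        self.pending = {}     # offset -> data still held for a possible resend
        self.verifier = None  # write verifier seen in the last UNSTABLE reply

    def on_write_reply(self, offset, data, committed, verifier):
        if committed == StableHow.UNSTABLE:
            # Server has not promised stable storage: keep our copy.
            self.pending[offset] = data
            self.verifier = verifier
        else:
            # Server claims the data is on stable storage: forget it.
            # If the server lied (the async case), a crash loses this data.
            self.pending.pop(offset, None)

    def on_commit_reply(self, verifier):
        """Return the (offset, data) pairs that must be written again."""
        if verifier != self.verifier:
            # Verifier changed, so the server rebooted: resend everything held.
            return sorted(self.pending.items())
        self.pending.clear()
        return []
```

With a FILE_SYNC reply the client has nothing left to resend, which is why a server that answers FILE_SYNC without committing the data leaves no way to recover after a crash.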
In 8.x, setting the async sysctl was the difference between 80-100 MB/sec
and 800 MB/sec (Yes, MegaBytes!) using a variety of different clients,
including the VMware ESXi 4.x client, Xen 5.6 client, various Linux
clients and the FreeBSD client.

I'll note that 800 MB/sec is getting close to the underlying filesystem
performance, so it's likely that the gate to performance is in the
filesystem in that case. 80-100 MB/sec is basically gigE performance.

I can make hardware available if anyone is curious about poking at this.
We have the ability to set up tests with quad gigE LACP, 10 gigE, and
numerous clients.

> One additional effect is that the client in head must do a synchronous
> write (with FILESYNC and waiting for the RPC reply) before it can
> modify a non-continuous region of the same buffer with respect to
> the old dirty byte region. (This happens
> frequently during builds, done mostly by the loader, I think?)
> If the server replies FILESYNC, then the old dirty byte region is done
> (ie. no longer a dirty byte region) so the client doesn't
> have to do the synchronous write described above.
> I hope that the experimental patch I made available a few days ago,
> along with work jhb@ is doing will eventually fix this for the FreeBSD
> client, but it won't be in head anytime soon (and who knows what
> other clients do?).
>
> rick
>

--
Thanks,

Josh Paetzel
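A rough sketch of the dirty byte region effect Rick describes above, assuming a buffer that can track only one dirty range at a time. `ToyBuffer` and its methods are hypothetical names for illustration, not the actual FreeBSD client structures. A write that overlaps or touches the existing dirty range is merged into it, while a write that leaves a gap forces the old range out synchronously first.

```python
class ToyBuffer:
    """Hypothetical cache block with a single dirty byte range [start, end)."""

    def __init__(self):
        self.dirty = None     # None, or a (start, end) tuple
        self.sync_writes = 0  # count of forced synchronous flushes

    def write(self, start, end):
        if self.dirty is None:
            self.dirty = (start, end)
            return
        d_start, d_end = self.dirty
        if start <= d_end and end >= d_start:
            # Overlapping or adjacent: grow the one dirty range.
            self.dirty = (min(start, d_start), max(end, d_end))
        else:
            # Non-contiguous: the old range must reach the server first.
            self.flush_sync()
            self.dirty = (start, end)

    def flush_sync(self):
        # Stands in for a FILESYNC write that waits for the RPC reply.
        self.sync_writes += 1
        self.dirty = None

    def on_filesync_reply(self):
        # An earlier write came back FILESYNC: the range is no longer dirty.
        self.dirty = None
```

In this model, a FILESYNC reply to an earlier write clears the range ahead of time, so the next non-contiguous write finds nothing to flush and skips the synchronous round trip.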