From owner-freebsd-fs@FreeBSD.ORG Thu Aug 25 23:03:05 2011
Date: Thu, 25 Aug 2011 17:51:15 -0500 (CDT)
From: Bob Friesenhahn
To: John Baldwin
Cc: Rick Macklem, fs@freebsd.org
In-Reply-To: <201108251347.45460.jhb@freebsd.org>
References: <201108251347.45460.jhb@freebsd.org>
Subject: Re: Fixes to allow write clustering of NFS writes from a FreeBSD NFS client

On Thu, 25 Aug 2011, John Baldwin wrote:

> I was doing some analysis of compiles over NFS at work recently and noticed
> from 'iostat 1' on the NFS server that all my NFS writes were always 16k
> writes (meaning that writes were never being clustered). I added some
> debugging sysctls to the NFS client and server code as well as the FFS write
> VOP to figure out the various kind of write requests that were being sent. I
> found that during the NFS compile, the NFS client was sending a lot of
> FILESYNC writes even though nothing in the compile process uses fsync().

A fundamental principle of NFS is that writes are synchronous so that if
the server spontaneously reboots, all the acknowledged writes will still
be present on disk and the client just continues (after a delay) without
loss/corruption of data. NFSv3 added the ability to send uncommitted
data to the server, with the agreement that the client would agree to
re-send any uncommitted data if the server spontaneously rebooted. Most
clients are not responsibly prepared to participate in this since it
would require some non-volatile local storage on the client. I don't
know if your changes would harm these expectations.

Regardless, there is little doubt that the default client NFS in FreeBSD
8.2 suffers quite a lot in sequential write performance as compared with
an OS like Solaris. Hopefully the new NFS that Rick Macklem has been
working on (and is apparently ready for general use) will perform much
better. Since FreeBSD is switching to the new implementation it seems
like that is where the efforts should be going.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
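
To make the NFSv3 agreement described above concrete, here is a rough toy model of unstable writes followed by a commit. It is only a sketch and not the FreeBSD client or server code: the class names ToyServer and ToyClient are made up for illustration, and the write verifier is modelled as a simple boot counter (in the protocol it is an opaque 8-byte value that changes when the server loses uncommitted data). The client keeps its uncommitted data in ordinary memory here, which is exactly the weakness mentioned above if the client itself goes down.

```python
from itertools import count

# stable_how values as defined for the NFSv3 WRITE procedure (RFC 1813)
UNSTABLE, FILE_SYNC = 0, 2

_boot_counter = count(1)  # assumption: verifier modelled as a boot counter


class ToyServer:
    """Hypothetical server: 'disk' survives a reboot, 'cache' does not."""

    def __init__(self):
        self.disk = {}
        self.cache = {}
        self.verifier = next(_boot_counter)

    def write(self, offset, data, stable):
        if stable == FILE_SYNC:
            self.disk[offset] = data      # on stable storage before the reply
        else:
            self.cache[offset] = data     # may be lost if the server reboots
        return self.verifier

    def commit(self):
        self.disk.update(self.cache)      # flush cached writes to disk
        self.cache.clear()
        return self.verifier

    def reboot(self):
        self.cache.clear()                # uncommitted data is gone
        self.verifier = next(_boot_counter)


class ToyClient:
    """Hypothetical client that holds data until the server commits it."""

    def __init__(self, server):
        self.server = server
        self.uncommitted = {}             # offset -> (data, verifier seen)

    def write_unstable(self, offset, data):
        verf = self.server.write(offset, data, UNSTABLE)
        self.uncommitted[offset] = (data, verf)

    def commit(self):
        """Commit everything; return how many writes had to be re-sent."""
        resent = 0
        while self.uncommitted:
            verf = self.server.commit()
            # A verifier mismatch means the server rebooted after that write.
            stale = {off: data
                     for off, (data, seen) in self.uncommitted.items()
                     if seen != verf}
            self.uncommitted = {}
            for off, data in stale.items():
                self.write_unstable(off, data)
                resent += 1
        return resent


if __name__ == "__main__":
    server = ToyServer()
    client = ToyClient(server)
    client.write_unstable(0, b"first 16k block")
    client.write_unstable(16384, b"second 16k block")
    server.reboot()                       # server loses its cached writes
    print("re-sent:", client.commit())
    print("on disk:", sorted(server.disk))
```

A FILE_SYNC write skips all of this bookkeeping at the cost of a synchronous disk write per request, which is the trade-off behind the 16k writes in the quoted iostat observation.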