From: Bob Friesenhahn
Date: Thu, 12 May 2011 18:19:50 -0500 (CDT)
To: Rick Macklem
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS: How to enable cache and logs.

On Thu, 12 May 2011, Rick Macklem wrote:
>> The large write feature of the ZIL is a reason why we should
>> appreciate modern NFS's large-write capability and avoid ancient NFS.
>>
> The size of a write for the new FreeBSD NFS server is limited to
> MAX_BSIZE. It is currently 64K, but I would like to see it much larger.
> I am going to try increasing MAX_BSIZE soon, to see what happens.

ZFS would certainly appreciate 128K, since that is its default block
size. When existing file content is overwritten, writing in properly
aligned 128K blocks is much faster: because ZFS is copy-on-write, a
write that replaces a whole block does not need to read the existing
block first. With a partial overwrite, if the existing block is not
already cached in the ARC, it must be read from the underlying store
before the replacement block can be written. This effect is readily
apparent in benchmarks. In my own benchmarking I have found that 128K
is sufficient, and that using larger multiples of 128K does not gain
much more performance.

When creating a file from scratch, ZFS performs well for async writes
even when a process writes data in chunks smaller than 128K. That might
not be the case for sync writes.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
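
To make the aligned versus partial overwrite distinction concrete, here is a
rough sketch that counts how many 128K blocks a given overwrite replaces in
full and how many it only partly covers. The name classify_overwrite and the
RECORD_SIZE constant are made up for this illustration and are not part of
ZFS or NFS. It assumes the dataset uses the default 128K block size, and it
ignores end-of-file handling and whether the old block is already in the ARC.

```python
# Hypothetical helper, not a ZFS API: models only the block arithmetic.
RECORD_SIZE = 128 * 1024  # assumed default ZFS block size (128K)


def classify_overwrite(offset, length, record_size=RECORD_SIZE):
    """Return (full, partial) block counts for overwriting
    the byte range [offset, offset + length) of an existing file.

    A "full" block is replaced entirely, so the old contents need not
    be read. A "partial" block is only partly covered, so the old block
    must be available (cached or read from disk) before the new block
    can be written.
    """
    if length <= 0:
        return 0, 0
    end = offset + length
    first = offset // record_size
    last = (end - 1) // record_size
    full = partial = 0
    for rec in range(first, last + 1):
        rec_start = rec * record_size
        rec_end = rec_start + record_size
        if offset <= rec_start and end >= rec_end:
            full += 1
        else:
            partial += 1
    return full, partial


if __name__ == "__main__":
    k = 1024
    print(classify_overwrite(0, 128 * k))       # aligned 128K write
    print(classify_overwrite(0, 64 * k))        # 64K write (the MAX_BSIZE case)
    print(classify_overwrite(64 * k, 128 * k))  # 128K write, misaligned by 64K
```

Under these assumptions, a 64K write into an existing file always lands in
the "partial" column, whereas an aligned 128K write does not; a 128K write
that is misaligned by 64K touches two blocks and covers neither completely.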