From owner-freebsd-fs@FreeBSD.ORG Fri May 13 00:03:40 2011
Date: Thu, 12 May 2011 20:03:39 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Bob Friesenhahn
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS: How to enable cache and logs.

> On Thu, 12 May 2011, Rick Macklem wrote:
>
> >> The large write feature of the ZIL is a reason why we should
> >> appreciate modern NFS's large-write capability and avoid ancient
> >> NFS.
> >>
> > The size of a write for the new FreeBSD NFS server is limited to
> > MAX_BSIZE. It is currently 64K, but I would like to see it much
> > larger. I am going to try increasing MAX_BSIZE soon, to see what
> > happens.
>
> ZFS would certainly appreciate 128K, since that is its default block
> size. When existing file content is overwritten, writing in properly
> aligned 128K blocks is much faster because of ZFS's COW algorithm:
> the existing block never has to be read. With a partial "overwrite",
> if the existing block is not already cached in the ARC, it must be
> read from the underlying store before the replacement block can be
> written. This effect becomes readily apparent in benchmarks. In my
> own benchmarking I have found that 128K is sufficient; using larger
> multiples of 128K does not gain much more performance.
>
> When creating a file from scratch, ZFS performs well for async
> writes even when a process writes in chunks smaller than 128K. That
> might not be the case for sync writes.
>
Yep, I think sizes greater than 128K might only benefit WAN
connections with a larger bandwidth * delay product. A larger size
also helps to expose "not so great" network interfaces/drivers. When
I used 128K on the Mac OS X port, it worked great for some Macs and
horribly for others. Some Macs would drop packets when they saw a
burst of read traffic (the Mac was the client and the server was
Solaris 10, which handles NFS read/write sizes up to 1Mbyte) and
wouldn't perform well above 32Kbytes (for a now rather old port to
Leopard).

rick
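
ps: To put a rough number on the bandwidth * delay product point (the
link figures here are hypothetical, just for illustration): on a
100Mbit/s WAN path with a 50ms round-trip time,

    BDP = 12.5Mbyte/s * 0.05s = 625Kbytes

so a client limited to 64K per write RPC needs roughly ten RPCs in
flight just to keep that pipe full, while 128K halves that. On a
low-latency LAN the product is far smaller, which is why sizes beyond
128K mostly matter for WAN mounts.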
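
And here's a minimal sketch of the kind of benchmark Bob describes
(the file name, sizes, and timing approach are all my own choices, not
anything from the thread): it rewrites the same file first in
recordsize-aligned 128K chunks and then in partial 32K chunks, so the
read-modify-write cost of partial-record COW shows up as wall-clock
time.

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define RECORDSIZE (128 * 1024)     /* ZFS default recordsize */
    #define NRECORDS   1024             /* 128Mbyte test file */

    /* Wall-clock seconds. */
    static double
    now(void)
    {
            struct timeval tv;

            gettimeofday(&tv, NULL);
            return (tv.tv_sec + tv.tv_usec / 1e6);
    }

    /* Overwrite the whole file in chunk-byte pwrite()s, then fsync(). */
    static void
    overwrite(int fd, size_t chunk)
    {
            char *buf;
            off_t off;

            buf = malloc(chunk);
            memset(buf, 'x', chunk);
            for (off = 0; off < (off_t)RECORDSIZE * NRECORDS; off += chunk) {
                    if (pwrite(fd, buf, chunk, off) != (ssize_t)chunk) {
                            perror("pwrite");
                            exit(1);
                    }
            }
            fsync(fd);
            free(buf);
    }

    int
    main(void)
    {
            double t;
            int fd;

            /* "testfile" is a hypothetical name; it should live on
               the ZFS dataset being tested. */
            fd = open("testfile", O_RDWR | O_CREAT, 0644);
            if (fd < 0) {
                    perror("open");
                    return (1);
            }

            overwrite(fd, RECORDSIZE);      /* populate the file */

            t = now();
            overwrite(fd, RECORDSIZE);      /* recordsize-aligned rewrite */
            printf("aligned 128K rewrite: %.2f sec\n", now() - t);

            t = now();
            overwrite(fd, 32 * 1024);       /* partial-record rewrite */
            printf("partial 32K rewrite:  %.2f sec\n", now() - t);

            close(fd);
            return (0);
    }

Note that the partial-record pass only pays the extra read once the
original blocks are no longer in the ARC, as Bob says, so the file
needs to be much larger than the ARC (or the cache dropped first, e.g.
by exporting and re-importing the pool) for the difference to show.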