From: Bob Friesenhahn
Date: Wed, 24 Mar 2010 12:02:41 -0500 (CDT)
To: Andrew Snow
Cc: freebsd-fs@freebsd.org, Andriy Gapon
Subject: Re: on st_blksize value
In-Reply-To: <4BA954A6.9030505@modulus.org>
References: <4BA8CD21.3000803@freebsd.org> <4BA954A6.9030505@modulus.org>

On Wed, 24 Mar 2010, Andrew Snow wrote:

> Not strictly true: in ZFS the recordsize setting is for the maximum size of a
> record, it can still write smaller than this.  If you overwrite 1K in the
> middle of a 128K record then it should just be writing a 1K block.  Each
> block has its own checksum attached to it so there's no need to recalculate
> checksums for data that isn't changing.

This is not true.
In fact, simple testing will show that it is clearly not true.

ZFS will always write recordsize blocks except that the tail block is
allowed to be smaller.  If compression is enabled, the block is stored
in its compressed size, so the amount actually stored on disk may be
less than the established recordsize.

Due to ZFS's read-modify-write strategy, it is important to performance
that the data to be modified be cached in the ARC.  There will still be
write amplification if the update size is smaller than the recordsize.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
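
To put a rough number on the write amplification described above, here is a
minimal back-of-the-envelope sketch. It is not ZFS code and does not measure
anything. It assumes compression is off, that the update does not touch the
(possibly shorter) tail block, and that every record overlapped by the update
is rewritten in full. It ignores metadata and checksum-tree updates. The
function names are made up for illustration.

```python
def records_touched(offset: int, length: int, recordsize: int) -> int:
    """Count the recordsize-aligned records overlapped by a byte range."""
    if length <= 0:
        return 0
    first = offset // recordsize
    last = (offset + length - 1) // recordsize
    return last - first + 1


def write_amplification(offset: int, length: int,
                        recordsize: int = 128 * 1024) -> float:
    """Bytes rewritten divided by bytes the application asked to update.

    Assumption: each overlapped record is rewritten at full recordsize
    (no compression, no short tail block).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    rewritten = records_touched(offset, length, recordsize) * recordsize
    return rewritten / length


if __name__ == "__main__":
    # 1K overwrite in the middle of a single 128K record.
    print(write_amplification(64 * 1024, 1024))        # 128.0
    # 1K overwrite straddling a record boundary touches two records.
    print(write_amplification(128 * 1024 - 512, 1024))  # 256.0
```

Under these assumptions, the 1K-in-128K case from the quoted message rewrites
a full 128K record, i.e. 128 times the data the application changed. With
compression enabled the on-disk figure could be lower.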