Date:      Wed, 24 Mar 2010 12:02:41 -0500 (CDT)
From:      Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
To:        Andrew Snow <als@modulus.org>
Cc:        freebsd-fs@freebsd.org, Andriy Gapon <avg@freebsd.org>
Subject:   Re: on st_blksize value
Message-ID:  <alpine.GSO.2.01.1003241156500.29281@freddy.simplesystems.org>
In-Reply-To: <4BA954A6.9030505@modulus.org>
References:  <4BA8CD21.3000803@freebsd.org> <4BA954A6.9030505@modulus.org>

On Wed, 24 Mar 2010, Andrew Snow wrote:

> Not strictly true: in ZFS the recordsize setting is for the maximum size of a 
> record, it can still write smaller than this.  If you overwrite 1K in the 
> middle of a 128K record then it should just be writing a 1K block.  Each 
> block has its own checksum attached to it so there's no need to recalculate 
> checksums for data that isn't changing.

This is not true, and simple testing will show that it is not.
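
One way to run such a test is sketched below. This is only a rough
sketch: the function name, file size and offsets are made up for
illustration. The idea is to run it against a file on a ZFS dataset
while watching the pool's write traffic with something like
`zpool iostat`. The script only performs the small overwrite and does
not measure anything itself.

```python
import os

def overwrite_in_place(path, file_size=128 * 1024, offset=64 * 1024, length=1024):
    """Create a file, flush it, then overwrite `length` bytes at `offset`.

    Hypothetical helper for illustration; defaults mimic a 1K update in
    the middle of a single 128K record.
    """
    with open(path, "wb") as f:
        f.write(os.urandom(file_size))
        f.flush()
        os.fsync(f.fileno())   # push the initial data to disk first

    patch = os.urandom(length)
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, patch, offset)   # small in-place update
        os.fsync(fd)                   # force the update out so it shows in I/O stats
    finally:
        os.close(fd)
    return patch
```

Random data is used so that compression, if enabled, does not shrink
the blocks and blur the comparison.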

ZFS always writes full recordsize blocks, except that the tail block
of a file is allowed to be smaller.  If compression is enabled, the
block is stored at its compressed size, so the amount actually stored
on disk may be less than the configured recordsize.

Due to ZFS's read-modify-write strategy, it is important for
performance that the data to be modified already be cached in the ARC;
otherwise the whole record must first be read from disk.  Even when it
is cached, there will still be write amplification if the update size
is smaller than the recordsize.
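
To put a rough number on that, here is a back-of-the-envelope sketch.
It assumes every record touched by an update is rewritten in full, and
it ignores compression, a smaller tail block, and metadata. The
function name and the 128K default are illustrative only.

```python
def rewrite_cost(offset, length, recordsize=128 * 1024):
    """Return (bytes_rewritten, amplification) for an overwrite of
    `length` bytes at `offset`, assuming each touched record is
    rewritten in full (simplified model, no compression or metadata)."""
    if length <= 0:
        raise ValueError("length must be positive")
    first = offset // recordsize                  # first record touched
    last = (offset + length - 1) // recordsize    # last record touched
    bytes_rewritten = (last - first + 1) * recordsize
    return bytes_rewritten, bytes_rewritten / length

# 1K in the middle of a 128K record
print(rewrite_cost(64 * 1024, 1024))          # (131072, 128.0)
# 1K straddling a record boundary touches two records
print(rewrite_cost(128 * 1024 - 512, 1024))   # (262144, 256.0)
```

Under these assumptions the 1K update from the example above costs
128K of writes, and twice that if it happens to cross a record
boundary.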

Bob
--
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/


