From: Andriy Gapon
Date: Wed, 24 Mar 2010 08:10:02 +0200
To: Andrew Snow
Cc: freebsd-fs@freebsd.org
Subject: Re: on st_blksize value
Message-ID: <4BA9ACBA.4080608@freebsd.org>
In-Reply-To: <4BA954A6.9030505@modulus.org>
References: <4BA8CD21.3000803@freebsd.org> <4BA954A6.9030505@modulus.org>

on 24/03/2010 01:54 Andrew Snow said the following:
> Andriy Gapon wrote:
>
>> One practical benefit can be with ZFS: if a filesystem has recordsize > PAGE_SIZE
>> (e.g. default 128K) and it has checksums or compression enabled, then
>> (over-)writing in blocks smaller than recordsize would require reading of a whole
>> record first.
>
> Not strictly true: in ZFS the recordsize setting is for the maximum size
> of a record, it can still write smaller than this.  If you overwrite 1K
> in the middle of a 128K record then it should just be writing a 1K
> block.  Each block has its own checksum attached to it so there's no
> need to recalculate checksums for data that isn't changing.

I must admit that I know almost zero about ZFS internals, but I see a logical
problem in your explanation: if the original data was written as a single 128K
block, and changing a 1K range within it results in a new 1K block, then the
original block is still affected, because it has to account for the fact that
the range is now stored in a different block.

Perhaps I am just misunderstanding what you said.  Or were you perhaps referring
to the case of (over)writing a small _file_, as opposed to the case of
overwriting a small range within a large file?

-- 
Andriy Gapon
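
P.S. To make the scenario under discussion concrete, here is a minimal Python
sketch of the arithmetic only. It assumes the model described earlier in the
thread (a record that is only partially overwritten has to be read in full
first); whether ZFS really behaves this way is exactly the open question, so
this is not a statement about ZFS internals. The names rmw_cost and
preferred_io_size are hypothetical helpers made up for illustration, and the
sketch ignores records that lie beyond the end of the file.

```python
import os


def preferred_io_size(path, fallback=4096):
    """Return the st_blksize hint for path, or fallback where the field is missing."""
    return getattr(os.stat(path), "st_blksize", fallback)


def rmw_cost(offset, length, recordsize):
    """Return (records_touched, bytes_to_read_first) for one overwrite.

    Assumed model (not confirmed ZFS behaviour): every record that the
    write covers only partially must be read in full before rewriting.
    """
    if length <= 0:
        return (0, 0)
    end = offset + length                 # one past the last byte written
    first = offset // recordsize          # index of the first record touched
    last = (end - 1) // recordsize        # index of the last record touched

    def fully_covered(r):
        # True when the write spans the whole of record r
        return offset <= r * recordsize and end >= (r + 1) * recordsize

    # Only the first and last records can be partially covered.
    partial = sum(1 for r in {first, last} if not fully_covered(r))
    return (last - first + 1, partial * recordsize)


if __name__ == "__main__":
    K = 1024
    # 1K overwrite in the middle of a 128K record
    print(rmw_cost(64 * K, 1 * K, 128 * K))
    # record-aligned, record-sized overwrite
    print(rmw_cost(0, 128 * K, 128 * K))
```

Under that assumed model, a writer that sizes and aligns its writes to the
value reported in st_blksize would never hit the partial-record case, which is
the practical benefit I had in mind.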