From: Andriy Gapon
Date: Wed, 31 Mar 2010 17:54:08 +0300
To: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: on st_blksize value
Message-ID: <4BB36210.5040102@freebsd.org>
In-Reply-To: <4BA8CD21.3000803@freebsd.org>

on 23/03/2010 16:16 Andriy Gapon said the following:
> First, what I am proposing:
> --- a/sys/kern/vfs_vnops.c
> +++ b/sys/kern/vfs_vnops.c
> @@ -790,11 +790,11 @@ vn_stat(vp, sb, active_cred, file_cred, td)
>           *   to file"
>           * Default to PAGE_SIZE after much discussion.
>           * XXX: min(PAGE_SIZE, vp->v_bufobj.bo_bsize) may be more correct.
>           */
>
> -        sb->st_blksize = PAGE_SIZE;
> +        sb->st_blksize = max(PAGE_SIZE, vap->va_blocksize);

If no one has objections, suggestions or opinions, I am going to commit this.
I will probably change the scary comment too.
>
>          sb->st_flags = vap->va_flags;
>          if (priv_check(td, PRIV_VFS_GENERATION))
>                  sb->st_gen = 0;
>          else
>
> Explanation:
> 1. IMO it is not nice that we totally ignore the va_blocksize value that can
> be set by a filesystem. This takes away flexibility. That va_blocksize value
> might really turn out to be optimal given the filesystem implementation.
> 2. Since st_blksize is currently always PAGE_SIZE, it is playing safe not to
> use any smaller value. In some cases this might not be optimal (which I
> personally doubt), but at least nothing should get broken.
>
> One practical benefit can be with ZFS: if a filesystem has recordsize >
> PAGE_SIZE (e.g. the default 128K) and it has checksums or compression enabled,
> then (over-)writing in blocks smaller than recordsize would require reading a
> whole record first. And some applications do use st_blksize as a hint (just
> for the record: others use f_iosize instead, and yet others use a hardcoded
> value).
> BTW, some torrent-like applications can serve as a good example of
> applications that overwrite chunks of existing files.
>
> Additionally, here's a little bit of history that explains the PAGE_SIZE
> ("much discussion") comment in vn_stat. It seems that the comment may be
> misleading nowadays.
> It was introduced in r89784 and at that time it applied only to the case of
> non-VREG and non-vn_isdisk vnodes.
> Then, almost 3 years later, in revision 136966, the code for VREG vnodes and
> vn_isdisk vnodes was dropped, the XXX comment was introduced, and we ended up
> with the current state of matters.
>
> BTW, I am not sure about the XXX comment either.
> Using bo_bsize may be a nice shortcut, but it would also take away some
> flexibility. Filesystems can already set bo_bsize and va_blocksize to the
> same value, but there could be special cases where they need not be the same.
>
> Thanks a lot for opinions and suggestions!
>
> P.S.
> Yes, I have read the following interesting thread _completely_:
> http://lists.freebsd.org/pipermail/freebsd-fs/2007-May/003155.html
> And this one too:
> http://freebsd.monkey.org/freebsd-fs/200810/msg00059.html
> Unfortunately, the discussions didn't result in any action.
>

-- 
Andriy Gapon
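
For illustration only, here is a small userland sketch of the two points made
above. It is not kernel code. sketch_blksize() models the proposed
max(PAGE_SIZE, va_blocksize) selection, and sketch_partial_records() counts how
many records an overwrite covers only partially, which are the records that
would have to be read back first when checksums or compression are enabled.
The function names and the 4096-byte page size are assumptions made for the
example; the overwrite is assumed to fall entirely inside existing file data.

```c
#include <stdint.h>

/* Assumed page size for the sketch; the real PAGE_SIZE is per-architecture. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical model of the proposed assignment:
 *     sb->st_blksize = max(PAGE_SIZE, vap->va_blocksize);
 * Never returns less than the page size, so existing behaviour is a floor.
 */
uint64_t
sketch_blksize(uint64_t va_blocksize)
{
	return (va_blocksize > SKETCH_PAGE_SIZE ?
	    va_blocksize : SKETCH_PAGE_SIZE);
}

/*
 * Count the records that an overwrite of len bytes at offset off covers
 * only partially.  Each such record needs a read before the write can be
 * completed.  recsize must be non-zero.
 */
unsigned
sketch_partial_records(uint64_t off, uint64_t len, uint64_t recsize)
{
	uint64_t end, first, last;
	unsigned n = 0;

	if (len == 0)
		return (0);

	end = off + len;		/* one past the last byte written */
	first = off / recsize;		/* index of first record touched */
	last = (end - 1) / recsize;	/* index of last record touched */

	/* The first record is partial if the write starts mid-record. */
	if (off % recsize != 0)
		n++;

	/*
	 * The last record is partial if the write ends mid-record.  Skip it
	 * when it is the same record already counted as the first one.
	 */
	if (end % recsize != 0 && (last != first || off % recsize == 0))
		n++;

	return (n);
}
```

With recsize = 131072, a 4096-byte overwrite at offset 0 leaves one record
partially covered, while an aligned 131072-byte overwrite leaves none, which
is the access pattern a larger st_blksize hint is meant to encourage.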