From owner-freebsd-fs@FreeBSD.ORG Tue Feb 5 14:40:17 2008
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 43E4F16A498;
	Tue, 5 Feb 2008 14:40:17 +0000 (UTC)
	(envelope-from avg@icyb.net.ua)
Received: from falcon.cybervisiontech.com (falcon.cybervisiontech.com [217.20.163.9])
	by mx1.freebsd.org (Postfix) with ESMTP id A74CC13C4E8;
	Tue, 5 Feb 2008 14:40:16 +0000 (UTC)
	(envelope-from avg@icyb.net.ua)
Received: from localhost (localhost [127.0.0.1])
	by falcon.cybervisiontech.com (Postfix) with ESMTP id C63B443DC32;
	Tue, 5 Feb 2008 16:40:15 +0200 (EET)
X-Virus-Scanned: Debian amavisd-new at falcon.cybervisiontech.com
Received: from falcon.cybervisiontech.com ([127.0.0.1])
	by localhost (falcon.cybervisiontech.com [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id F77-1TpguA4B; Tue, 5 Feb 2008 16:40:15 +0200 (EET)
Received: from [10.2.1.87] (gateway.cybervisiontech.com.ua [88.81.251.18])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by falcon.cybervisiontech.com (Postfix) with ESMTP id 46AF743D33F;
	Tue, 5 Feb 2008 16:40:14 +0200 (EET)
Message-ID: <47A8754C.5010607@icyb.net.ua>
Date: Tue, 05 Feb 2008 16:40:12 +0200
From: Andriy Gapon
User-Agent: Thunderbird 2.0.0.9 (X11/20080123)
MIME-Version: 1.0
To: pav@FreeBSD.org
References: <200612221824.kBMIOhfM049471@freefall.freebsd.org>
	<47A2EDB0.8000801@icyb.net.ua> <47A2F404.7010208@icyb.net.ua>
	<47A735A4.3060506@icyb.net.ua> <47A75B47.2040604@elischer.org>
	<1202155663.62432.0.camel@ikaros.oook.cz>
In-Reply-To: <1202155663.62432.0.camel@ikaros.oook.cz>
Content-Type: multipart/mixed;
	boundary="------------060803080903090508090302"
Cc: Bruce Evans, freebsd-hackers@FreeBSD.org, scottl@FreeBSD.org,
	freebsd-fs@FreeBSD.org, Julian Elischer, Remko Lodder
Subject: Re: fs/udf: vm pages "overlap" while reading large dir [patch]
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Tue, 05 Feb 2008 14:40:17 -0000

This is a multi-part message in MIME format.
--------------060803080903090508090302
Content-Type: text/plain; charset=ISO-8859-2
Content-Transfer-Encoding: 8bit

on 04/02/2008 22:07 Pav Lucistnik said the following:
> Julian Elischer wrote on Mon, 04 Feb 2008 at 10:36 -0800:
>> Andriy Gapon wrote:
>>> More on the problem with reading big directories on UDF.
>> You do realise that you have now made yourself the official
>> maintainer of the UDF file system by submitting a competent
>> and insightful analysis of the problem?
>
> Yay, and can you fix the sequential read performance while you're at it?
> Kthx!
>

Pav,

this was almost trivial :-)
See the attached patch; the first hunk is just for consistency.
The code was borrowed from cd9660, with only the field and variable names
adjusted.

But there is another issue, which I also mentioned in the email about
directory reading. It is the UDF_INVALID_BMAP case of udf_bmap_internal(),
i.e. the case when file data is embedded into the file entry itself. This
is a special case that needs to be handled differently:
udf_readatoffset() handles it, but the latest udf_read code doesn't.
I have a real UDF filesystem where this type of allocation is used for
small files, and those files cannot be read.

This allocation type is described in Part 4, section 14.6.8 of ECMA-167.

-- 
Andriy Gapon

--------------060803080903090508090302
Content-Type: text/x-patch;
 name="udf_ra.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="udf_ra.patch"

--- udf_vnops.c.orig	2008-01-29 23:50:49.000000000 +0200
+++ udf_vnops.c	2008-02-05 01:30:23.000000000 +0200
@@ -851,7 +846,7 @@ udf_bmap(struct vop_bmap_args *a)
 	if (a->a_runb)
 		*a->a_runb = 0;
 
-	error = udf_bmap_internal(node, a->a_bn * node->udfmp->bsize, &lsector,
+	error = udf_bmap_internal(node, a->a_bn << node->udfmp->bshift, &lsector,
 	    &max_size);
 	if (error)
 		return (error);
@@ -859,9 +854,27 @@ udf_bmap(struct vop_bmap_args *a)
 	/* Translate logical to physical sector number */
 	*a->a_bnp = lsector << (node->udfmp->bshift - DEV_BSHIFT);
 
-	/* Punt on read-ahead for now */
-	if (a->a_runp)
-		*a->a_runp = 0;
+	/*
+	 * Determine maximum number of readahead blocks following the
+	 * requested block.
+	 */
+	if (a->a_runp) {
+		off_t fsize;
+		int nblk;
+
+		fsize = le64toh(node->fentry->inf_len);
+		nblk = (fsize >> node->udfmp->bshift) - (a->a_bn + 1);
+		if (nblk <= 0)
+			*a->a_runp = 0;
+		else if (nblk >= (MAXBSIZE >> node->udfmp->bshift))
+			*a->a_runp = (MAXBSIZE >> node->udfmp->bshift) - 1;
+		else
+			*a->a_runp = nblk;
+	}
+
+	if (a->a_runb) {
+		*a->a_runb = 0;
+	}
 
 	return (0);
 }
--- udf_vfsops.c.orig	2007-03-13 03:50:24.000000000 +0200
+++ udf_vfsops.c	2008-02-05 01:29:10.000000000 +0200
@@ -330,6 +330,11 @@ udf_mountfs(struct vnode *devvp, struct 
 
 	bo = &devvp->v_bufobj;
 
+	if (devvp->v_rdev->si_iosize_max != 0)
+		mp->mnt_iosize_max = devvp->v_rdev->si_iosize_max;
+	if (mp->mnt_iosize_max > MAXPHYS)
+		mp->mnt_iosize_max = MAXPHYS;
+
 	/* XXX: should be M_WAITOK */
 	MALLOC(udfmp, struct udf_mnt *, sizeof(struct udf_mnt), M_UDFMOUNT,
 	    M_NOWAIT | M_ZERO);

--------------060803080903090508090302--
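
For readers who want to try the read-ahead arithmetic from the udf_bmap() hunk outside the kernel, here is a minimal userland sketch of the same clamping logic. The function name readahead_blocks() and the constant SKETCH_MAXBSIZE are hypothetical, introduced only for this illustration; the value 65536 is an assumption and not necessarily what a given kernel uses for MAXBSIZE. It is not the patch itself, only a restatement of its three-way branch.

```c
#include <stdint.h>

/* Assumed value, for illustration only; the kernel has its own MAXBSIZE. */
#define SKETCH_MAXBSIZE 65536

/*
 * Hypothetical helper: number of blocks that may be read ahead after
 * logical block bn, for a file of fsize bytes on a filesystem whose
 * block size is (1 << bshift) bytes.  Mirrors the branch structure of
 * the patch: 0 at or past the last whole block, otherwise the number
 * of remaining whole blocks, capped at (SKETCH_MAXBSIZE >> bshift) - 1.
 */
static int
readahead_blocks(int64_t fsize, int bshift, int64_t bn)
{
	int64_t max_run = SKETCH_MAXBSIZE >> bshift;
	int64_t nblk = (fsize >> bshift) - (bn + 1);

	if (nblk <= 0)
		return (0);
	if (nblk >= max_run)
		return ((int)(max_run - 1));
	return ((int)nblk);
}
```

With 2048-byte blocks (bshift of 11) and the assumed 64 KB limit, a request near the start of a large file is capped at 31 blocks, while a request a few blocks before the end of the file gets only the blocks that remain. Only whole blocks are counted, since the file size is shifted right before the subtraction.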