Date: Tue, 12 Feb 2008 00:23:04 +0200 From: Andriy Gapon <avg@icyb.net.ua> To: Poul-Henning Kamp <phk@freebsd.org> Cc: freebsd-fs@FreeBSD.org, freebsd-hackers@FreeBSD.org, Scott Long <scottl@samsco.org>, Pav Lucistnik <pav@FreeBSD.org>, Bruce Evans <bde@zeta.org.au> Subject: Re: fs/udf: vm pages "overlap" while reading large dir Message-ID: <47B0CAC8.2050603@icyb.net.ua> In-Reply-To: <47A9E062.3040409@icyb.net.ua> References: <200612221824.kBMIOhfM049471@freefall.freebsd.org> <47A2EDB0.8000801@icyb.net.ua> <47A2F404.7010208@icyb.net.ua> <47A7339E.4070708@icyb.net.ua> <47A73562.4010309@samsco.org> <47A9E062.3040409@icyb.net.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
on 06/02/2008 18:29 Andriy Gapon said the following: > Small summary of the above long description. > For directory reading fs/udf performs bread() on a (underlying) device > vnode. It passes block number as if block size was 512 bytes (i.e. > byte_offset_within_dev/512). On the other hand vm page index calculation > code uses the following formula: (blkno*bo_bsize)/PAGE_SIZE. > For CD/DVD devices bo_bsize is set to 2048. This results in page index > overlap when reading sufficiently many blocks from adjacent starting blocks. > That is, it seems that in "normal" cases page index gets calculated as > byte_offset/PAGE_SIZE, but in this case page index gets to be > 4*byte_offset/PAGE_SIZE. More details are in the quoted above text. > > With a lot of help from Bruce Evance I found this commit which seems to > have made a big difference: > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/vfs_bio.c.diff?r1=1.458;r2=1.459;f=h > > Before this change page index for device vnodes was always > (blkno*512)/PAGE_SIZE. This is because of the vn_isdisk() case. > > udf_mountfs device vnode is passed to g_vfs_open() (in geom_vfs.c) and > the latter has the following line of code: > bo->bo_bsize = pp->sectorsize; > Where pp is geom provider for the device in question. > > I am not sure if the mentioned above vfs_bio.c change broke something > else in reading from device vnodes or if it fixed something for that. I > am also not sure what would be the consequences of setting bo_bsize to > 512 for vnodes of "disk" devices. It would most probably fix current > code in UDF, but I am not sure if it might break anything else. > > Just wanted to let the more knowledgeable people know and make a decision. > > P.S. bo_bsize seems to be referenced only in a handful of files and in > most of them it seems that it is related to "file" vnodes (as opposed to > "device" vnodes). > Paul, do you have any comment or opinion on the above? [sorry if clamped too much of the previous context, but it's all in archives] And a little continuation: Just for the sake of experiment I tried to emulated the previous behavior by changing geom_vfs.c, function g_vfs_open(), so that the relevant code reads as follows: if (vn_isdisk(vp, NULL)) bo->bo_bsize = DEV_BSIZE; else bo->bo_bsize = pp->sectorsize; It didn't break anything for me and it re-enabled current (CVS) fs/udf code to work as before. (patch from kern/78987 still has to be applied for large directories to be handled). So udf (on DVD) is fixed, ufs (on HDD), msdosfs (on HDD) and cd9660 (on CD) are not broken. Of course, the change should have affected only filesystems on CD/DVD (where sector/block size is not 512 bytes). Of course, unlike Bruce I don't use msdosfs on CD/DVD media (e.g. DVD-RAM), so my test is somewhat incomplete. -- Andriy Gapon
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?47B0CAC8.2050603>