Date: Tue, 31 May 2005 13:05:26 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Dominic Marks <dom@goodforbusiness.co.uk>
Cc: freebsd-fs@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org, banhalmi@field.hu
Subject: Re: i386/68719: [usb] USB 2.0 mobil rack+ fat32 performance problem
Message-ID: <20050531115604.S91592@delplex.bde.org>
In-Reply-To: <200505301609.11857.dom@goodforbusiness.co.uk>
References: <200505271328.58072.dom@goodforbusiness.co.uk> <20050530155609.Q1473@epsplex.bde.org> <20050530193711.I843@epsplex.bde.org> <200505301609.11857.dom@goodforbusiness.co.uk>

On Mon, 30 May 2005, Dominic Marks wrote:

> On Monday 30 May 2005 11:11, Bruce Evans wrote:
>> The main problem is that VOP_BMAP() is not fully implemented for msdosfs.
>> msdosfs_bmap() only has a stub which pretends that clustering is never
>> possible:
>
> If I understand what is supposed to be done here (I looked at cd9660 but
> I don't know if the rules are different from msdos), a_runp should be set
> to the extent of contiguous blocks from the current position within the
> same region? I put some debugging into msdosfs_bmap and here it is copied:

cd9660 is deceptively simple here because (I think) it allocates files in
perfectly contiguous extents. msdosfs, ffs^ufs and ext2fs have to do
considerable work to map even a single block. The details are in pcbmap()
for msdosfs. (The name of this function dates from when msdosfs was named
pcfs.) I think msdosfs_bmap() just needs to call this function for each
block following the start block until a discontiguity is hit or a limit (*)
is reached.

ufs and ext2fs have an optimized and obfuscated version of this, with
multiple blocks looked up at once and the single-block lookup implemented
as a multiple-block lookup with a count of 1. I doubt that this
optimization is significant even for ufs, at least now that CPUs are 10 to
100 times as fast relative to I/O as when it was implemented. However, it
is easier to optimize for msdosfs since there are no indirect blocks.

All of cd9660, ufs and ext2fs have a whole file *_bmap.c for bmapping.
ext2_bmaparray() is simplest, but bmapping in ext2fs and ufs is so similar
that misspelling ext2_getlbns() as ufs_getlbns() in one caller is harmless.

(*) The correct limit is mnt_iosize_max bytes. cd9660 uses the wrong limit
of MAXBSIZE.

> (fsz is dep->de_FileSize)
>
> msdosfs_bmap: fsz 81047 blkno 6374316 lblkno 5
> ...
> msdosfs_bmap: fsz 81047 blkno 6374364 lblkno 11
> msdosfs_bmap: fsz 81047 blkno 6374372 lblkno 12 # A1
> msdosfs_bmap: fsz 81047 blkno 13146156 lblkno 13 # A2
> msdosfs_bmap: fsz 81047 blkno 13146156 lblkno 14
> ...
>
> I should compute the position of the boundary illustrated in A1 I should set
> that to the read ahead value, until setting a new value at A2, perhaps this
> should only be done for particularly large files? I will look at the other
> _bmap routines to see what they do.

Better to do it for all files. For small files there are just fewer blocks
to check for contiguity.

> I am still confused as to how reading blsize * 16 actually improved
> the transfer rate after a long period of making it worse. Perhaps it
> is related to the buffer resource problem you describe below.

Could be. The buffer cache layer doesn't handle either overlapping buffers
or variant buffer sizes very well. Buffer sizes of (blsize * 16) mixed with
buffer sizes of blsize for msdosfs and 16K for ffs may exercise both of
these.

Bruce
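
The contiguity scan suggested above for msdosfs_bmap() (call the
single-block mapper for each block after the start block until a
discontiguity is hit or the limit is reached) could be sketched in
user-space C roughly as follows. This is an illustration only, not the
msdosfs code: toy_map() is a hypothetical stand-in for pcbmap(), whose real
signature is different, struct toy_file and count_run_after() are invented
names, and the block numbers in sample_blocks are made up. The caller is
assumed to derive max_run from the I/O size limit (for example,
mnt_iosize_max bytes divided by the block size, minus one for the start
block).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical file model: device block number for each logical block. */
struct toy_file {
	const long *blocks;
	long nblocks;
};

/*
 * Hypothetical stand-in for a single-block mapper such as pcbmap().
 * Returns 0 and stores the device block on success, nonzero past EOF.
 */
static int
toy_map(void *file, long lblkno, long *blknop)
{
	const struct toy_file *f = file;

	if (lblkno < 0 || lblkno >= f->nblocks)
		return (1);
	*blknop = f->blocks[lblkno];
	return (0);
}

/*
 * Count how many logical blocks after lblkno are physically contiguous
 * with it. blkno is the device block of lblkno, blks_per_lblk is the
 * number of device blocks per logical block, and max_run caps the result
 * (assumption: computed by the caller from the I/O size limit).
 */
static int
count_run_after(int (*map)(void *, long, long *), void *file, long lblkno,
    long blkno, long blks_per_lblk, int max_run)
{
	long next;
	int run = 0;

	assert(max_run >= 0);
	while (run < max_run) {
		/* Stop at end of file or on any mapping error. */
		if (map(file, lblkno + run + 1, &next) != 0)
			break;
		/* Stop at the first block that is not where we expect it. */
		if (next != blkno + (long)(run + 1) * blks_per_lblk)
			break;
		run++;
	}
	return (run);
}

/* Invented sample layout: three contiguous blocks, a jump, then two more. */
static const long sample_blocks[] = { 100, 108, 116, 500, 508 };
static struct toy_file sample = { sample_blocks, 5 };
```

With the sample layout, count_run_after(toy_map, &sample, 0, 100, 8, 15)
finds two contiguous blocks after logical block 0 and stops at the jump to
device block 500. Mapping one block at a time like this is the simple
approach; looking up several blocks per call, as ufs and ext2fs do, is an
optimization on top of it.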