From owner-freebsd-fs@FreeBSD.ORG Sat May 28 11:12:26 2005
From: Dominic Marks
Organization: GoodforBusiness.co.uk
To: Bruce Evans
Cc: freebsd-fs@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org, banhalmi@field.hu
Date: Sat, 28 May 2005 12:13:41 +0100
Subject: Re: i386/68719: [usb] USB 2.0 mobil rack+ fat32 performance problem
Message-Id: <200505281213.42118.dom@goodforbusiness.co.uk>
In-Reply-To: <20050528194126.W3563@epsplex.bde.org>
References: <200505271328.58072.dom@goodforbusiness.co.uk> <20050528194126.W3563@epsplex.bde.org>

On Saturday 28 May 2005 11:36, Bruce Evans
wrote:
> On Fri, 27 May 2005, Dominic Marks wrote:
> > (Posted to freebsd-fs as the PR is assigned to freebsd-usb@, but it seems
> > to be more related to the msdos filesystem than the USB system so perhaps
> > it should be reassigned?)
>
> It should be. It is even less i386-specific than usb-specific.
>
> > I've been evaluating the performance of some usb2 hard discs with FreeBSD
> > and I found this PR (68719). The submitter is correct that performance
> > with msdosfs is severely limited.
> >
> > I tested a 'LaCie' USB2 disc:
> > ...
> > In test 1 I could not achieve any better than 5.1MB/s on an msdosfs
> > filesystem. Using UFS2 and softupdates a transfer rate of 22~25MB/s was
> > possible. Both test data sets were copied from the systems ATA-100 disc.
> > In both tests at these peaks gstat reports the device is 100% busy.
>
> I use the following to improve transfer rates for msdosfs. The patch is
> for an old version so it might not apply directly.
>
> %%%
> Index: msdosfs_vnops.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_vnops.c,v
> retrieving revision 1.147
> diff -u -2 -r1.147 msdosfs_vnops.c
> --- msdosfs_vnops.c 4 Feb 2004 21:52:53 -0000 1.147
> +++ msdosfs_vnops.c 22 Feb 2004 07:27:15 -0000
> @@ -608,4 +622,5 @@
>      int error = 0;
>      u_long count;
> +    int seqcount;
>      daddr_t bn, lastcn;
>      struct buf *bp;
> @@ -693,4 +714,5 @@
>      lastcn = de_clcount(pmp, osize) - 1;
>
> +    seqcount = ioflag >> IO_SEQSHIFT;
>      do {
>          if (de_cluster(pmp, uio->uio_offset) > lastcn) {
> @@ -718,5 +740,5 @@
>               */
>              bp = getblk(thisvp, bn, pmp->pm_bpcluster, 0, 0, 0);
> -            clrbuf(bp);
> +            vfs_bio_clrbuf(bp);
>              /*
>               * Do the bmap now, since pcbmap needs buffers
> @@ -767,11 +789,19 @@
>           * without delay. Otherwise do a delayed write because we
>           * may want to write somemore into the block later.
> +         * XXX comment not updated with code.
>           */
> +        if ((vp->v_mount->mnt_flag & MNT_NOCLUSTERW) == 0)
> +            bp->b_flags |= B_CLUSTEROK;
>          if (ioflag & IO_SYNC)
> -            (void) bwrite(bp);
> -        else if (n + croffset == pmp->pm_bpcluster)
> +            (void)bwrite(bp);
> +        else if (vm_page_count_severe() || buf_dirty_count_severe())
>              bawrite(bp);
> -        else
> -            bdwrite(bp);
> +        else if (n + croffset == pmp->pm_bpcluster) {
> +            if ((vp->v_mount->mnt_flag & MNT_NOCLUSTERW) == 0)
> +                cluster_write(bp, dep->de_FileSize, seqcount);
> +            else
> +                bawrite(bp);
> +        } else
> +            bdwrite(bp);
>          dep->de_flag |= DE_UPDATE;
>      } while (error == 0 && uio->uio_resid > 0);
> %%%

Thanks! I'll try my three tests again with this patch.

> Notes:
> - The xxx_count_severe() stuff doesn't work quite right and was observed
>   to work especially badly for msdosfs in some configurations. IIRC,
>   only configurations with a tiny block size (e.g., 512 bytes) showed
>   the problem, and the problem is more likely to be with tiny block sizes
>   actually exercising the "severe" case than with msdosfs or with the
>   tiny block sizes themselves. The behaviour was apparently that when
>   a severe page or buf shortage develops, the above handling makes the
>   problem worse by using bawrite() instead of cluster_write(). Falling
>   back to bawrite() may have made the resource shortage non-fatal, but
>   it made the resource shortage last much longer since bawrite() was much
>   slower, even on the reasonable fast ATA drive that I was testing on.
> - Using cluster_write() in the above is not essential. bdwrite() works
>   almost as well, or perhaps even better than cluster_write() provided
>   write clustering is enabled by setting B_CLUSTEROK, since when this
>   flag is set the delayed writes are clustered when they are done
>   physically.
>
> > I have not made any tests of read performance but from looking at the
> > results I do not expect that it will be significantly better than write
> > performance.
> > I may do some when I get more time to investigate and follow
> > up if the results are unexpected.
>
> Try it. I would expect read performance to be much better. If not, don't
> bother trying the above patch. msdosfs uses read-ahead for read(), and
> this seems to work well so I haven't even tried changing it to use read
> clustering (the above only changes it to use write clustering). This may
> depend on the drive doing read caching and not handling small block sizes
> too badly. I mostly use ATA drives that have these properties. Writing
> tinygrams tends to have a relatively higher cost because write caching is
> not enabled so clustering can only be done by the OS.

Ok, I still have all the test equipment so I might as well do this today.
I have ATA write caching enabled on my systems.

> > Hopefully this will generate some interest in the problem, it is beyond
> > my time and expertise but it would be very nice to be able to access
> > MS-DOS formatted filesystems at a reasonable speed!
>
> Some other changes are needed for general use at a reasonable speed:
> - use VMIO for metadata.
> - don't use pessimal block allocation. The current allocator gives
>   large inter-file fragmentation by attempting to minimise intra-file
>   fragmentation, and when the file system becomes just 1/N full the
>   attempt backfires and gives intra-file fragmentation too (files with
>   more than N clusters are very likely to be fragmented).

Is there anyone out there who is sufficiently talented and has a strong
desire to tackle this problem? I would be happy to make the first payment
or hardware donation into a development fund to see it get fixed. My
resources are limited though, so if there are others who would like this
feature perhaps we could combine to get a volunteer some really nice kit?

> Bruce

Thanks very much,
-- 
Dominic
GoodforBusiness.co.uk
I.T. Services for SMEs in the UK.