Date: Sat, 22 Apr 2000 14:06:47 -0700
From: Kent Stewart <kstewart@3-cities.com>
To: Gérard Roudier <groudier@club-internet.fr>
Cc: Matthew Dillon <dillon@apollo.backplane.com>, Michael Bacarella <mbac@nyct.net>, Alfred Perlstein <bright@wintelcom.net>, Kevin Day <toasty@dragondata.com>, hackers@FreeBSD.ORG
Subject: Re: Double buffered cp(1)
Message-ID: <39021467.BD1599C5@3-cities.com>
References: <Pine.LNX.4.10.10004222103001.727-100000@linux.local>

Gérard Roudier wrote:
> 
> On Sat, 22 Apr 2000, Matthew Dillon wrote:
> 
> > :> :extend (using truncate) and then mmap() the destination file, then
> > :> :read() directly into the mmap()'d portion.
> > :> :
> > :> :I'd like to see what numbers you get.  :)
> > :
> > :>     read + write is a better way to do it.  It is still possible to
> > :>     double buffer.  In this case simply create a small anonymous shared
> > :>     mmap that fits in the L2 cache (like 128K), set up a pipe, fork, and
> > :>     have one process read() from the source while the other write()s to the
> > :>     destination.  The added overhead is actually less than 'one buffer copy'
> > :>     worth if the added buffering fits in the L1 or L2 cache.
> > :
> > :It seems silly to implement something as trivial and straightforward as
> > :copying a file in userland. The process designated to copy a file just
> > :sits in a tight loop invoking the read()/write() syscalls
> > :repeatedly. Since this operation is already system bound and very simple,
> > :what's the argument against absorbing it into the kernel?
> > :
> > :-MB
> > 
> >     I don't think anyone has suggested that it be absorbed into the kernel.
> >     We are talking about userland code here.
> > 
> >     The argument for double-buffering is a simple one - it allows the
> >     process read()ing from the source file to block without stalling the
> >     process write()ing to the destination file.
> > 
> >     I think the reality, though, is that at least insofar as copying a
> >     single large file the source is going to be relatively contiguous on
> >     the disk and thus will tend not to block.  More specifically, the
> >     disk itself is probably the bottleneck.  Disk writes tend to be
> >     somewhat slower than disk reads and the seeking alone (between source
> >     file and destination file), even when using a large block size,
> >     will reduce performance drastically versus simply reading or writing
> >     a single file linearly.  Double buffering may help a disk-to-disk
> >     file copy, but I doubt it will help a disk-to-same-disk file copy.
> 
> Speaking about sequential file reads, the asynchronous read-ahead mechanism
> in the kernel already has the same effect as double-buffering. In
> addition, real disks do prefetch data based on physical position and this
> also helps when the file is not too fragmented.

When I did my buildworlds on different IDE controllers, my flags
were 0xa0ffa0ff on both controllers, which pretty much used
everything the IDE drive could provide because of 32-bit transfers,
UDMA-33, and 16 sector read aheads.

Kent

> 
> However, some bottleneck may exist when reads and writes traverse the
> same controller or involve a single device. This problem _is_ addressed by
> SCSI. The disconnection feature allows the BUS bandwidth not to be wasted
> and tagged command queuing allows devices to be given several IO
> requests simultaneously.
> It is also addressed by ATA using the same mechanisms, but I doubt
> disconnections and tagged commands will ever be reliable enough to be
> actually usable on this interface given that it targets personal
> computers that do not require fast multi-streamed disk IOs.
> 
> This leads me to think that:
> - User-space double-buffered cp will not help at all given a decent
>   IO sub-system and decent devices.
> - It will also not help when the controller and/or the device (as legacy
>   IDE) just act as an IO bottleneck for cp (double bottleneck in case of
>   reading and writing to the same disk ;-) ).
> 
> By experience, connecting a real hard disk like a Cheetah to a real SCSI
> controller (LVD preferred) and using a real O/S helps a lot more. ;-)
> 
>   Gérard.
> 
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message

-- 
Kent Stewart
Richland, WA

mailto:kstewart@3-cities.com
http://www.3-cities.com/~kstewart/index.html
FreeBSD News http://daily.daemonnews.org/

SETI(Search for Extraterrestrial Intelligence) @ HOME
http://setiathome.ssl.berkeley.edu/

Hunting Archibald Stewart, b 1802 in Ballymena, Antrim Co., NIR
http://www.3-cities.com/~kstewart/genealogy/archibald_stewart.html