From owner-freebsd-hackers Sat Apr 22 14:09:03 2000
Message-ID: <39021467.BD1599C5@3-cities.com>
Date: Sat, 22 Apr 2000 14:06:47 -0700
From: Kent Stewart
Organization: Columbia Basin Virtual Community Project
To: Gérard Roudier
Cc: Matthew Dillon, Michael Bacarella, Alfred Perlstein, Kevin Day, hackers@FreeBSD.ORG
Subject: Re: Double buffered cp(1)

Gérard Roudier wrote:
>
> On Sat, 22 Apr 2000, Matthew Dillon wrote:
>
> > :> :extend (using truncate) and then mmap() the destination file, then
> > :> :read() directly into the mmap()'d portion.
> > :> :
> > :> :I'd like to see what numbers you get. :)
> > :
> > :>     read + write is a better way to do it.  It is still possible to
> > :>     double buffer.  In this case simply create a small anonymous shared
> > :>     mmap that fits in the L2 cache (like 128K), set up a pipe, fork, and
> > :>     have one process read() from the source while the other write()s to the
> > :>     destination.  The added overhead is actually less than 'one buffer copy'
> > :>     worth if the added buffering fits in the L1 or L2 cache.
> > :
> > :It seems silly to implement something as trivial and straightforward as
> > :copying a file in userland. The process designated to copy a file just
> > :sits in a tight loop invoking the read()/write() syscalls
> > :repeatedly.
> > :Since this operation is already system bound and very simple,
> > :what's the argument against absorbing it into the kernel?
> > :
> > :-MB
> >
> >     I don't think anyone has suggested that it be absorbed into the kernel.
> >     We are talking about userland code here.
> >
> >     The argument for double-buffering is a simple one - it allows the
> >     process read()ing from the source file to block without stalling the
> >     process write()ing to the destination file.
> >
> >     I think the reality, though, is that at least insofar as copying a
> >     single large file the source is going to be relatively contiguous on
> >     the disk and thus will tend not to block.  More specifically, the
> >     disk itself is probably the bottleneck.  Disk writes tend to be
> >     somewhat slower than disk reads and the seeking alone (between source
> >     file and destination file), even when using a large block size,
> >     will reduce performance drastically versus simply reading or writing
> >     a single file linearly.  Double buffering may help a disk-to-disk
> >     file copy, but I doubt it will help a disk-to-same-disk file copy.
>
> Speaking about sequential file reads, the asynchronous read-ahead mechanism
> in the kernel already has the same effect as double-buffering. In
> addition, real disks do prefetch data based on physical position and this
> also helps when the file is not too fragmented.

When I ran my buildworlds on different IDE controllers, my flags were
0xa0ffa0ff on both controllers. That enabled pretty much everything
the IDE drives could provide: 32-bit transfers, UDMA-33, and 16-sector
read-aheads.

Kent

>
> However, some bottleneck may exist when reads and writes traverse the
> same controller or involve a single device. This problem _is_ addressed by
> SCSI. The disconnection feature keeps the BUS bandwidth from being wasted
> and tagged command queuing allows devices to be given several IO
> requests simultaneously.
> It is also addressed by ATA using the same mechanisms, but I doubt
> disconnections and tagged commands will ever be reliable enough to be
> actually usable on this interface given that it targets personal
> computers that do not require fast multi-streamed disk IOs.
>
> This leads me to think that:
> - User-space double-buffered cp will not help at all given a decent
>   IO sub-system and decent devices.
> - It will also not help when the controller and/or the device (as legacy
>   IDE) just act as an IO bottleneck for cp (double bottleneck in case of
>   reading and writing to the same disk ;-) ).
>
> From experience, connecting a real hard disk like a Cheetah to a real SCSI
> controller (LVD preferred) and using a real O/S helps a lot more. ;-)
>
>   Gérard.
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message

-- 
Kent Stewart
Richland, WA

mailto:kstewart@3-cities.com
http://www.3-cities.com/~kstewart/index.html
FreeBSD News http://daily.daemonnews.org/

SETI(Search for Extraterrestrial Intelligence) @ HOME
http://setiathome.ssl.berkeley.edu/

Hunting Archibald Stewart, b 1802 in Ballymena, Antrim Co., NIR
http://www.3-cities.com/~kstewart/genealogy/archibald_stewart.html
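
For anyone who wants to measure the double-buffering scheme quoted at the
top of this message, here is a rough sketch of how it might look in C. It
is not code from anyone in the thread. The thread only says "a small
anonymous shared mmap ... a pipe, fork"; the two-pipe handshake (one pipe
carrying filled-slot lengths, one carrying free-slot tokens), the split
into two 64K slots, and the names dbuf_copy, write_all and read_all are
all assumptions made for illustration.

```c
#define _DEFAULT_SOURCE          /* glibc: expose MAP_ANON; harmless on BSD */
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SLOT_SIZE (64 * 1024)    /* assumption: 2 slots = the 128K from the thread */
#define NSLOTS    2

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; -1 on error or if the other end closed the pipe. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: copy src_fd to dst_fd. Returns 0 on success, -1 on failure. */
int dbuf_copy(int src_fd, int dst_fd)
{
    int filled[2], freed[2];     /* reader->writer lengths, writer->reader tokens */
    int slot = 0, status = 0, rc = -1;
    int32_t n = 0;
    char token = 0, tokens[NSLOTS] = {0};
    pid_t pid = -1;

    char *buf = mmap(NULL, (size_t)SLOT_SIZE * NSLOTS, PROT_READ | PROT_WRITE,
                     MAP_ANON | MAP_SHARED, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    if (pipe(filled) < 0)
        goto unmap;
    if (pipe(freed) < 0) {
        close(filled[0]);
        close(filled[1]);
        goto unmap;
    }
    /* Every slot starts out free: queue one token per slot, then fork. */
    if (write_all(freed[1], tokens, sizeof tokens) == 0)
        pid = fork();

    if (pid == 0) {              /* child: the reader */
        close(filled[0]);
        close(freed[1]);
        for (;;) {
            if (read_all(freed[0], &token, 1) < 0)
                _exit(1);        /* writer went away */
            do {
                n = (int32_t)read(src_fd, buf + (size_t)slot * SLOT_SIZE, SLOT_SIZE);
            } while (n < 0 && errno == EINTR);
            if (write_all(filled[1], &n, sizeof n) < 0)
                _exit(1);
            if (n <= 0)
                _exit(n < 0);    /* exit 0 on EOF, 1 on read error */
            slot = (slot + 1) % NSLOTS;
        }
    }

    /* Parent: the writer. freed[0] stays open here so that returning a
     * token after the reader has already exited cannot raise SIGPIPE. */
    close(filled[1]);
    while (pid > 0) {
        if (read_all(filled[0], &n, sizeof n) < 0 || n < 0)
            break;               /* reader failed */
        if (n == 0) {
            rc = 0;              /* clean end of file */
            break;
        }
        if (write_all(dst_fd, buf + (size_t)slot * SLOT_SIZE, (size_t)n) < 0)
            break;
        if (write_all(freed[1], &token, 1) < 0)
            break;
        slot = (slot + 1) % NSLOTS;
    }
    close(filled[0]);
    close(freed[0]);
    close(freed[1]);
    if (pid > 0 && (waitpid(pid, &status, 0) < 0 ||
                    !WIFEXITED(status) || WEXITSTATUS(status) != 0))
        rc = -1;
unmap:
    munmap(buf, (size_t)SLOT_SIZE * NSLOTS);
    return rc;
}
```

Calling dbuf_copy() on two already-open descriptors and timing it against
a plain read()/write() loop, both disk-to-disk and disk-to-same-disk, would
be one way to check the doubts raised above. Note that the reader child
shares the source file offset with the caller, so the source descriptor is
left at end of file afterwards.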