Date:      Sat, 22 Apr 2000 21:51:53 +0200 (CEST)
From:      Gérard Roudier <groudier@club-internet.fr>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        Michael Bacarella <mbac@nyct.net>, Alfred Perlstein <bright@wintelcom.net>, Kevin Day <toasty@dragondata.com>, hackers@FreeBSD.ORG
Subject:   Re: Double buffered cp(1)
Message-ID:  <Pine.LNX.4.10.10004222103001.727-100000@linux.local>
In-Reply-To: <200004221736.KAA55484@apollo.backplane.com>

On Sat, 22 Apr 2000, Matthew Dillon wrote:

> :> :extend (using truncate) and then mmap() the destination file, then
> :> :read() directly into the mmap()'d portion.
> :> :
> :> :I'd like to see what numbers you get. :)
> :
> :>     read + write is a better way to do it.  It is still possible to
> :>     double buffer.  In this case simply create a small anonymous shared
> :>     mmap that fits in the L2 cache (like 128K), setup a pipe, fork, and
> :>     have one process read() from the source while the other write()s to
> :>     the destination.  The added overhead is actually less than 'one
> :>     buffer copy' worth if the added buffering fits in the L1 or L2 cache.
> :
> :It seems silly to implement something as trivial and straightforward as
> :copying a file in userland. The process designated to copy a file just
> :sits in a tight loop invoking the read()/write() syscalls
> :repeatedly. Since this operation is already system bound and very simple,
> :what's the argument against absorbing it into the kernel?
> :
> :-MB
>
>     I don't think anyone has suggested that it be absorbed into the kernel.
>     We are talking about userland code here.
>
>     The argument for double-buffering is a simple one - it allows the
>     process read()ing from the source file to block without stalling the
>     process write()ing to the destination file.
>
>     I think the reality, though, is that at least insofar as copying a
>     single large file the source is going to be relatively contiguous on
>     the disk and thus will tend not to block.  More specifically, the
>     disk itself is probably the bottleneck.  Disk writes tend to be
>     somewhat slower than disk reads and the seeking alone (between source
>     file and destination file), even when using a large block size,
>     will reduce performance drastically versus simply reading or writing
>     a single file linearly.  Double buffering may help a disk-to-disk
>     file copy, but I doubt it will help a disk-to-same-disk file copy.

Speaking of sequential file reads, the asynchronous read-ahead mechanism
in the kernel already has the same effect as double buffering. In
addition, real disks prefetch data based on physical position, and this
also helps when the file is not too fragmented.

However, a bottleneck may exist when reads and writes traverse the
same controller or involve a single device. This problem _is_ addressed by
SCSI. The disconnection feature keeps bus bandwidth from being wasted,
and tagged command queuing allows several IO requests to be handed to a
device simultaneously.
It is also addressed by ATA using the same mechanisms, but I doubt
disconnections and tagged commands will ever be reliable enough to be
actually usable on this interface, given that it targets personal
computers that do not require fast multi-streamed disk IO.

This leads me to think that:
- User-space double-buffered cp will not help at all given a decent
  IO subsystem and decent devices.
- It will also not help when the controller and/or the device (such as
  legacy IDE) is itself the IO bottleneck for cp (a double bottleneck when
  reading and writing to the same disk ;-) ).

In my experience, connecting a real hard disk like a Cheetah to a real SCSI
controller (LVD preferred) and using a real O/S helps a lot more. ;-)

Gérard.


