From owner-freebsd-hackers Fri Nov 13 00:03:42 1998
Return-Path:
Received: (from majordom@localhost)
	by hub.freebsd.org (8.8.8/8.8.8) id AAA06356
	for freebsd-hackers-outgoing; Fri, 13 Nov 1998 00:03:42 -0800 (PST)
	(envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from news2.du.gtn.com (news2.du.gtn.com [194.77.9.57] (may be forged))
	by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id AAA06351
	for ; Fri, 13 Nov 1998 00:03:37 -0800 (PST)
	(envelope-from ticso@cicely5.cicely.de)
Received: from cicely.cicely.de (cicely.de [194.231.9.142])
	by news2.du.gtn.com (8.8.6/8.8.6) with ESMTP id JAA10573;
	Fri, 13 Nov 1998 09:03:01 +0100 (MET)
Received: from cicely5.cicely.de (cicely5.cicely.de [10.1.1.7])
	by cicely.cicely.de (8.8.8/8.8.8) with ESMTP id JAA01065;
	Fri, 13 Nov 1998 09:03:17 +0100 (CET)
Received: (from ticso@localhost)
	by cicely5.cicely.de (8.9.0/8.9.0) id JAA29661;
	Fri, 13 Nov 1998 09:03:12 +0100 (CET)
Message-ID: <19981113090311.05423@cicely.de>
Date: Fri, 13 Nov 1998 09:03:11 +0100
From: Bernd Walter
To: Greg Lehey, Mike Smith, hackers@FreeBSD.ORG
Subject: Re: [Vinum] Stupid benchmark: newfsstone
References: <199811100638.WAA00637@dingo.cdrom.com>
	<19981111103028.L18183@freebie.lemis.com>
	<19981111040654.07145@cicely.de>
	<19981111134546.D20374@freebie.lemis.com>
	<19981111085152.55040@cicely.de>
	<19981111183546.D20849@freebie.lemis.com>
	<19981111194157.06719@cicely.de>
	<19981112184509.K463@freebie.lemis.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.89i
In-Reply-To: <19981112184509.K463@freebie.lemis.com>; from Greg Lehey on Thu, Nov 12, 1998 at 06:45:09PM +1030
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Thu, Nov 12, 1998 at 06:45:09PM +1030, Greg Lehey wrote:
> On Wednesday, 11 November 1998 at 19:41:57 +0100, Bernd Walter wrote:
> [...]
> > What I expect is that an aggregation such as a 60 kB chunk access
> > on the volume is split into only one transaction per drive - so you
> > can read from all the drives at the same time and get a bandwidth
> > increase.
>
> OK, so you want to have 4 15 kB reads, and you expect a performance
> improvement because of it.
>
> Let's consider the hardware: a good modern disk has a disk transfer
> rate of 10 MB/s and a rotational speed of 7200 rpm.  Let's look at
> the times involved:
>
>                   rotational   transfer time   total
>                   latency
>
>   1 disk/60 kB    4.2 ms       6 ms            10.2 ms
>   4 disks/15 kB   7.8 ms       1.5 ms          9.3 ms
>
> Huh?  Why the difference in rotational latency?  If you're reading
> from one disk, on average you'll have a half track latency.  For two,
> on average one is half a track off from the other, so you'll have a
> latency of .75 of a track.  With three drives, it's .875, and with
> four drives, it's .9375 of a track.  Still, in this case (the largest
> possible block size, and only 4 disks), you win--barely.  Let's look
> at a more typical case: 16 kB.
>
>                   rotational   transfer time   total
>                   latency
>
>   1 disk/16 kB    4.2 ms       1.6 ms          5.8 ms
>   4 disks/4 kB    7.8 ms       .4 ms           8.2 ms
>
> Most transfers are 16 kB or less.  What really kills you is the lack
> of spindle synchronization between the disks.

OK, I agree - but to get optimal performance it has always depended on
the application to choose the best parameters for the stripes and for
newfs.

> If they were synchronized, that would be fine, but that's more
> complicated than it looks.  You'd need identical disks with identical
> layout (subdisks in the same place on each disk).  And it's almost
> impossible to find spindle synchronized disks nowadays.  Finally,
> aggregating involves a

The drives I'm running vinum on ARE capable of spindle synchronization,
and I know for sure that at least modern IBM server disks are too.  If
there's interest, I can ask Seagate for the cabling to use it on mine.
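The latency figures above can be sanity-checked with a minimal
simulation, assuming each unsynchronized drive's rotational position is
an independent uniform fraction of a revolution (the function name is
illustrative, not anything from vinum):

```python
import random

def expected_worst_latency_ms(n_disks, rpm=7200, trials=200_000):
    """Estimate the expected rotational latency of a striped read that
    must wait for the slowest of n unsynchronized drives."""
    rotation_ms = 60_000.0 / rpm  # 7200 rpm -> ~8.33 ms per revolution
    rng = random.Random(1)
    total = 0.0
    for _ in range(trials):
        # each drive's latency: a uniform fraction of one revolution
        total += max(rng.random() for _ in range(n_disks)) * rotation_ms
    return total / trials
```

For one drive this gives about 4.2 ms, matching the table; for four
drives it gives about 6.7 ms (under this model the expected maximum of
n uniform latencies is n/(n+1) of a revolution, a slightly smaller
fraction than the .9375 quoted above, but the direction is the same:
every extra unsynchronized spindle adds latency).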
> scatter/gather approach which, unless I've missed something, is not
> supported at a hardware level.  Each request to the driver specifies
> one buffer for the transfer, so the scatter/gather would have to be
> done by allocating more memory and performing the transfer there (for
> a read) and then copying to the correct place.

That's something I don't understand - where's the difference compared
to ordinary parallel access?

> I have thought about aggregating in the manner you describe, and to a
> certain extent I feel it's a copout not to do so.  I hope you now see
> that it doesn't really make sense in this context.
>
> Greg
> --
> See complete headers for address, home page and phone numbers
> finger grog@lemis.com for PGP public key

--
B.Walter

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
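A postscript on the scatter/gather exchange above: plain striping
issues one transfer per chunk, each into a contiguous slice of the
caller's buffer, whereas aggregating one drive's chunks into a single
transaction leaves the memory destinations discontiguous - which is
where the hardware scatter/gather (or bounce buffer plus copy) comes
in.  A toy sketch of the address arithmetic, not vinum's actual code
(names and the simple round-robin layout are assumptions):

```python
def split_and_aggregate(offset, length, stripe, n_disks):
    """Split a volume-level read into stripe chunks, then group the
    chunks per drive.  Returns {disk: [(disk_offset, buf_offset, size)]}
    for a simple round-robin stripe layout."""
    per_disk = {d: [] for d in range(n_disks)}
    pos, end = offset, offset + length
    while pos < end:
        stripe_no, within = divmod(pos, stripe)
        size = min(stripe - within, end - pos)
        disk = stripe_no % n_disks            # round-robin placement
        disk_off = (stripe_no // n_disks) * stripe + within
        per_disk[disk].append((disk_off, pos - offset, size))
        pos += size
    return per_disk
```

With a 4 kB stripe over 2 drives, a 16 kB read gives each drive two
chunks that are contiguous on disk (offsets 0 and 4096) but land 8 kB
apart in the caller's buffer - so merging them into one disk
transaction needs scatter/gather, exactly as Greg describes.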