From owner-freebsd-chat Mon Dec 13 17:02:58 1999
Date: Mon, 13 Dec 1999 18:43:16 -0600 (CST)
From: Jay Nelson <noslenj@swbell.net>
Subject: Re: dual 400 -> dual 600 worth it?
In-reply-to: <199912140001.RAA25603@usr08.primenet.com>
To: Terry Lambert
Cc: chat@FreeBSD.ORG

This answers the questions. Thanks, Terry. I've left the whole
message intact for the benefit of others searching the archives.

-- Jay

On Tue, 14 Dec 1999, Terry Lambert wrote:

>> On Sat, 11 Dec 1999, Terry Lambert wrote:
>>
>> [snip]
>>
>> >Soft updates speak to the ability to stall dependent writes
>> >until they either _must_ be done, or until they are no longer
>> >relevant. It does this as a strategy for ordering the metadata
>> >updates (other methods are DOW - Delayed Ordered Writes - and
>> >synchronous writing of metadata... in decreasing order of
>> >performance).
>>
>> It's this ability to delay and gather writes that prompted the
>> question. If a SCSI bus can handle 8-12MB with tagged queuing
>> and UltraDMA can do 30MB while blocking, where do the
>> performance lines cross -- or do they? As the number of spindles
>> goes up, I would expect SCSI to outperform IDE -- but on a
>> single drive system, do writes to an IDE at UDMA speeds block
>> less than gathered writes to a drive on a nominally slower SCSI
>> bus?
>
>The question you have to ask is whether or not your I/O requests
>are going to be interleaved (and therefore concurrently
>outstanding) or not.
>
>You only need four outstanding concurrent requests for your 8M
>SCSI bus to beat a 30M UltraDMA. I'll note for the record here
>that your PCI bus is capable of a 133M burst, and considerably
>less when doing continuous duty, so bus limits will hit before
>that (I rather suspect your 30M number, if it's real and not a
>"for instance", is a burst rate, not a continuous transfer rate,
>and depends either on pre-reading or a disk read cache hit).
>
>Effectively, what you are asking is "if I have a really fast
>network interface, can I set my TCP/IP window size down to one
>frame per window, and get better performance than a not quite as
>fast interface using a large window?".
>
>The answer depends on your latency. If you have zero latency,
>then you don't need the ability to interleave I/O. If you have
>non-zero latency, then interleaving I/O will help you move more
>data.
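
[An aside making the crossover concrete: a toy model of a 30 MB/s
bus that serializes requests against an 8 MB/s bus that can overlap
the drive's positioning latency across several outstanding tagged
requests. The 64 KB request size and 12 ms positioning time are
guesses for hardware of this era, not measurements; with these
numbers the slower bus pulls ahead at a queue depth of two, so
Terry's four leaves margin for slower positioning or smaller
requests.]

/*
 * Toy throughput model: completions are limited either by how much
 * positioning latency can be overlapped (queue depth) or by the raw
 * bus rate, whichever binds first.
 */
#include <stdio.h>

static double
throughput(double bus_mbs, double lat_ms, double req_kb, int depth)
{
	double xfer_ms = req_kb / 1024.0 / bus_mbs * 1000.0;
	double by_overlap = depth * 1000.0 / (lat_ms + xfer_ms); /* req/s */
	double by_bus = 1000.0 / xfer_ms;			 /* req/s */
	double reqs = by_overlap < by_bus ? by_overlap : by_bus;

	return (reqs * req_kb / 1024.0);			 /* MB/s */
}

int
main(void)
{
	int n;

	printf("UDMA 30 MB/s, serialized:    %5.2f MB/s\n",
	    throughput(30.0, 12.0, 64.0, 1));
	for (n = 1; n <= 4; n++)
		printf("SCSI  8 MB/s, queue depth %d: %5.2f MB/s\n", n,
		    throughput(8.0, 12.0, 64.0, n));
	return (0);
}
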
>For tagged command queues, the questions are:
>
>o are your seek times so fast that elevator sorting the
>  requests (only possible if the drive knows about two or more
>  requests) will not yield better performance?
>
>o is your transfer latency so low that interleaving your I/O
>  will not yield better performance?
>
>o is all of your I/O so serialized, because you are only
>  actively running a single application at a time, that you
>  won't benefit from increased I/O concurrency?
>
>> [snip]
>>
>> >So the very short answer to the question is "on a
>> >multi-application server, using soft updates doesn't mean that
>> >you wouldn't benefit from interleaving your I/O".
>>
>> Hmm... that suggests you also might not. On a single drive
>> system with soft updates, would an Ultra IDE perform worse, on
>> par or better than SCSI with a light to moderate I/O load?
>
>Worse, if there is any significance to the I/O, so long as you
>have non-zero latency.
>
>Unfortunately, with drives which lie about their true geometry,
>and which claim to be "perfect" by automatically redirecting bad
>sectors, it's not possible to elevator sort in the OS (this used
>to be common, and variable/unknown drive geometry is why this
>tuning is currently turned off by default in newfs).
>
>> >To speak to Brett's issue of RAID 5, parity is striped across
>> >all disks, and doing parity updates on one stripe on one disk
>> >will block all non-interleaved I/O to all other areas of the
>> >disk; likewise, doing a data write will prevent a parity write
>> >from completing for the other four disks in the array.
>>
>> I have seen relatively few benefits of RAID 5. Performance
>> sucks relative to mirroring across separate controllers, and
>> until you reach 10-12 drives, the cost is about the same. I
>> never thought about the parity, though. Now that I have, I like
>> RAID 5 even less.
>
>If your I/O wasn't being serialized by your drives/controller,
>the parity would be much less of a performance issue.
>
>> >This effectively means that, unless you can interleave I/O
>> >requests, as tagged command queues do, you are much worse off.
>>
>> Applying that to the single drive IDE vs. SCSI question
>> suggests that, even with higher bus burst speeds, I'm still
>> likely to end up worse off, depending on load, than I would
>> with SCSI -- soft updates notwithstanding. Is that correct?
>
>Yes, depending on load.
>
>For a single user desktop connected to a human, generally you
>only run one application at a time, and so serialized I/O is OK,
>even if you are doing streaming media to or from the disk.
>
>The master/slave bottleneck and multimedia are the reason that
>modern systems with IDE tend to have two controllers, with one
>used for the main disk, and the other used for the CDROM/DVD.
>
>> >As to the issue of spindle sync, which Brett alluded to, I
>> >don't think that it is supported for IDE, so you will be
>> >eating a full rotational latency on average, instead of one
>> >half a rotational latency, on average: (0 + 1)/2 vs. (1 + 1)/2.
>>
>> I think that just answered my earlier question.
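
[Putting numbers on the (0 + 1)/2 vs. (1 + 1)/2 averages: with
spindle sync, a mirrored transfer waits a uniformly distributed
fraction of one rotation, half a rotation on average. Without sync
it completes only after the slowest of N independent spindles, and
the expected maximum of N uniform waits is N/(N+1) of a rotation,
which tends toward Terry's full rotation as N grows. A sketch at an
assumed 7200 RPM:]

#include <stdio.h>

int
main(void)
{
	const double rot_ms = 60000.0 / 7200.0;	/* one rotation, ms */
	int n;

	printf("spindle sync, any N: %.2f ms\n", 0.5 * rot_ms);
	for (n = 2; n <= 8; n *= 2)
		printf("no sync, N = %d:     %.2f ms\n", n,
		    (double)n / (n + 1) * rot_ms);
	return (0);
}
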
>This is a RAID-specific issue. Most modern drives in a one drive
>system record sectors in descending order on a track. As soon as
>the seek completes, the drive begins caching data, and returns
>the data you asked for as soon as it has cached back far enough.
>For single application sequential read behaviour, this pre-caches
>the data that you are likely to ask for next.
>
>A different issue that probably applies to what you meant is
>that, for a multiple program machine with two or more readers,
>the value of the cache is greatly reduced, unless the drive
>supports multiple track caches, one per reader, in order to
>preserve locality of reference for each reader. This one is not
>RAID specific, but is load specific.
>
>In general, good SCSI drives will support a track cache per
>tagged command queue.
>
>It would be interesting to test algorithms for increasing I/O
>locality to keep the number of files on which I/O is outstanding
>under the number of track caches (probably by delaying I/O for
>files that were not in the last (track cache count minus one)
>I/Os, in favor of I/O requests that are).
>
>> >Rod Grimes did some experimentation with CCD and spindle sync
>> >on SCSI devices back when CCD first became capable of
>> >mirroring, and has some nice hard data that you should ask him
>> >for (or dig it out of DejaNews on the FreeBSD news group).
>>
>> Thanks -- I'll look them up. And -- I appreciate your answer. I
>> learned quite a bit from it. It did raise the question of
>> differences between soft updates and lfs -- but I'll save that
>> for another time.
>
>8-).
>
>					Terry Lambert
>					terry@lambert.org
>---
>Any opinions in this posting are my own and not those of my
>present or previous employers.
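
[A toy sketch of the I/O-locality idea Terry floats above:
dispatch requests for files that already hold one of the drive's
per-queue track caches, and defer the rest until a cache frees up.
NCACHES and every name here are invented for illustration; this is
not FreeBSD code, just the shape of the heuristic.]

#include <stdio.h>

#define NCACHES	4	/* track caches we assume the drive has */
#define NREQS	12

struct req { int file; int block; };

static int hot[NCACHES];	/* files currently holding a cache */
static int nhot;

static int
is_hot(int file)
{
	int i;

	for (i = 0; i < nhot; i++)
		if (hot[i] == file)
			return (1);
	return (0);
}

static int
try_dispatch(struct req *r)
{
	if (!is_hot(r->file)) {
		if (nhot == NCACHES)
			return (0);		/* defer: would thrash a cache */
		hot[nhot++] = r->file;		/* claim a free track cache */
	}
	printf("dispatch file %d block %d\n", r->file, r->block);
	return (1);
}

int
main(void)
{
	/* Interleaved requests from six files against four caches. */
	struct req q[NREQS] = {
		{1,0},{2,0},{3,0},{4,0},{5,0},{6,0},
		{1,1},{2,1},{3,1},{4,1},{5,1},{6,1}
	};
	int done[NREQS] = {0};
	int pass, i;

	for (pass = 0; pass < 2; pass++) {
		for (i = 0; i < NREQS; i++)
			if (!done[i])
				done[i] = try_dispatch(&q[i]);
		nhot = 0;	/* outstanding I/O drained; retire hot set */
	}
	return (0);
}

[Run against twelve interleaved requests from six files, the first
pass serves the four "hot" files and the second picks up the two
deferred ones, so no more than four distinct files ever have I/O
outstanding at once.]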