From owner-freebsd-chat Mon Dec 13 17:02:58 1999
Date: Mon, 13 Dec 1999 18:43:16 -0600 (CST)
From: Jay Nelson <noslenj@swbell.net>
Subject: Re: dual 400 -> dual 600 worth it?
In-reply-to: <199912140001.RAA25603@usr08.primenet.com>
To: Terry Lambert
Cc: chat@FreeBSD.ORG

This answers the questions. Thanks, Terry. I've left the whole
message intact for the benefit of others searching the archives.

-- Jay

On Tue, 14 Dec 1999, Terry Lambert wrote:

>> On Sat, 11 Dec 1999, Terry Lambert wrote:
>>
>> [snip]
>>
>> >Soft updates speak to the ability to stall dependent writes
>> >until they either _must_ be done, or until they are no longer
>> >relevant. It does this as a strategy for ordering the metadata
>> >updates (other methods are DOW - Delayed Ordered Writes - and
>> >synchronous writing of metadata... in decreasing order of
>> >performance).
>>
>> It's this ability to delay and gather writes that prompted the
>> question. If a SCSI bus can handle 8-12MB with tagged queuing
>> and UltraDMA can do 30MB while blocking, where do the
>> performance lines cross -- or do they? As the number of spindles
>> goes up, I would expect SCSI to outperform IDE -- but on a
>> single drive system, do writes to an IDE at UDMA speeds block
>> less than gathered writes to a drive on a nominally slower SCSI
>> bus?
>
>The question you have to ask is whether or not your I/O requests
>are going to be interleaved (and therefore concurrently
>outstanding) or not.
>
>You only need four outstanding concurrent requests for your 8M
>SCSI bus to beat a 30M UltraDMA. I'll note for the record here
>that your PCI bus is capable of a 133M burst, and considerably
>less when doing continuous duty, so bus limits will hit before
>that (I rather suspect your 30M number, if it's real and not a
>"for instance", is a burst rate, not a continuous transfer rate,
>and depends either on pre-reading or a disk read cache hit).
>
>Effectively, what you are asking is "if I have a really fast
>network interface, can I set my TCP/IP window size down to one
>frame per window, and get better performance than a not quite as
>fast interface using a large window?".
>
>The answer depends on your latency. If you have zero latency,
>then you don't need the ability to interleave I/O. If you have
>non-zero latency, then interleaving I/O will help you move more
>data.
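
[An aside making the crossover concrete: a toy model of a 30 MB/s
bus that serializes requests against an 8 MB/s bus that can overlap
the drive's positioning latency across several outstanding tagged
requests. The 64 KB request size and 12 ms positioning time are
guesses for hardware of this era, not measurements; with these
numbers the slower bus pulls ahead at a queue depth of two, so
Terry's four leaves margin for slower positioning or smaller
requests.]

/*
 * Toy throughput model: completions are limited either by how much
 * positioning latency can be overlapped (queue depth) or by the raw
 * bus rate, whichever binds first.
 */
#include <stdio.h>

static double
throughput(double bus_mbs, double lat_ms, double req_kb, int depth)
{
	double xfer_ms = req_kb / 1024.0 / bus_mbs * 1000.0;
	double by_overlap = depth * 1000.0 / (lat_ms + xfer_ms); /* req/s */
	double by_bus = 1000.0 / xfer_ms;			 /* req/s */
	double reqs = by_overlap < by_bus ? by_overlap : by_bus;

	return (reqs * req_kb / 1024.0);			 /* MB/s */
}

int
main(void)
{
	int n;

	printf("UDMA 30 MB/s, serialized:    %5.2f MB/s\n",
	    throughput(30.0, 12.0, 64.0, 1));
	for (n = 1; n <= 4; n++)
		printf("SCSI  8 MB/s, queue depth %d: %5.2f MB/s\n", n,
		    throughput(8.0, 12.0, 64.0, n));
	return (0);
}
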
>For tagged command queues, the questions are:
>
>o are your seek times so fast that elevator sorting the
>  requests (only possible if the drive knows about two or more
>  requests) will not yield better performance?
>
>o is your transfer latency so low that interleaving your I/O
>  will not yield better performance?
>
>o is all of your I/O so serialized, because you are only
>  actively running a single application at a time, that you
>  won't benefit from increased I/O concurrency?
>
>> [snip]
>>
>> >So the very short answer to the question is "on a
>> >multi-application server, using soft updates doesn't mean that
>> >you wouldn't benefit from interleaving your I/O".
>>
>> Hmm... that suggests you also might not. On a single drive
>> system with soft updates, would an Ultra IDE perform worse, on
>> par or better than SCSI with a light to moderate I/O load?
>
>Worse, if there is any significance to the I/O, so long as you
>have non-zero latency.
>
>Unfortunately, with drives which lie about their true geometry,
>and which claim to be "perfect" by automatically redirecting bad
>sectors, it's not possible to elevator sort in the OS (this used
>to be common, and variable/unknown drive geometry is why this
>tuning is currently turned off by default in newfs).
>
>> >To speak to Brett's issue of RAID 5, parity is striped across
>> >all disks, and doing parity updates on one stripe on one disk
>> >will block all non-interleaved I/O to all other areas of the
>> >disk; likewise, doing a data write will prevent a parity write
>> >from completing for the other four disks in the array.
>>
>> I have seen relatively few benefits of RAID 5. Performance
>> sucks relative to mirroring across separate controllers, and
>> until you reach 10-12 drives, the cost is about the same. I
>> never thought about the parity, though. Now that I have, I like
>> RAID 5 even less.
>
>If your I/O wasn't being serialized by your drives/controller,
>the parity would be much less of a performance issue.
>
>> >This effectively means that, unless you can interleave I/O
>> >requests, as tagged command queues do, you are much worse off.
>>
>> Applying that to the single drive IDE vs. SCSI question
>> suggests that, even with higher bus burst speeds, I'm still
>> likely to end up worse off, depending on load, than I would
>> with SCSI -- soft updates notwithstanding. Is that correct?
>
>Yes, depending on load.
>
>For a single user desktop connected to a human, generally you
>only run one application at a time, and so serialized I/O is OK,
>even if you are doing streaming media to or from the disk.
>
>The master/slave bottleneck and multimedia are the reason that
>modern systems with IDE tend to have two controllers, with one
>used for the main disk, and the other used for the CDROM/DVD.
>
>> >As to the issue of spindle sync, which Brett alluded to, I
>> >don't think that it is supported for IDE, so you will be
>> >eating a full rotational latency on average, instead of one
>> >half a rotational latency, on average: (0 + 1)/2 vs. (1 + 1)/2.
>>
>> I think that just answered my earlier question.
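
[Putting numbers on the (0 + 1)/2 vs. (1 + 1)/2 averages: with
spindle sync, a mirrored transfer waits a uniformly distributed
fraction of one rotation, half a rotation on average. Without sync
it completes only after the slowest of N independent spindles, and
the expected maximum of N uniform waits is N/(N+1) of a rotation,
which tends toward Terry's full rotation as N grows. A sketch at an
assumed 7200 RPM:]

#include <stdio.h>

int
main(void)
{
	const double rot_ms = 60000.0 / 7200.0;	/* one rotation, ms */
	int n;

	printf("spindle sync, any N: %.2f ms\n", 0.5 * rot_ms);
	for (n = 2; n <= 8; n *= 2)
		printf("no sync, N = %d:     %.2f ms\n", n,
		    (double)n / (n + 1) * rot_ms);
	return (0);
}
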
>This is a RAID-specific issue. Most modern drives in a one drive
>system record sectors in descending order on a track. As soon as
>the seek completes, the drive begins caching data, and returns
>the data you asked for as soon as it has cached back far enough.
>For single application sequential read behaviour, this pre-caches
>the data that you are likely to ask for next.
>
>A different issue that probably applies to what you meant is
>that, for a multiple program machine with two or more readers,
>the value of the cache is greatly reduced, unless the drive
>supports multiple track caches, one per reader, in order to
>preserve locality of reference for each reader. This one is not
>RAID specific, but is load specific.
>
>In general, good SCSI drives will support a track cache per
>tagged command queue.
>
>It would be interesting to test algorithms for increasing I/O
>locality to keep the number of files on which I/O is outstanding
>under the number of track caches (probably by delaying I/O for
>files that were not in the last (track cache count minus one)
>I/Os, in favor of I/O requests that are).
>
>> >Rod Grimes did some experimentation with CCD and spindle sync
>> >on SCSI devices back when CCD first became capable of
>> >mirroring, and has some nice hard data that you should ask him
>> >for (or dig it out of DejaNews on the FreeBSD news group).
>>
>> Thanks -- I'll look them up. And -- I appreciate your answer. I
>> learned quite a bit from it. It did raise the question of
>> differences between soft updates and lfs -- but I'll save that
>> for another time.
>
>8-).
>
>					Terry Lambert
>					terry@lambert.org
>---
>Any opinions in this posting are my own and not those of my
>present or previous employers.
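
[A toy sketch of the I/O-locality idea Terry floats above:
dispatch requests for files that already hold one of the drive's
per-queue track caches, and defer the rest until a cache frees up.
NCACHES and every name here are invented for illustration; this is
not FreeBSD code, just the shape of the heuristic.]

#include <stdio.h>

#define NCACHES	4	/* track caches we assume the drive has */
#define NREQS	12

struct req { int file; int block; };

static int hot[NCACHES];	/* files currently holding a cache */
static int nhot;

static int
is_hot(int file)
{
	int i;

	for (i = 0; i < nhot; i++)
		if (hot[i] == file)
			return (1);
	return (0);
}

static int
try_dispatch(struct req *r)
{
	if (!is_hot(r->file)) {
		if (nhot == NCACHES)
			return (0);		/* defer: would thrash a cache */
		hot[nhot++] = r->file;		/* claim a free track cache */
	}
	printf("dispatch file %d block %d\n", r->file, r->block);
	return (1);
}

int
main(void)
{
	/* Interleaved requests from six files against four caches. */
	struct req q[NREQS] = {
		{1,0},{2,0},{3,0},{4,0},{5,0},{6,0},
		{1,1},{2,1},{3,1},{4,1},{5,1},{6,1}
	};
	int done[NREQS] = {0};
	int pass, i;

	for (pass = 0; pass < 2; pass++) {
		for (i = 0; i < NREQS; i++)
			if (!done[i])
				done[i] = try_dispatch(&q[i]);
		nhot = 0;	/* outstanding I/O drained; retire hot set */
	}
	return (0);
}

[Run against twelve interleaved requests from six files, the first
pass serves the four "hot" files and the second picks up the two
deferred ones, so no more than four distinct files ever have I/O
outstanding at once.]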