From: Andrew Gallatin <gallatin@cs.duke.edu>
Date: Mon, 12 Apr 1999 10:52:32 -0400 (EDT)
To: Greg Lehey
Cc: freebsd-scsi@FreeBSD.ORG
Subject: Re: odd performance 'bug' & other questions
In-Reply-To: <19990412112149.F2142@lemis.com>

Greg Lehey writes:
 > On Sunday, 11 April 1999 at 19:14:25 -0400, Andrew Gallatin wrote:
 > >
 > > We're setting up a few large servers, each with 6 9GB seagate medalist
 > > pro drives spread across 2 scsi controllers (aic7890 & ncr53c875).
 > >
 > > We've noticed that if we set up the disks using a simple ccd stripe,
 > > after trying various interleaves, the best read bandwidth we can get
 > > is only ~35-40MB/sec (using dd if=/dev/rccd0 of=/dev/null bs=64k),
 > > which is odd because we'd thought we should be getting at least
 > > 55-60MB/sec, as we get about 13.5MB/sec from each drive with the same
 > > test.
 >
 > This kind of test is not very useful, and may be counterproductive.
 > What size stripe did you use?

64kb. 80kb shows the same results.

 > I'm attaching a plot of Vinum performance against stripe size with a
 > volume spanning four very slow disks.  I briefly tested ccd and got
 > very similar results.  These were done with 'rawio' against the
 > character device with 8 concurrent processes.  You'll note that
 > sequential I/O performance peaks at about 80 kB stripes, whereas
 > random transfers (which are presumably closer to what you're doing)

I'm (at least trying) to do sequential.  I'm just concerned that the
number of defects might mean these drives really don't do sequential.

 > improve with increasing stripe size.  You'll also notice that the
 > performance ratio is approximately the same as you describe, rather
 > less than 3x the single disk, but this is misleading, since I only had
 > four drives in this test.
 >
 > You'll also notice that the performance figures are terrible; that's
 > because they were done on some ancient drives (look at the throughput
 > of the raw drives without Vinum).
 >
 > > Upon closer examination, we discovered that on some of the drives the
 > > performance wanders all over the place -- if you do a dd if=/dev/rX
 > > of=/dev/null bs=64k on an individual disk on an otherwise idle system
 > > & watch with iostat or systat, you can see the bandwidth jump around
 > > quite a bit.  I'm thinking that my performance problems might be due
 > > to the fact that the reads aren't really sequential, rather the disk
 > > arm is moving all over the place to read remapped defective blocks.
 >
 > Have you considered rotational latency?  Unless you spindle-sync the
 > drives, you can end up with random delays in moving from one drive to
 > the next.  If you spindle-sync them, you may or may not incur a
 > whole-rotation delay :-)  I've done some tests here, and they show the
 > same effects for single processes.

This sounds interesting... how do I spindle-sync the drives?
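(Back-of-the-envelope, assuming these Medalist Pros spin at 7200 rpm:
one rotation is 60/7200 s ~= 8.3 ms, i.e. ~4.2 ms average rotational
latency.  A 64k transfer at the ~13.5MB/sec we measure per drive takes
~4.8 ms, so if each stripe-sized read pays an average half-rotation
waiting for its data to come around, a drive tops out near
64k / (4.8 ms + 4.2 ms) ~= 7MB/sec -- suspiciously close to the
~6.5MB/sec per drive we're actually seeing.)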
Also, how does vinum deal with stripes across multiple controllers?
E.g., I have:

da0:ahc0:0:0:0
da1:ahc0:0:1:0
da2:ahc0:0:2:0
da3:ncr0:0:0:0
da4:ncr0:0:1:0
da5:ncr0:0:2:0

Using ccd, if I stripe them so that each component is attached to
alternating controllers (which I think is the right way to do it):

ccd0	64	none	/dev/da0c /dev/da3c /dev/da1c /dev/da4c /dev/da2c /dev/da5c

I see about 40MB/sec total from dd if=/dev/rccd0c of=/dev/null bs=64k,
or about 6.5MB/sec per drive.  If I stripe them like this:

ccd0	64	none	/dev/da0c /dev/da1c /dev/da2c /dev/da3c /dev/da4c /dev/da5c

I see about 29MB/sec, or about 4.8MB/sec from each drive.  If I use
vinum, no matter how I organize the stripe section, I always get about
27MB/sec.

Cheers,

Drew

------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer	http://www.cs.duke.edu/~gallatin
Duke University				Email: gallatin@cs.duke.edu
Department of Computer Science		Phone: (919) 660-6590
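P.S. For concreteness, a vinum create file along the following lines is
what I mean by alternating the stripe across the two controllers.  This
is a sketch only -- the drive names and the 64k stripe size are
illustrative, not the exact config behind the numbers above:

drive c0d0 device /dev/da0c
drive c1d0 device /dev/da3c
drive c0d1 device /dev/da1c
drive c1d1 device /dev/da4c
drive c0d2 device /dev/da2c
drive c1d2 device /dev/da5c
volume test
  plex org striped 64k
    sd length 0 drive c0d0
    sd length 0 drive c1d0
    sd length 0 drive c0d1
    sd length 0 drive c1d1
    sd length 0 drive c0d2
    sd length 0 drive c1d2

(In practice vinum wants its own vinum-type partition rather than the
raw c partition, but the shape is the same.)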