From owner-freebsd-scsi Sat Mar 20 14:15:18 1999
Delivered-To: freebsd-scsi@freebsd.org
Received: from allegro.lemis.com (allegro.lemis.com [192.109.197.134])
    by hub.freebsd.org (Postfix) with ESMTP id EC50914ED6
    for ; Sat, 20 Mar 1999 14:14:58 -0800 (PST)
    (envelope-from grog@freebie.lemis.com)
Received: from freebie.lemis.com (freebie.lemis.com [192.109.197.137])
    by allegro.lemis.com (8.9.1/8.9.0) with ESMTP id IAA09922;
    Sun, 21 Mar 1999 08:44:39 +1030 (CST)
Received: (from grog@localhost)
    by freebie.lemis.com (8.9.3/8.9.0) id IAA93881;
    Sun, 21 Mar 1999 08:44:37 +1030 (CST)
Message-ID: <19990321084436.Z429@lemis.com>
Date: Sun, 21 Mar 1999 08:44:36 +1030
From: Greg Lehey
To: Tom
Cc: Nick Hilliard , freebsd-scsi@FreeBSD.ORG
Subject: Re: dpt raid-5 performance
References: <19990321074613.V429@lemis.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.93.2i
In-Reply-To: ; from Tom on Sat, Mar 20, 1999 at 12:58:40PM -0800
WWW-Home-Page: http://www.lemis.com/~grog
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-41-739-7062
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Saturday, 20 March 1999 at 12:58:40 -0800, Tom wrote:
>
> On Sun, 21 Mar 1999, Greg Lehey wrote:
>
>> On Saturday, 20 March 1999 at 8:55:27 -0800, Tom wrote:
>>>
>>> On Fri, 19 Mar 1999, Nick Hilliard wrote:
>>>
>>>> If I set up a raid-5 system on these disks with a slice-size of 512K, the
>>>> write performance is terrible. Bonnie reports:
>>>
>>> 512k will not result in very good single process write performance.
>>> In fact, 512k is suboptimal for many operations on RAID-5 arrays.
>>
>> I disagree. It's pretty much optimal for Vinum. I'd be interested in
>> your reasoning.
>
> Not so optimum for a DPT card running RAID-5.

Why not?

> FreeBSD will likely never send IOs big enough for the strip size of
> 512kb to ever be useful.

It does. The largest I/O transfer is 64 kB.
You don't want to fragment requests, because that increases the I/O
load on the array.

>>> If you are optimizing for a single process, use something smaller
>>> (like 8k).
>>
>> That's far, far too small.
>
> Why?

From an earlier message:

> On Thursday, 17 December 1998 at 1:05:29 -0500, Matthew Patton wrote:
>> Here are my suggestions, all are predominantly HW based. In all cases, 1
>> hot spare per channel. Plenty of ambient cooling and power regulation is
>> needed. In addition to the simple 'art' of RAID selection there is also
>> 'slice/interleave' size to factor in. With drives doing full track read
>> aheads with their own fancy algorithms and varying amounts of onboard
>> cache, precise numbers are very difficult to come up with. It depends on
>> what kinds of IO you do. On a database, I would probably use 16 or
>> 32kb.
>
> On FreeBSD, this is too small.
>
>> A media server (large files) 64kb or more. On boxes with lots of
>> small sizes, accessed randomly and rapidly, 8 or 16kb.
>
> Far too small.
>
>> But you don't want to fill up the controller's command queue with
>> too many commands.
>
> That's not the big problem. The fact is that the block I/O system
> issues requests of between 0.5 kB and 60 kB; a typical mix is somewhere
> round 8 kB. You can't stop any striping system from breaking a
> request into two physical requests, and if you do it wrong it can be
> broken into several. This will result in a significant drop
> in performance: the decrease in transfer time per disk is offset by
> the order of magnitude greater increase in latency.
>
> With modern disk sizes and the FreeBSD block I/O system, you can
> expect to have a reasonably small number of fragmented requests with a
> stripe size between 256 kB and 512 kB; I can't see any reason not to
> increase the size to 2 or 4 MB on a large disk.
>
> The easiest way to consider the impact of any transfer is the total
> time it takes: since just about everything is cached, the time
> relationship between the request and its completion is not important.
> Consider, then, a typical news article of 24 kB, which will probably
> be read in a single I/O. Take disks with a transfer rate of 6 MB/s
> and an average positioning time of 8 ms, and a file system with 4 kB
> blocks. Since it's 24 kB, we don't have to worry about fragments, so
> the file will start on a 4 kB boundary. The number of transfers
> required depends on where the block starts: it's (S + F - 1) / S,
> where S is the stripe size in file system blocks, and F is the file
> size in file system blocks.
>
> 1: Stripe size of 4 kB. You'll have 6 transfers. Total subsystem
>    load: 48 ms latency, 2 ms transfer, 50 ms total.
>
> 2: Stripe size of 8 kB. On average, you'll have 3.5 transfers. Total
>    subsystem load: 28 ms latency, 2 ms transfer, 30 ms total.
>
> 3: Stripe size of 16 kB. On average, you'll have 2.25 transfers.
>    Total subsystem load: 18 ms latency, 2 ms transfer, 20 ms total.
>
> 4: Stripe size of 256 kB. On average, you'll have 1.08 transfers.
>    Total subsystem load: 8.6 ms latency, 2 ms transfer, 10.6 ms total.
>
> These calculations are borne out in practice.

I haven't yet replied to Nick's message because I wanted to check
something here first, and I've been too busy so far. But I'll come
back with some comparisons.

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
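
The stripe-size arithmetic in the quoted message can be checked with a
short script. This is only a sketch under the message's own figures
(4 kB blocks, 8 ms average positioning time, 6 MB/s transfer rate,
24 kB file); the function and constant names are made up for
illustration, and 1 MB is assumed to be 1000 kB. It applies the
(S + F - 1) / S formula and charges one positioning delay per
physical transfer.

```python
# Figures taken from the message; the names are hypothetical.
BLOCK_KB = 4          # file system block size
POSITIONING_MS = 8.0  # average positioning time per physical transfer
RATE_MB_S = 6.0       # disk transfer rate (assuming 1 MB = 1000 kB)


def expected_transfers(stripe_kb, file_kb, block_kb=BLOCK_KB):
    """Average number of physical transfers: (S + F - 1) / S, in blocks."""
    s = stripe_kb // block_kb   # stripe size in file system blocks
    f = file_kb // block_kb     # file size in file system blocks
    return (s + f - 1) / s


def subsystem_load_ms(stripe_kb, file_kb):
    """Return (transfers, latency_ms, transfer_ms, total_ms)."""
    n = expected_transfers(stripe_kb, file_kb)
    latency = n * POSITIONING_MS
    # kB divided by MB/s gives milliseconds when 1 MB = 1000 kB.
    transfer = file_kb / RATE_MB_S
    return n, latency, transfer, latency + transfer


if __name__ == "__main__":
    for stripe in (4, 8, 16, 256):
        n, lat, xfer, total = subsystem_load_ms(stripe, 24)
        print(f"{stripe:4d} kB stripe: {n:.2f} transfers, "
              f"{lat:.1f} ms latency, {xfer:.1f} ms transfer, "
              f"{total:.1f} ms total")
```

The transfer counts (6, 3.5, 2.25 and about 1.08) and the latency
figures agree with the message. With the stated 6 MB/s, the transfer
term for 24 kB works out to about 4 ms rather than the 2 ms quoted;
it is the same constant for every stripe size, so the comparison
between stripe sizes is unaffected.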