From owner-cvs-all Mon Oct 29 0:46:43 2001
Delivered-To: cvs-all@freebsd.org
Received: from monorchid.lemis.com (monorchid.lemis.com [192.109.197.75])
    by hub.freebsd.org (Postfix) with ESMTP
    id 483C837B405; Mon, 29 Oct 2001 00:46:20 -0800 (PST)
Received: by monorchid.lemis.com (Postfix, from userid 1004)
    id 242C6786E1; Mon, 29 Oct 2001 19:16:07 +1030 (CST)
Date: Mon, 29 Oct 2001 19:16:07 +1030
From: Greg Lehey
To: Bruce Evans
Cc: Matthew Jacob, Doug Rabson, Luigi Rizzo, John Baldwin,
    Jonathan Lemon, cvs-all@FreeBSD.org, cvs-committers@FreeBSD.org
Subject: Re: RAID-5 parity calculations (was: cvs commit: src/sys/dev/fxp if_fx)
Message-ID: <20011029191607.C19178@monorchid.lemis.com>
References: <20011029100728.D88146@monorchid.lemis.com> <20011029190516.G9442-100000@delplex.bde.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20011029190516.G9442-100000@delplex.bde.org>; from bde@zeta.org.au on Mon, Oct 29, 2001 at 07:20:38PM +1100
Organization: The FreeBSD Project
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.FreeBSD.org/
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF 13 24 52 F8 6D A4 95 EF
Sender: owner-cvs-all@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Monday, 29 October 2001 at 19:20:38 +1100, Bruce Evans wrote:
> On Mon, 29 Oct 2001, Greg Lehey wrote:
>
>> On Sunday, 28 October 2001 at 22:57:33 +1100, Bruce Evans wrote:
>>> On Sat, 27 Oct 2001, Greg Lehey wrote:
>>>
>>>> On Thursday, 25 October 2001 at 15:24:06 -0700, Matt Jacob wrote:
>>>>>
>>>>> And the fastest software RAID-V I've known was at NASA/Ames on the
>>>>> Convex 3280s- they used the otherwise unused vector units for parity
>>>>> calculations- this gave write performance for a 22 wide stripe on a
>>>>> terabyte filesystem to be at about 88% of theoretical maximum, which
>                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> sure ain't bad.
>>>>
>>>> The parity calculations for RAID-5 are several orders of magnitude
>>>> faster than the disk accesses. Even on a 486, they took hardly any
>>>> time.
>>>
>>> Actually, a 486 can't possibly have been more than about one order of
>>> magnitude faster than the disk accesses, since main memory was only
>>> that much faster (usually less). My 486DX2/66 has 15MB/sec main memory
>>> and a 2MB/sec disk. It would be possible to upgrade the disk (but not
>>> the memory). Then the disk would want to transfer at about half an
>>> order of magnitude faster than the memory.
>>
>> My claims are based on measurements, not theory.
>
> So are mine. Since we are talking about the theoretical maximum, only
> certain measurements are relevant.

My point was that any practical results which contradict the theoretical
would cast doubt on the theory:

> Actually, a 486 can't possibly have been more than about one order
> of magnitude faster than the disk accesses,

It's a valid point, however, that if you're really writing
sequentially to a RAID-5 stripe, things change. In particular, you
would use a different algorithm which would write whole stripes only.
Vinum doesn't do this, because this kind of access remains almost
completely theoretical, and it would harm the normal case of random
accesses.

>> You're forgetting that most of the transfer time is in positioning.
>> That's why (in the original message) I mentioned the transfer size. A
>> 2 MB/s disk is fast for those days; I've seen more like 800 kB/s. Even
>> accepting your values, the average seek time is 10 ms (check with
>> rawio if you have a different expectation). Such a disk, doing
>> transfers of 6 kB, will perform about 75 random transfers per second,
>> or about 450 kB/s. (By comparison, a disk with 800 kB/s transfer rate
>> would perform about 57 transfers).
>
> I didn't forget this. It's not interesting that the disk can be slowed
> down by a huge factor by writing tinygrams.

It's immensely interesting. That's the way most things work nowadays.
Look at the output of iostat or a similar tool. Sure, it's a lot
sexier to talk about sequential transfers, but they're extremely
uncommon.

> I also intentionally didn't mention that main memory speed might not
> be a factor because the i/o is already pessimized by using PIO.
> (The extra main memory accesses for parity computations may reduce
> the main memory accesses for PIO.)

Yes, that's a valid point, and I was also ignoring PIO. My tests were
done with SCSI (AHA 1542B in the case of the 486).

Greg
--
See complete headers for address and phone numbers

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message
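
The random-transfer figures quoted in the message (about 75 transfers per second for a 2 MB/s disk, about 57 for an 800 kB/s disk) can be roughly reproduced with a back-of-the-envelope model. The sketch below is not from the original message. It assumes each random transfer costs one average seek plus the time to move the data, ignores rotational latency, and treats 2 MB/s as 2000 kB/s. The function names are hypothetical.

```python
def random_transfers_per_second(seek_ms, transfer_kb, rate_kb_per_s):
    """Estimate random transfers per second for a single disk.

    Assumed model (not stated in the message): each transfer costs
    one average seek plus transfer_kb / rate_kb_per_s of data time.
    """
    transfer_ms = transfer_kb / rate_kb_per_s * 1000.0
    return 1000.0 / (seek_ms + transfer_ms)


def random_throughput_kb_per_s(seek_ms, transfer_kb, rate_kb_per_s):
    """Effective throughput in kB/s under the same assumed model."""
    per_second = random_transfers_per_second(seek_ms, transfer_kb, rate_kb_per_s)
    return per_second * transfer_kb


if __name__ == "__main__":
    # Values taken from the message: 10 ms seek, 6 kB transfers.
    for rate in (2000.0, 800.0):
        n = random_transfers_per_second(10.0, 6.0, rate)
        t = random_throughput_kb_per_s(10.0, 6.0, rate)
        print(f"{rate:6.0f} kB/s disk: {n:5.1f} transfers/s, {t:6.1f} kB/s")
```

Under these assumptions the model gives about 77 transfers per second (roughly 460 kB/s) for the faster disk and about 57 for the slower one, which is in line with the rounded figures in the message. Either way, the effective rate is far below the sequential transfer rate, which is the point being argued.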