From owner-freebsd-stable@FreeBSD.ORG Mon Mar 31 19:01:38 2003
Date: Mon, 31 Mar 2003 22:01:25 -0500
From: "Michael C. Brenner" <mbrenner@kaibren.com>
To: Jason Andresen, FreeBSD Stable List <freebsd-stable@freebsd.org>
Subject: Re: vinum performance
Message-Id: <5.2.0.9.2.20030331214429.02677a90@gw.kaibren.com>
In-Reply-To: <3E88B601.90802@mitre.org>

At 04:41 PM 3/31/2003, Jason Andresen wrote:

>>>>> Ok. But I still don't understand why RAID 5 write performance is _so_ bad.
>>>>> The CPU is not the bottleneck, it's rather bored. And I don't understand
>>>>> why RAID 0 doesn't give a big boost at all. Is the ahc driver known to be
>>>>> slow?
>>>>
>>>> (Both of these were on previously untouched files to prevent any
>>>> caching, and the "write" test is on a new file, not rewriting an old one)
>
> Write speed:
> 81920000 bytes transferred in 3.761307 secs (21779663 bytes/sec)
> Read speed:
> 81920000 bytes transferred in 3.488978 secs (23479655 bytes/sec)
>
> But on the RAID5:
> Write speed:
> 81920000 bytes transferred in 17.651300 secs (4641018 bytes/sec)
> Read speed:
> 81920000 bytes transferred in 4.304083 secs (19033090 bytes/sec)

Writing to a RAID5 stripe set requires that all disks in the array
successfully report completion before the RAID5 controller's buffer can be
released back to the cache. (This applies to software and hardware RAID
alike.) If you are doing a large block write (like dd) you can easily fill
the cache on most controllers. Once the cache is full, the controller slows
each write to the LONGEST completion time of any spindle in the array.
Parity (ECC) calculation adds to the latency as well. In a 5-drive system
(other than one where the cache is larger than the largest file being
written, as in a large EMC array), writes are always about 4-5 times slower
than reads. Tuning stripe sizes and blocking factors can speed up a
specific transfer, but RAID5 has always been slow at writing large amounts
of data and is best suited to read-mostly data.

Read operations, by contrast, benefit from RAID5 or mirroring. There the
shortest completion time of the minimal drive set is the gating event: the
first set of drives to deliver the data block ends the operation. That is
what turns a 2-to-1 difference into a 4-to-1 difference.

MB
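
[Editor's sketch, not part of the original post: a back-of-the-envelope
Python model of the write penalty Michael describes. It recomputes the
throughput from the dd figures quoted in the thread and then illustrates
the max-vs-min gating argument; the per-spindle latencies below are
invented for illustration only, not measurements from the poster's array.]

    #!/usr/bin/env python3
    # Rough model of the RAID5 write penalty discussed in the thread.
    # The dd figures are the ones quoted above; the per-spindle latencies
    # further down are illustrative assumptions only.

    BYTES = 81920000

    # (label, write seconds, read seconds) from the quoted dd runs
    runs = [
        ("single disk / stripe", 3.761307, 3.488978),
        ("vinum RAID5",          17.651300, 4.304083),
    ]

    for label, w_secs, r_secs in runs:
        w_rate = BYTES / w_secs / 1e6   # MB/s
        r_rate = BYTES / r_secs / 1e6   # MB/s
        print(f"{label:22s} write {w_rate:5.1f} MB/s  read {r_rate:5.1f} MB/s  "
              f"read/write ratio {r_rate / w_rate:.1f}x")

    # Why writes lag once the cache is full: every stripe write must wait
    # for the *slowest* member disk, while a read is satisfied by the first
    # (minimal) set of disks that returns the block.
    # Hypothetical per-disk service times for one stripe on a 5-drive set:
    latencies = [0.010, 0.011, 0.012, 0.014, 0.018]   # seconds, assumed

    write_gate = max(latencies)   # all members must report completion
    read_gate = min(latencies)    # fastest responder ends the operation
    print(f"gating alone: write {write_gate*1e3:.0f} ms vs read "
          f"{read_gate*1e3:.0f} ms ({write_gate / read_gate:.1f}x)")

    # On top of that, a partial-stripe RAID5 update is a read-modify-write:
    # read old data + read old parity + write new data + write new parity,
    # i.e. roughly four disk I/Os where a plain write needs one.

Run as-is, this prints a read/write ratio of roughly 4x for the quoted
RAID5 numbers, which is consistent with the 4-5x range described above.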