From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Poul-Henning Kamp"
Cc: Eric Anderson, Robert Watson, freebsd-performance@freebsd.org
Date: Mon, 2 May 2005 17:43:16 +0100
Subject: Re: Very low disk performance on 5.x
Message-ID: <003901c54f36$0c64ad40$b3db87d4@multiplay.co.uk>
List-Id: Performance/tuning <freebsd-performance@freebsd.org>

----- Original Message -----
From: "Poul-Henning Kamp"

>> Interesting stuff, so:
>> 1. How do we test if this is happening?
> Calculate by hand what the offset of the striped/raid part of the disk
> is (ie: take slice+partition stats into account).

How's that done? An explained example would be good.

>> 3. Why would this be affecting reads and not writes, as surely the same
>> blocking is being done for both?

> Write on RAID5 uses a cache which lies to you about when things are
> safely stored on the disk.

You're assuming I have cache on the card; I don't. So the question still
remains.

> Good RAID5 has battery backup for that cache.
>
> The MBR slice format is stupid because it more often than not gets
> this exactly wrong. Typically there are 63 "sectors per track" and
> that ruins any alignment in 99% of the cases.

Surely if this is known about, it is something that should have been fixed
ages ago, as it will be crippling everyone.

> Sysinstall, fdisk and bsdlabel should know about all this and try
> to help the user get it right. Fixing them to do so may be more
> trouble than writing a better tool bottom up.

OK, from what you're saying it sounds like RAID on FreeBSD is useless
except for creating large disks.

Now to the damaging facts: the results from my two days' worth of testing:

*FreeBSD 6.0-CURRENT*
H/W RAID5 (5 disk) 16kb Stripe          Write: 137Mb/s  Read: 131Mb/s

*FreeBSD 5.4-STABLE*
H/W RAID5 16kb Stripe                   Write: 138Mb/s  Read: 130Mb/s
H/W RAID5 32kb Stripe                   Write: 137Mb/s  Read: 115Mb/s
H/W RAID5 64kb Stripe (Default)         Write: 139Mb/s  Read: 88Mb/s
H/W RAID5 1M Stripe                     Write: 141Mb/s  Read: 51Mb/s
S/W RAID5 Default Stripe (vinum)        Write: 6Mb/s    Read: 23Mb/s

*FreeBSD 4.11-RELEASE*
H/W RAID5 16kb Stripe                   Write: 138Mb/s  Read: 130Mb/s

*Linux (SuSE 9.1)*
H/W RAID5 16kb Stripe                   Write: 105Mb/s  Read: 137Mb/s
H/W RAID5 32kb Stripe                   Write: 112Mb/s  Read: 182Mb/s
H/W RAID5 64kb Stripe (Default)         Write: 120Mb/s  Read: 122Mb/s
H/W RAID5 1M Stripe                     Write: 117Mb/s  Read: 102Mb/s
S/W RAID5 Default Stripe (Linux RAID)   Write: 269Mb/s  Read: 259Mb/s

*Summary / Conclusions*

1. Linux on this controller/disk set is significantly quicker using H/W
RAID, logging a max read rate of 182Mb/s compared to 131Mb/s for FreeBSD.

2. The version of FreeBSD used (for this controller) made no significant
difference to performance; they were all poor.

3. Software RAID in Linux totally blows away all the other configurations,
logging a max sustained read rate of 259Mb/s and write of 269Mb/s, which
shows the disks / controller are capable of producing the expected good
performance. In comparison, FreeBSD's vinum is not even worth using.

N.B. vinum's extremely poor performance could have been down to a poor
default config, but there are no performance tuning details to be found in
the docs.

*Test method / Hardware*

Dual 244 Opteron, 2Gb ECC RAM
Highpoint 1820a controller in a PCI-X 133MHz slot
5 x Seagate 400GB SATA disks

Write (6Gb): dd if=/dev/zero of=/mnt/testfile bs=64k count=100000
Read (6Gb):  dd if=/mnt/testfile of=/dev/null bs=64k count=100000

and the following to check whether block size was a factor / to test a
typical app read:

/usr/bin/time -h cat /mnt/testfile > /dev/null

All tests were done on an empty formatted partition 100Gb in size, on a
freshly initialised RAID5 array. The OS was on an independent disk off the
motherboard (not connected to the RAID controller). All partition / file
system creation was done using the OS default tool, i.e. FreeBSD:
sysinstall, SuSE: yast.

Note: FreeBSD 4.11 sysinstall's label was not functional on this array, so
it was created with a manual disklabel + newfs.

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it.
In the event of misdirection, illegible or incomplete transmission please telephone (023) 8024 3137 or return the E.mail to postmaster@multiplay.co.uk.