Date: Mon, 02 May 2005 10:39:18 -0400
From: Allen <bsdlists@rfnj.org>
To: Arne "Wörner" <arne_woerner@yahoo.com>
Cc: freebsd-performance@freebsd.org
Subject: Re: Very low disk performance on 5.x
Message-ID: <6.2.1.2.2.20050502103041.037618d0@mail.rfnj.org>
In-Reply-To: <20050502141456.90371.qmail@web41212.mail.yahoo.com>
References: <6.2.1.2.2.20050502094757.037077f0@mail.rfnj.org> <20050502141456.90371.qmail@web41212.mail.yahoo.com>
At 10:14 5/2/2005, Arne "Wörner" wrote:
>--- Allen <bsdlists@rfnj.org> wrote:
> > Also you should keep in mind, there could simply be some really
> > goofy controller option enabled that forces the RAID5 to behave
> > in a "degraded" state for reads -- forcing it to read up all the
> > other disks in the stripe and calculate the XOR again, to make
> > sure the data it read off the disk matches the checksum. It's
> > rare, but I've seen it before, and it will cause exactly this
> > sort of RAID5 performance inversion. Since the XOR is
> > recalculated on every write and requires only reading up one
> > sector on a different disk, options that do the above will
> > result in read scores drastically lower than writes to the same
> > array.
>
>Isn't that compensated by the cache? I mean:
>We would just
>1. read all the blocks that correspond to the block that is
>requested,
>2. put them all into the cache
>3. check the parity bits (XOR should be very fast, especially in
>comparison to the disc read times)
>4. keep them in the cache (some kind of read ahead...)
>5. send the requested block to the driver

Your steps are appropriate, but note that #3 is not true on cards
that support RAID5 but do not have hardware XOR. Some of the very
cheap i960-based cards have this failing, so the XOR itself is slow
on top of everything else.

However, the cache doesn't play a part in this at all. It's the
difference between these two read cycles; assume the request touches
just one block out of the 4 data blocks, on a 5-disk system.

Scenario A, verified read disabled:
1. RAID card reads up one block from the appropriate drive. Done.

Scenario B, verified read enabled:
1. RAID card reads up ALL blocks in the stripe (5 reads).
2. RAID card pretends the block requested is on a "degraded" drive,
and calculates it from the other 3 data blocks + the XOR stripe.
3. RAID card reports the value back, or tosses some kind of error if
the reconstructed block doesn't match what the disk returned.

As you can see, the cache just doesn't play a part in what I was
describing, which is basically the array performing as though it is
degraded when in fact it is not, in order to catch failures that
would otherwise be missed.
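To make the difference between the two read cycles concrete, here is
a toy sketch in C of both scenarios. Everything in it -- block size,
layout, all the names -- is my own simplification for illustration,
not anything a real card's firmware does:

/*
 * Toy model of the two RAID5 read cycles described above.
 * 4 data disks + 1 parity disk; all sizes and names are made up.
 */
#include <stdio.h>
#include <string.h>

#define NDATA	4		/* data blocks per stripe */
#define BLKSZ	16		/* toy block size in bytes */

/* Parity block = XOR of all data blocks (recomputed on every write). */
static void
xor_parity(unsigned char data[NDATA][BLKSZ], unsigned char parity[BLKSZ])
{
	memset(parity, 0, BLKSZ);
	for (int d = 0; d < NDATA; d++)
		for (int i = 0; i < BLKSZ; i++)
			parity[i] ^= data[d][i];
}

/* Scenario A: one disk read, no checking. */
static void
plain_read(unsigned char data[NDATA][BLKSZ], int blk,
    unsigned char out[BLKSZ])
{
	memcpy(out, data[blk], BLKSZ);
}

/*
 * Scenario B: read the whole stripe (NDATA+1 reads), rebuild the
 * requested block from the other data blocks + parity as if its
 * disk were degraded, and compare against what the disk actually
 * returned.  Returns 0 on match, -1 on mismatch (the "tosses some
 * kind of error" case).
 */
static int
verified_read(unsigned char data[NDATA][BLKSZ],
    unsigned char parity[BLKSZ], int blk, unsigned char out[BLKSZ])
{
	unsigned char rebuilt[BLKSZ];

	memcpy(rebuilt, parity, BLKSZ);
	for (int d = 0; d < NDATA; d++)
		if (d != blk)
			for (int i = 0; i < BLKSZ; i++)
				rebuilt[i] ^= data[d][i];

	if (memcmp(rebuilt, data[blk], BLKSZ) != 0)
		return (-1);	/* silent corruption caught */
	memcpy(out, rebuilt, BLKSZ);
	return (0);
}

int
main(void)
{
	unsigned char data[NDATA][BLKSZ] = {{0}};
	unsigned char parity[BLKSZ], out[BLKSZ];

	strcpy((char *)data[1], "hello raid5");
	xor_parity(data, parity);

	plain_read(data, 1, out);
	printf("plain read:     %s\n", out);

	printf("verified read:  %s\n",
	    verified_read(data, parity, 1, out) == 0 ?
	    (char *)out : "MISMATCH");

	data[1][0] ^= 0xff;	/* flip a bit: simulated bad sector */
	printf("after corrupt:  %s\n",
	    verified_read(data, parity, 1, out) == 0 ?
	    (char *)out : "MISMATCH");
	return (0);
}

The point of the sketch: Scenario B costs five reads plus an XOR
pass for every one-block request, which is exactly why reads crater
relative to writes when a "verify on read" style option is enabled.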