Date:      Mon, 02 May 2005 10:39:18 -0400
From:      Allen <bsdlists@rfnj.org>
To:        Arne "Wörner" <arne_woerner@yahoo.com>
Cc:        freebsd-performance@freebsd.org
Subject:   Re: Very low disk performance on 5.x
Message-ID:  <6.2.1.2.2.20050502103041.037618d0@mail.rfnj.org>
In-Reply-To: <20050502141456.90371.qmail@web41212.mail.yahoo.com>
References:  <6.2.1.2.2.20050502094757.037077f0@mail.rfnj.org> <20050502141456.90371.qmail@web41212.mail.yahoo.com>

At 10:14 5/2/2005, Arne "Wörner" wrote:
>--- Allen <bsdlists@rfnj.org> wrote:
> > Also you should keep in mind, there could simply be some really
> > goofy controller option enabled, that forces the RAID5 to behave
> > in a "degraded" state for reads -- forcing it to read up all the
> > other disks in the stripe and calculate the XOR again, to make
> > sure the data it read off the disk matches the checksum.  It's
> > rare, but I've seen it before, and it will cause exactly this
> > sort of RAID5 performance inversion.  Since the XOR is
> > recalculated on every write and requires only reading up one
> > sector on a different disk, options that do the above will
> > result in read scores drastically lower than writes to the same
> > array.
> >
>Isn't that compensated by the cache? I mean, we would just:
>1. read all the blocks that correspond to the requested block,
>2. put them all into the cache,
>3. check the parity bits (XOR should be very fast, especially in
>comparison to the disc read times),
>4. keep them in the cache (some kind of read-ahead...),
>5. send the requested block to the driver.

Your steps are appropriate, but you should note that #3 is not true on
cards that support RAID5 but do not have hardware XOR.  Some of the very
cheap i960-based cards have this failing, so the XOR itself is slow on
top of everything else.
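
For what it's worth, here is roughly what the check in step #3 amounts
to in plain C.  This is only a sketch -- the 512-byte block size and
4+1 disk layout are made-up numbers, not anything from a particular
card -- but it shows the loop a card without hardware XOR ends up
burning its (slow) i960 on:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define BLOCK_SIZE 512   /* illustrative block size */
    #define DATA_DISKS 4     /* 5-disk RAID5: 4 data + 1 parity per stripe */

    /* Step #3: XOR all the data blocks together and compare the result
     * against the parity block.  Returns 1 on a match. */
    static int parity_ok(uint8_t data[DATA_DISKS][BLOCK_SIZE],
                         uint8_t parity[BLOCK_SIZE])
    {
        uint8_t acc[BLOCK_SIZE];
        memcpy(acc, data[0], BLOCK_SIZE);
        for (int d = 1; d < DATA_DISKS; d++)
            for (int i = 0; i < BLOCK_SIZE; i++)
                acc[i] ^= data[d][i];
        return memcmp(acc, parity, BLOCK_SIZE) == 0;
    }

    int main(void)
    {
        uint8_t data[DATA_DISKS][BLOCK_SIZE] = {{0xAA}, {0x55}, {0x0F}, {0xF0}};
        uint8_t parity[BLOCK_SIZE] = {0};

        /* Build a consistent stripe: parity is the XOR of the data
         * blocks, just as the write path would have left it. */
        for (int d = 0; d < DATA_DISKS; d++)
            for (int i = 0; i < BLOCK_SIZE; i++)
                parity[i] ^= data[d][i];

        printf("parity ok: %d\n", parity_ok(data, parity));  /* prints 1 */
        return 0;
    }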

However, the cache doesn't play a part in this at all.  It's the
difference between these two read cycles; assume just one block out of
4 was written, on a 5-disk system.

Scenario A, verified read disabled:
1. RAID card reads up one block from appropriate drive.  Done.

Scenario B, verified read enabled:
1. RAID card reads up ALL blocks in the stripe (5 reads).
2. RAID card pretends the block requested is on a "degraded" drive, and
calculates it from the other 3 + the XOR stripe.
3. RAID card reports the value back, or tosses some kind of error.
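
Again, purely as an illustration (same made-up block size and disk
count as the sketch above), step 2 is just the standard degraded-mode
reconstruction; the verified read then compares the rebuilt block
against the one it actually read:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define BLOCK_SIZE 512   /* illustrative block size */
    #define DATA_DISKS 4     /* 5-disk RAID5: 4 data + 1 parity per stripe */

    /* Rebuild one data block from the other three plus parity, exactly
     * as if its drive were dead. */
    static void reconstruct(uint8_t others[DATA_DISKS - 1][BLOCK_SIZE],
                            uint8_t parity[BLOCK_SIZE],
                            uint8_t out[BLOCK_SIZE])
    {
        memcpy(out, parity, BLOCK_SIZE);
        for (int d = 0; d < DATA_DISKS - 1; d++)
            for (int i = 0; i < BLOCK_SIZE; i++)
                out[i] ^= others[d][i];
    }

    int main(void)
    {
        uint8_t blocks[DATA_DISKS][BLOCK_SIZE] = {{0x11}, {0x22}, {0x33}, {0x44}};
        uint8_t parity[BLOCK_SIZE] = {0};
        uint8_t rebuilt[BLOCK_SIZE];

        for (int d = 0; d < DATA_DISKS; d++)
            for (int i = 0; i < BLOCK_SIZE; i++)
                parity[i] ^= blocks[d][i];

        /* Pretend blocks[0] is on the "degraded" drive and rebuild it
         * from blocks 1..3 plus parity; a verified read would compare
         * this against the copy it actually read off the disk. */
        reconstruct(&blocks[1], parity, rebuilt);
        printf("match: %d\n", memcmp(rebuilt, blocks[0], BLOCK_SIZE) == 0);
        return 0;
    }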

You can see that the cache just doesn't play a part in what I was
describing, which is basically the array performing as though it is
degraded when in fact it is not, in order to catch failures that would
otherwise be missed.



