Date:      Fri, 16 Nov 2007 11:20:24 -0500
From:      "Simon" <simon@optinet.com>
To:        "Brian A Seklecki (Mobile)" <bseklecki@collaborativefusion.com>
Cc:        Sean McAfee <smcafee@collaborativefusion.com>, Jason Thomson <jason.thomson@mintel.com>, Benjie Chen <benjie@addgene.org>, "freebsd-hardware@freebsd.org" <freebsd-hardware@freebsd.org>
Subject:   Re: PERC5 (LSI MegaSAS) Patrol Read crashes
Message-ID:  <20071116161831.49FBE13C4B8@mx1.freebsd.org>
In-Reply-To: <1195217644.4042.199.camel@new-host>

LSI recommends consistency checks at least once a month. This is what
I was told: should a block holding parity go bad, the next read of it
during a consistency check will cause the parity to be recalculated and
rewritten elsewhere on the disk. This cannot happen on its own without a
consistency check. The drive will try to remap bad blocks, but it cannot do
this unless the block is either read from or written to. It won't know there
is a bad block until then. So if you develop bad blocks and a drive then
fails, you will have missing parity. Many times, there are places on the
disks that are written to once and then not used, so the disk never reads
those blocks. When you run Patrol Read, it goes through the entire disk,
including those parts that are otherwise not read until you need to rebuild.
This is why both Patrol Reads and Consistency Checks are important.
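
To make the argument above concrete, here is a minimal sketch of a
single-stripe consistency check using XOR parity (RAID-5 style). It is an
illustration only, not how the PERC5 firmware works: the function names
are made up, and a bad parity block is modelled as wrong bytes rather
than as an unreadable sector.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def consistency_check(data_blocks, stored_parity):
    """Recompute parity for one stripe.

    Returns (was_consistent, good_parity) so the caller can rewrite
    the parity block when the stored copy does not match.
    """
    good_parity = xor_blocks(data_blocks)
    return good_parity == stored_parity, good_parity

def rebuild_missing(surviving_blocks, parity):
    """Reconstruct the single missing data block from survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])

# Hypothetical three-disk stripe of two-byte blocks.
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
parity = xor_blocks(stripe)

# The parity block silently goes bad in an area nobody reads.
rotted_parity = b"\x00\x00"

# A consistency check notices the mismatch and yields parity to rewrite.
ok, repaired_parity = consistency_check(stripe, rotted_parity)

# Now the disk holding stripe[1] fails and has to be rebuilt.
survivors = [stripe[0], stripe[2]]
rebuilt_without_check = rebuild_missing(survivors, rotted_parity)
rebuilt_after_check = rebuild_missing(survivors, repaired_parity)
```

Under these assumptions, the rebuild that uses the repaired parity
recovers the lost block, while the rebuild that uses the stale parity
does not.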

Please correct me if I'm wrong.

-Simon

On Fri, 16 Nov 2007 07:53:24 -0500, Brian A Seklecki (Mobile) wrote:


>> If Patrol Reads are marketing bullshit, what do you use? consistency

>If a block is bad, a block is bad.  The disk has a certain number of
>spares.  Those are automatically allocated by the disk underneath the
>controller.  

>When that complement is exhausted, then _THE DISK IS BAD_.  A single bad
>sector on a RAID volume component disk means that you need to _REPLACE
>THE DISK_.  When a controller finds a bad sector, that disk should be
>moved to degraded state.  Period.

>As for the idea of randomly finding and fixing parity errors while the
>OS is running... hardware or otherwise related, that just sounds like a
>bad idea to me.

>I prefer the CMU RAIDFrame / GMirror approach:

>You check the volume at start-up.  You search for records of a graceful
>shutdown on both components.  If you _don't_ find them, you run a full
>parity check.

>The volume is then parity-clean until it is shut down ungracefully.  
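
The start-up check described in the quoted text can be sketched roughly
as follows. This is only an illustration of the idea; the `Component`
class and the `clean_shutdown` flag are hypothetical and do not reflect
the actual on-disk metadata of RAIDframe or gmirror.

```python
from dataclasses import dataclass

@dataclass
class Component:
    name: str
    clean_shutdown: bool  # hypothetical per-component "graceful shutdown" record

def needs_parity_check(components):
    """A full check is needed unless every component recorded a clean shutdown."""
    return not all(c.clean_shutdown for c in components)

def start_volume(components):
    """Decide whether to run a full parity check, then mark the volume in use."""
    check = needs_parity_check(components)
    for c in components:
        c.clean_shutdown = False  # cleared while the volume is running
    return check

def stop_volume(components):
    """Graceful shutdown: record the clean state on every component."""
    for c in components:
        c.clean_shutdown = True

disks = [Component("da0", True), Component("da1", True)]
first_boot = start_volume(disks)    # both records present
after_crash = start_volume(disks)   # no stop_volume() in between
stop_volume(disks)
after_clean = start_volume(disks)   # graceful shutdown was recorded
```

In this sketch only the start that follows a missing shutdown record
asks for a full parity check.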

>How could the parity be found to be bad while the OS is running if there
>are no bad components or other hardware events?? ( And why does the
>PERC5 (and for that matter, the PERC4) never scan parity at startup? )


>I asked Dell two years ago and never got an answer.  Until then, FreeBSD
>Gmirror is still a perfectly valid option.

>> The way I see it, until manufacturers such as Dell and HP start fully
>> supporting FreeBSD, the mentioned problems will never go away

>They don't have to "Support FreeBSD".  That's our job.  What they need to
>do is work with OEM component vendors who don't consider kernel-hardware
>interface I.P.  See OpenBSD.








