From owner-freebsd-stable@FreeBSD.ORG  Sat Aug 25 10:04:47 2007
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 20C1716A418
	for <freebsd-stable@freebsd.org>; Sat, 25 Aug 2007 10:04:47 +0000 (UTC)
	(envelope-from tom@tomjudge.com)
Received: from smtp809.mail.ird.yahoo.com (smtp809.mail.ird.yahoo.com
	[217.146.188.69])
	by mx1.freebsd.org (Postfix) with SMTP id 9948313C46A
	for <freebsd-stable@freebsd.org>; Sat, 25 Aug 2007 10:04:46 +0000 (UTC)
	(envelope-from tom@tomjudge.com)
Received: (qmail 22558 invoked from network); 25 Aug 2007 10:04:45 -0000
Received: from unknown (HELO ?192.168.1.2?)
	(thomasjudge@btinternet.com@86.140.28.215 with plain)
	by smtp809.mail.ird.yahoo.com with SMTP; 25 Aug 2007 10:04:44 -0000
X-YMail-OSG: RWhL4G8VM1nwNqoNF1z5kWLXsJ42ww5FCt_jx17PIXpfnKll
Message-ID: <46D00CE1.9@tomjudge.com>
Date: Sat, 25 Aug 2007 12:05:05 +0100
From: Tom Judge <tom@tomjudge.com>
User-Agent: Thunderbird 1.5.0.12 (X11/20070604)
MIME-Version: 1.0
To: Tom Samplonius <tom@samplonius.org>
References: <9812134.411188026402612.JavaMail.root@ly.sdf.com>
In-Reply-To: <9812134.411188026402612.JavaMail.root@ly.sdf.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Artem Kuchin <matrix@itlegion.ru>,
	freebsd-stable <freebsd-stable@freebsd.org>
Subject: Re: A little story of failed raid5 (3ware 8000 series)
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 25 Aug 2007 10:04:47 -0000

Tom Samplonius wrote:
> ----- "Artem Kuchin" <matrix@itlegion.ru> wrote: ...
>> But I don't understand how and why it happened. Only 6 hours ago (the
>>  night before) all those files were backed up fine w/o any read
>> error. And now, right after replacing the drive and starting the
>> rebuild, it said that there are bad sectors all over those files. How
>> come?
> 
> What happened to you was an extremely common occurrence.  You had a
> disk develop a media failure some time ago, but the controller never
> detected it, because that particular bad area was not read.  Your
> backups worked because they never touched this portion of the disk
> (e.g. empty space, metadata, etc).  And then another drive developed
> an electronics failure, which is instantly detected, putting the array
> into a degraded mode.  When you did a rebuild onto a replacement drive,
> the controller discovered that there was a second failed disk, and
> this is unrecoverable.

3ware controllers can recover from this situation: all you need to do is
tell the controller not to verify the source data.  This is a little
dangerous, but it has saved me in the past when 1 drive died in a RAID
10 array and 2 of the 3 remaining drives had surface defects.  The trick
was to replace the drives one at a time and rebuild without data
verification.  After 10 painful hours the array was rebuilt without any
noticeable data corruption.


> 
> RAID, of any level, isn't magic.  It is important to understand how
> it works, and realize that drives can fail passively.  BTW, if you were
> using RAID1 or RAID10, you would likely have had the same problem
> (well, RAID10 can survive _some_ double-disk failures).  RAID6 is the
> only RAID level that can survive failure of any two disks.

This is not entirely true.  RAID 1 can survive multiple disk failures: it
has the storage capacity of 1 spindle and can tolerate the failure of N-1
spindles, where N is the number of spindles in the mirror set.  The same
is partly true of RAID 10: the more spindles in each mirror set, the
better your chance of surviving multiple failures in the array (say,
6 disks arranged as two 3-disk mirror sets striped together).
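
To put rough numbers on that, here is a minimal Python sketch.  It is
only an illustration: the function names and disk numbering are made up,
and it assumes every disk is equally likely to fail and that failures
are independent, which real drives in one chassis often are not.  It
enumerates every way k disks can fail and counts the cases where each
mirror set still has at least one working disk.

```python
from itertools import combinations


def raid10_survives(mirror_sets, failed):
    """Return True if every mirror set keeps at least one working disk."""
    failed = set(failed)
    return all(any(disk not in failed for disk in ms) for ms in mirror_sets)


def survival_fraction(mirror_sets, k):
    """Fraction of all k-disk failure combinations the array survives.

    Assumption (not from any controller documentation): each combination
    of k failed disks is equally likely.
    """
    disks = [disk for ms in mirror_sets for disk in ms]
    combos = list(combinations(disks, k))
    survived = sum(raid10_survives(mirror_sets, c) for c in combos)
    return survived / len(combos)


# Hypothetical layouts, both built from 6 disks numbered 0..5.
three_way = [(0, 1, 2), (3, 4, 5)]    # two 3-disk mirror sets, striped
two_way = [(0, 1), (2, 3), (4, 5)]    # three 2-disk mirror sets, striped

if __name__ == "__main__":
    for k in (2, 3):
        print(k, survival_fraction(three_way, k), survival_fraction(two_way, k))
```

Under those assumptions the two 3-disk mirror sets survive every
double failure, while three 2-disk mirror sets survive 12 of the 15
possible double failures.  The price is capacity: 2 spindles' worth
instead of 3.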

> 
> The real solution is RAID scrubbing:  a low level background process
> that reads every sector of every disk.  All of the real RAID systems
> do this (usually scheduled weekly, or every other week).  Most 3ware
> RAID cards don't have this feature.
> 
> So rather than not using RAID5 or RAID6 again, you should just not
> use 3ware anymore.

If you use the 3dm2 management interface you can schedule verify and
rebuild tasks to run on a regular basis.  I think that 7500 series
controllers can do this; 9500s and 9550s definitely can.

We have 50+ systems using 3ware cards (7500-9550, 4 and 8
channel models) with 200+ spindles in use (no hot spares, unfortunately),
and drives in that pool fail on average about once a month. We have
only ever had trouble recovering from failed drives on 7500 series
controllers that have been in production for a reasonably long time.

I don't think that you are justified in your slagging off of 3ware 
controllers.

Tom