From: Zaphod Beeblebrox <zbeeble@gmail.com>
To: Ronald Klop
Cc: freebsd-fs@freebsd.org
Date: Wed, 31 Oct 2012 13:58:29 -0400
Subject: Re: ZFS RaidZ-2 problems

I'd start off by saying "smart is your friend." Install smartmontools and study the somewhat opaque "smartctl -a /dev/mydisk" output carefully (example invocations at the end of this message). Try running a short and/or long self-test, too. Many times the disk can tell you what the problem is. If too many blocks are being reallocated, your drive is dying. If the drive sees errors in the commands it receives, the cable or the controller is at fault. ZFS itself does _exceptionally_ well at trying to use what it has.

I'll also say that bad power supplies make for bad disks. Replacing a power supply has often been the solution to bad-disk problems I've had. Disks are sensitive to under-voltage, and brown-outs can exacerbate the problem. My parents live out where power is very flaky. Cheap UPSs didn't help much ... but a good power supply can make all the difference.

But I've had bad controllers of late, too. My most recent problem had my 9-disk raidZ1 array lose a disk. Smartctl said it was losing blocks fast, so I RMA'd the disk. When the new disk came, the array just wouldn't heal ... it kept losing the disks attached to a certain controller. Now, it's possible the controller was bad before the disk died ... or that it died during the first attempt at resilvering ... or that the FreeBSD drivers don't like it anymore ... I don't know. My solution was to get two more 4-drive "pro box" SATA enclosures (I already had two). They use a 1-to-4 SATA breakout, and the 6 motherboard ports I have are a revision of the Intel ICH11 chipset that supports SATA port multipliers.
In this manner I could remove the defective controller and put all the disks onto the motherboard ICH11 (this also allowed me to expand the array later ... but that's not part of this story). The upshot was that I now had all the disks present for the raidZ array, but tons of errors had occurred while there were not enough disks. zpool status -v listed hundreds of thousands of files and directories that were "bad" or lost. But I'd seen this before, so I started a scrub (exact commands at the end of this message). The result of the scrub was: perfect recovery. Actually, it took a second scrub ... I don't know why. The pool looked happy after the first scrub, but then some checksum errors were found and fixed, so I scrubbed again ... and that cleared everything.

How does it do it? Unlike other RAID systems, ZFS can tell a bad block from a good one, because every block is checksummed. When it is asked to recover after really bad multiple failures, it can test whether each block it reconstructs is actually good. This means it can choose among alternate or partially recovered versions and keep the right one. With any other RAID technology, my experience above would certainly have meant a dead array ... or an array with heavy losses.

What does this mean? Well ... one thing it means is that for non-essential systems (say, my home media array), using cheap technology is less risky than you'd think. None of this is enterprise-level hardware, but none of it costs anywhere near enterprise-level prices, either.
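P.S. Since I pointed people at smartmontools above, here are roughly the invocations I mean. The device name /dev/ada0 is just an example ... substitute your own disk:

    # install sysutils/smartmontools from ports first

    # dump everything the drive knows about itself; look especially at
    # Reallocated_Sector_Ct, Current_Pending_Sector, and the error log
    smartctl -a /dev/ada0

    # kick off the drive's own self-tests; the long test reads the
    # whole surface and can take hours
    smartctl -t short /dev/ada0
    smartctl -t long /dev/ada0

    # check the self-test results afterwards
    smartctl -l selftest /dev/ada0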
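And the zpool side of the story, with a hypothetical pool name "tank":

    # show per-device error counts and the list of damaged files
    zpool status -v tank

    # walk every block in the pool, verify its checksum, and repair
    # anything for which a good copy or reconstruction exists
    zpool scrub tank

    # watch progress; repaired errors show up in the counters
    zpool status tank

    # once things look sane, zero the error counters
    zpool clear tank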
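For those wondering about the "tell a bad block from a good one" claim: every ZFS block pointer records a checksum of the block it points to (fletcher4 by default, sha256 optionally), so when ZFS reconstructs data several different ways, it can test each candidate against the recorded checksum. Here's a toy sketch of the principle using sha256(1) and made-up file names ... an illustration only, not ZFS internals:

    # checksum recorded in the parent block pointer (stand-in file)
    recorded=$(cat recorded-checksum.txt)

    # candidate reconstructions, e.g. "data disks only" vs. "rebuild
    # from parity" -- only a correct reconstruction can match
    for candidate in candidate-a.bin candidate-b.bin; do
        if [ "$(sha256 -q "$candidate")" = "$recorded" ]; then
            echo "good reconstruction: $candidate"
        fi
    done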