From owner-freebsd-questions@FreeBSD.ORG  Sun Nov 11 15:57:10 2007
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EA44416A417
	for <freebsd-questions@freebsd.org>;
	Sun, 11 Nov 2007 15:57:10 +0000 (UTC)
	(envelope-from dnewman@networktest.com)
Received: from mail.networktest.com (mail.networktest.com [207.181.8.134])
	by mx1.freebsd.org (Postfix) with ESMTP id D260D13C4B5
	for <freebsd-questions@freebsd.org>;
	Sun, 11 Nov 2007 15:57:10 +0000 (UTC)
	(envelope-from dnewman@networktest.com)
Received: by mail.networktest.com (Postfix, from userid 1002)
	id 260D178C4D; Sun, 11 Nov 2007 07:56:56 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on mail.networktest.com
X-Spam-Level: 
X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00,RCVD_IN_PBL,
	RDNS_DYNAMIC autolearn=no version=3.2.3
Received: from lion.local (cpe-75-82-195-55.socal.res.rr.com [75.82.195.55])
	by mail.networktest.com (Postfix) with ESMTP id 92BEE78C55
	for <freebsd-questions@freebsd.org>;
	Sun, 11 Nov 2007 07:56:49 -0800 (PST)
Message-ID: <47372644.4060201@networktest.com>
Date: Sun, 11 Nov 2007 07:56:52 -0800
From: David Newman <dnewman@networktest.com>
Organization: Network Test Inc.
User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728)
MIME-Version: 1.0
To: freebsd-questions@freebsd.org
References: <4736593E.1090905@networktest.com>
	<64c038660711102109x2ea186afjdd219292d8eed700@mail.gmail.com>
In-Reply-To: <64c038660711102109x2ea186afjdd219292d8eed700@mail.gmail.com>
X-Enigmail-Version: 0.95.5
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: dealing with a failing drive
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 11 Nov 2007 15:57:11 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/10/07 9:09 PM, Modulok wrote:
>>> I'd welcome suggestions on how (or whether) to try to revive a SCSI
>>> drive that's failing.
> 
> It depends on how valuable the data on the array is, and more
> importantly, how much funding you have at your disposal to fix the
> problem. If it were me, I would set aside the bad disk, connect a new
> disk to the card and re-synchronize the array. (Assuming one of the
> members still retains a good copy of the data.) Afterwards I would
> destroy, or toss the existing disk in the trash can (depending on the
> sensitivity of the data stored on it.)

Thanks for your reply.

An update: After doing what you suggest (leaving in the "good" disk,
adding a new disk, RAID rebuilding) I still got soft write errors --
with *either one* of the disks I tried.

Then I tried putting both disks in an identical server and they came up
fine, no read or write errors.

Ergo, the RAID controller is bad and the disks may be OK.

>>> Is there some other way to:
>>> b) monitor the health of disks on a Compaq controller so it doesn't
>>> get to this point to begin with?
> 
> There are various tools out there that attempt to 'monitor' the
> condition of disk drives to try and predict when failure is imminent.
> For valuable data, it is safer to set up a mirror and simply toss out
> bad disks as they fail. For extremely valuable data use a 3-disk
> array. With a 3-disk setup you will still be covered in the event that
> an additional disk craps out during the re-sync.
> 
> To quote Google's article on disk failure, regarding SMART:

Right, I've heard it said that "SMART isn't."

Nonetheless, I'd appreciate any suggestions to monitor the health of
disks -- and RAID controllers too -- on HP ProLiant servers running FreeBSD.

thanks again.

dn


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (Darwin)

iD8DBQFHNyZDyPxGVjntI4IRAqk1AKCUwByNOAJZwvtD9V21TZfyaMWaxgCdFSCZ
dZjf3ynK+4OffBzsDOawF9A=
=DUqc
-----END PGP SIGNATURE-----