From owner-freebsd-stable@FreeBSD.ORG  Wed Oct 22 15:51:47 2003
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 18BDA16A4B3
	for <stable@freebsd.org>; Wed, 22 Oct 2003 15:51:47 -0700 (PDT)
Received: from pit.databus.com (p70-227.acedsl.com [66.114.70.227])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 04EA643F3F
	for <stable@freebsd.org>; Wed, 22 Oct 2003 15:51:46 -0700 (PDT)
	(envelope-from barney@pit.databus.com)
Received: from pit.databus.com (localhost [127.0.0.1])
	by pit.databus.com (8.12.9p2/8.12.9) with ESMTP id h9MMpjYL096691;
	Wed, 22 Oct 2003 18:51:45 -0400 (EDT)
	(envelope-from barney@pit.databus.com)
Received: (from barney@localhost)
	by pit.databus.com (8.12.9p2/8.12.9/Submit) id h9MMpj9w096690;
	Wed, 22 Oct 2003 18:51:45 -0400 (EDT)
	(envelope-from barney)
Date: Wed, 22 Oct 2003 18:51:45 -0400
From: Barney Wolff <barney@databus.com>
To: Doug White <dwhite@gumbysoft.com>
Message-ID: <20031022225145.GA96546@pit.databus.com>
References: <20031018020939.GA24917@pit.databus.com>
	<20031018161424.X35407@carver.gumbysoft.com>
	<20031019020310.GA40618@pit.databus.com>
	<20031022152157.J71676@carver.gumbysoft.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20031022152157.J71676@carver.gumbysoft.com>
User-Agent: Mutt/1.4.1i
X-Scanned-By: MIMEDefang 2.37
cc: stable@freebsd.org
Subject: Re: unintended ATARAIDDELETE
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Production branch of FreeBSD source code
	<freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 22 Oct 2003 22:51:47 -0000

On Wed, Oct 22, 2003 at 03:22:58PM -0700, Doug White wrote:
> On Sat, 18 Oct 2003, Barney Wolff wrote:
> 
> > > This usually means your disk is bad, which is why it keeps trashing the
> > > array.  Your system is trying to tell you something :-)
> >
> > Well of course the bad block is h/w.  But deleting a raid0 on a hard
> > error is insane.  I can more-or-less understand for raid1 why that
> > might be thought sensible, but a split raid0 is of no use for anything.
> > Nor could I find anywhere in the kernel that actually deletes the raid.
> > But for sure -stable from 9/24 behaved differently (ie, sanely) on
> > getting the error than -stable from 10/13 or so.  I don't think that's
> > hardware.  Time will tell, perhaps.
> 
> Since one part of the RAID is defective, and raid0 is not redundant, the
> RAID is marked offline.  It is working as designed.

I beg to differ.  The RAID was not (or not only) marked offline; it was
broken apart, and had to be reconfigured manually on reboot.  Could you
point me to where in the code the marking is done?

> If you need to tolerate disk failure, you probably want raid1, a mirrored
> configuration.

In the case at hand, the drive eventually remapped the defective block,
and has given no trouble since.  (That includes reading every sector of
the raid0 with dd.)

I would have been less upset if the system had panicked and rebooted.
It's the altering of a BIOS setting that I don't understand.  If that's
by intentional design, what benefit is it intended to provide?

Thanks for your help in understanding this.
Barney

-- 
Barney Wolff         http://www.databus.com/bwresume.pdf
I'm available by contract or FT, in the NYC metro area or via the 'Net.