From owner-freebsd-stable@FreeBSD.ORG Wed Oct 22 15:51:47 2003
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 18BDA16A4B3
	for ; Wed, 22 Oct 2003 15:51:47 -0700 (PDT)
Received: from pit.databus.com (p70-227.acedsl.com [66.114.70.227])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 04EA643F3F
	for ; Wed, 22 Oct 2003 15:51:46 -0700 (PDT)
	(envelope-from barney@pit.databus.com)
Received: from pit.databus.com (localhost [127.0.0.1])
	by pit.databus.com (8.12.9p2/8.12.9) with ESMTP id h9MMpjYL096691;
	Wed, 22 Oct 2003 18:51:45 -0400 (EDT)
	(envelope-from barney@pit.databus.com)
Received: (from barney@localhost)
	by pit.databus.com (8.12.9p2/8.12.9/Submit) id h9MMpj9w096690;
	Wed, 22 Oct 2003 18:51:45 -0400 (EDT)
	(envelope-from barney)
Date: Wed, 22 Oct 2003 18:51:45 -0400
From: Barney Wolff
To: Doug White
Message-ID: <20031022225145.GA96546@pit.databus.com>
References: <20031018020939.GA24917@pit.databus.com>
	<20031018161424.X35407@carver.gumbysoft.com>
	<20031019020310.GA40618@pit.databus.com>
	<20031022152157.J71676@carver.gumbysoft.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20031022152157.J71676@carver.gumbysoft.com>
User-Agent: Mutt/1.4.1i
X-Scanned-By: MIMEDefang 2.37
cc: stable@freebsd.org
Subject: Re: unintended ATARAIDDELETE
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Wed, 22 Oct 2003 22:51:47 -0000

On Wed, Oct 22, 2003 at 03:22:58PM -0700, Doug White wrote:
> On Sat, 18 Oct 2003, Barney Wolff wrote:
>
> > > This usually means your disk is bad, which is why it keeps trashing the
> > > array. Your system is trying to tell you something :-)
> >
> > Well of course the bad block is h/w.
> > But deleting a raid0 on a hard
> > error is insane. I can more-or-less understand for raid1 why that
> > might be thought sensible, but a split raid0 is of no use for anything.
> > Nor could I find anywhere in the kernel that actually deletes the raid.
> > But for sure -stable from 9/24 behaved differently (ie, sanely) on
> > getting the error than -stable from 10/13 or so. I don't think that's
> > hardware. Time will tell, perhaps.
>
> Since one part of the RAID is defective, and raid0 is not redundant, the
> RAID is marked offline. It is working as designed.

I beg to differ. The RAID was not (or not only) marked offline, it was
broken apart, and had to be reconfigured manually on reboot. Could you
point me to where the marking is done?

> If you need to tolerate disk failure, you probably want raid1, a mirrored
> configuration.

In the case at hand, the drive eventually remapped the defective block,
and has given no trouble since. (That includes reading every sector of
the raid0 with dd.) I would have been less upset if the system had
panic'd and rebooted. It's the altering of a bios setting that I don't
understand. If that's by intentional design, what benefit is it intended
to provide?

Thanks for your help in understanding this.
Barney

-- 
Barney Wolff         http://www.databus.com/bwresume.pdf
I'm available by contract or FT, in the NYC metro area or via the 'Net.