From: "Rick C. Petty"
Reply-To: rick-freebsd2009@kiwi-computer.com
To: Miroslav Lachman <000.fbsd@quip.cz>
Cc: freebsd-geom@freebsd.org
Date: Fri, 10 Sep 2010 17:33:29 -0500
Subject: Re: it's a race between gmirror and UFS labels and gjournal
Message-ID: <20100910223329.GB29252@rix.kiwi-computer.com>
In-Reply-To: <4C8A04EC.3090409@quip.cz>
References: <20100910031851.GA7066@rix.kiwi-computer.com> <4C8A04EC.3090409@quip.cz>
List-Id: GEOM-specific discussions and implementations

On Fri, Sep 10, 2010 at 12:14:04PM +0200, Miroslav Lachman wrote:
>
> The same problem is with gjournal on gmirror. If gmirror drops the disk,
> then gjournal is (sometimes) detected on the first "broken disk" and
> this disk cannot be re-inserted in to gmirror again. (I have this
> experience on 7.x). One must reboot in to single-user without gjournal
> module loaded and re-insert the disk in to gmirror.

I take back my earlier statement about booting into single-user. That
didn't even work, because my root filesystem is on that mirror.
I had to physically pull the cable off of one drive so that GEOM probed
correctly and grabbed the mirror. Thankfully, the controller recognized
when I plugged in the disk again (some controllers still don't, in
FreeBSD), or I wouldn't have been able to get that other disk back into
the mirror. And thankfully I had physical access to the machine. My
condolences to those of you who run up against this broken behavior with
remote machines.

The point here is that there is some brokenness with GEOM probing. The
fix that I hope to see is that gmirror stops dropping "broken" providers,
which currently frees them for other GEOM classes to taste. It should
instead hold on to those providers (still marked as broken) and let the
sysadmin decide how to handle them. In my case, I would have wanted to do
a "gmirror rebuild" after verifying that the correct disks were mapped to
the correct mirrors. With the current implementation, this just isn't
possible.

I'm hoping someone can explain to me why broken providers are released
back into GEOM for re-tasting. I'm willing to submit patches, but I
haven't figured out why disks marked as broken aren't kept as part of the
mirror in the first place.

-- 
Rick C. Petty