From owner-freebsd-geom@FreeBSD.ORG Wed Feb 16 17:55:41 2005
From: Paul Mather
To: freebsd-geom@freebsd.org
Date: Wed, 16 Feb 2005 12:55:32 -0500
Message-Id: <1108576532.887.19.camel@zappa.Chelsea-Ct.Org>
Subject: geom_mirror was stale, now broken
List-Id: GEOM-specific discussions and implementations

Because of the
annoying ATA regression that crept into 5.3, I semi-regularly get
"TIMEOUT - WRITE_DMA" errors that ultimately cause a drive to be removed
from my geom_mirror configuration. :-(

Previously, the drive suffering the WRITE_DMA problem would be marked as
a "stale" provider during boot. Recently, this appears to have changed,
and the provider is listed as "broken." E.g.:

Feb 16 05:21:45 zappa kernel: ad2: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=49981679
Feb 16 05:21:50 zappa kernel: ad2: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=49981679
Feb 16 05:21:50 zappa kernel: ad2: FAILURE - WRITE_DMA timed out
Feb 16 05:21:50 zappa kernel: GEOM_MIRROR: Cannot update metadata on disk ad2 (error=5).
Feb 16 05:21:50 zappa kernel: GEOM_MIRROR: Device raid1: provider ad2 disconnected.
[[...]]
Feb 16 11:48:37 zappa kernel: FreeBSD 6.0-CURRENT #0: Fri Feb 11 09:03:49 EST 2005
[[...]]
Feb 16 11:48:37 zappa kernel: GEOM_MIRROR: Device raid1 created (id=723259611).
Feb 16 11:48:37 zappa kernel: GEOM_MIRROR: Device raid1: provider ad0 detected.
Feb 16 11:48:37 zappa kernel: GEOM_MIRROR: Device raid1: provider ad2 detected.
Feb 16 11:48:37 zappa kernel: GEOM_MIRROR: Component ad2 (device raid1) broken, skipping.
Feb 16 11:48:37 zappa kernel: GEOM_MIRROR: Device raid1: provider ad0 activated.
Feb 16 11:48:37 zappa kernel: GEOM_MIRROR: Device raid1: provider mirror/raid1 launched.

One artifact of the mirror provider being marked as "broken" is that I
can no longer simply rebuild onto it. Now, I have to "gmirror forget"
and then "gmirror insert" the "broken" provider back into the mirror and
then rebuild onto it.

Under what circumstances is a mirror provider considered "broken" as
opposed to "stale?"

BTW, here is the current status of my geom_mirror (it is currently
rebuilding):

Geom name: raid1
State: DEGRADED
Components: 2
Balance: split
Slice: 4096
Flags: NOAUTOSYNC
GenID: 4
SyncID: 12
ID: 723259611
Providers:
1.
Name: mirror/raid1
Mediasize: 25590619648 (24G)
Sectorsize: 512
Mode: r6w5e5
Consumers:
1. Name: ad0
Mediasize: 25590620160 (24G)
Sectorsize: 512
Mode: r1w1e1
State: ACTIVE
Priority: 0
Flags: DIRTY
GenID: 4
SyncID: 12
ID: 1971505175
2. Name: ad2
Mediasize: 25590620160 (24G)
Sectorsize: 512
Mode: r1w1e1
State: SYNCHRONIZING
Priority: 1
Flags: DIRTY, SYNCHRONIZING, FORCE_SYNC
GenID: 4
SyncID: 12
Synchronized: 89%
ID: 3025777059

Geom name: raid1.sync
Consumers:
1. Name: mirror/raid1
Mediasize: 25590619648 (24G)
Sectorsize: 512
Mode: r1w0e0

I notice the "GenID" of the providers increases at every breakage. What
is the GenID? (It seems like a relatively recent addition.)

I've also noticed the geom_mirror appears less resilient to drive
"failures" nowadays than it did before. I had to do a hard reset this
morning because my system "froze"---apparently unable to do any disk
I/O. :-( Alas, I can't get any crashdumps (I'm using GBDE-encrypted
swap on my geom_mirror), and can't set up a serial console at this time
because the system has only one serial port and it's being used for my
olde Apple LaserWriter II right now. :-)

BTW, I run X. Is it possible to break to the debugger from the regular
console should the system freeze again like it did this morning? If so,
how do I do that?

Cheers,

Paul.
--
e-mail: paul@gromit.dlib.vt.edu

"Without music to decorate it, time is just a bunch of boring
production deadlines or dates by which bills must be paid."
--- Frank Vincent Zappa

From owner-freebsd-geom@FreeBSD.ORG Fri Feb 18 22:29:53 2005
Date: Fri, 18 Feb 2005 17:29:52 -0500
From: James Snow
To: freebsd-geom@freebsd.org
Message-ID: <20050218222952.GA860@teardrop.org>
References: <16901.26814.588055.457273@satchel.alerce.com>
 <16902.27236.71619.138367@satchel.alerce.com>
 <20050206191209.GC1080@darkness.comp.waw.pl>
 <16902.28195.6589.299894@satchel.alerce.com>
 <20050211133917.GA45990@engelschall.com>
In-Reply-To: <20050211133917.GA45990@engelschall.com>
Subject: Re: Hardcoding gmirror provider [was Re: Problem with migrating...]
List-Id: GEOM-specific discussions and implementations

On Fri, Feb 11, 2005 at 02:39:17PM +0100, Ralf S. Engelschall wrote:
>
> I've added a comment to the slice creation command that one just
> subtract one block or alternatively use the -h option on the "gmirror
> label" command for hard-coding the provider. Thanks for catching this
> subtle problem.

I've tried several iterations of the single-slice method, but I'm still
unable to boot.
The loader complains that it can't find the kernel and lsdev reports:

disk1s1: FFS bad disklabel
disk2s1: FFS bad disklabel

I'm a little puzzled. I got this working on another machine by
hardcoding the provider labels in gmirror and not decrementing the size
of the slice in fdisk. Not sure why it's giving me so much trouble going
this route.

Any thoughts on what I'm doing wrong?

-Snow
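
The "subtract one block" advice quoted above, and the 512-byte gap between
the consumer and mirror sizes in the earlier status listing, both follow
from gmirror keeping its metadata in the last sector of each component.
The sketch below only works through that arithmetic. The sizes come from
the listing in this thread; the slice start of LBA 63 and the helper names
are assumptions made for illustration, not something either poster
reported.

```python
SECTOR_SIZE = 512  # Sectorsize reported in the status listing


def sector_count(mediasize_bytes, sector_size=SECTOR_SIZE):
    """Number of whole sectors in a provider of the given size."""
    return mediasize_bytes // sector_size


def last_lba(start_lba, n_sectors):
    """LBA of the final sector of an extent starting at start_lba."""
    return start_lba + n_sectors - 1


def mirror_mediasize(component_bytes, sector_size=SECTOR_SIZE):
    """Size left for the mirror when one trailing sector holds metadata."""
    return component_bytes - sector_size


AD0_BYTES = 25590620160  # consumer ad0 Mediasize, from the listing
disk_sectors = sector_count(AD0_BYTES)
disk_last = last_lba(0, disk_sectors)

# Hypothetical slice layouts: the start LBA of 63 is an assumption.
SLICE_START = 63
full_slice_last = last_lba(SLICE_START, disk_sectors - SLICE_START)
short_slice_last = last_lba(SLICE_START, disk_sectors - SLICE_START - 1)

print(mirror_mediasize(AD0_BYTES))    # 25590619648, the mirror/raid1 Mediasize
print(disk_last)                      # 49981679
print(full_slice_last == disk_last)   # True: disk and slice share a last sector
print(short_slice_last == disk_last)  # False: slice shortened by one block
```

When a slice ends on the same sector as the whole disk, the metadata in
that sector can be seen through both the disk and the slice, which is
presumably the ambiguity that shrinking the slice by one block, or
hardcoding the provider name with "gmirror label -h", is meant to avoid.
The computed last LBA, 49981679, also matches the LBA in the WRITE_DMA
timeouts logged just before "Cannot update metadata on disk ad2", which
would be consistent with the failing write being the metadata update.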