From: Paul Mather
To: freebsd-geom@freebsd.org
Date: Sat, 23 Apr 2005 22:13:21 -0400
Message-Id: <1114308801.71938.2.camel@zappa.Chelsea-Ct.Org>
Subject: Is there a "disconnected" state for geom_mirror providers?

Sadly, the "TIMEOUT - WRITE_DMA"-induced disk disconnections have returned on my -CURRENT system since I upgraded to ATA Mk.III. :-(

However, I've noticed that when a drive is marked as failed and the device detached, the provider also disappears from the geom_mirror it is part of, instead of being marked as a "stale" or "disconnected" or "missing" component of the remaining mirror components. Is this the correct behaviour?

In the latest failure to occur, ad0 was detached:

    ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=49981679
    ad0: FAILURE - device detached
    subdisk0: detached
    ad0: detached
    GEOM_MIRROR: Cannot update metadata on disk ad0 (error=5).
    GEOM_MIRROR: Cannot update metadata on disk ad0 (error=6).
    GEOM_MIRROR: Device raid1: provider ad0 disconnected.
    GEOM_MIRROR: Request failed (error=6). ad0[WRITE(offset=3847741440, length=16384)]

I performed an "atacontrol detach 0" followed by an "atacontrol attach 0" to "re-discover" the "failed" ad0 as part of the existing geom_mirror. This yielded the following:

    acd0: detached
    (cd0:ata0:0:1:0): lost device
    (cd0:ata0:0:1:0): removing device entry
    atapicam0: detached
    stray irq14
    ad0: 24405MB at ata0-master UDMA33
    GEOM_MIRROR: Component ad0 (device raid1) broken, skipping.
    GEOM_MIRROR: Cannot add disk ad0 to raid1 (error=22).
    acd0: DVDR at ata0-slave UDMA33
    cd0 at ata0 bus 0 target 1 lun 0
    cd0: Removable CD-ROM SCSI-0 device
    cd0: 33.000MB/s transfers
    cd0: cd present [1 x 2048 byte records]

The provider ad0 did not show up as a "stale" provider of my "raid1" mirror (from which it had disappeared when it was detached due to the "TIMEOUT - WRITE_DMA" failure).
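
For readers following the log output above, the "error=N" values in the GEOM_MIRROR messages can be read as ordinary errno codes. That is an assumption on my part rather than something the messages state. The sketch below pulls the numbers out of the quoted lines and looks up their symbolic names. The helper name decode_errors is made up for illustration, and the lookup uses Python's errno table, which agrees with FreeBSD for these three values.

```python
import errno
import re

# GEOM_MIRROR messages copied from the kernel output quoted above.
LOG = """\
GEOM_MIRROR: Cannot update metadata on disk ad0 (error=5).
GEOM_MIRROR: Cannot update metadata on disk ad0 (error=6).
GEOM_MIRROR: Request failed (error=6). ad0[WRITE(offset=3847741440, length=16384)]
GEOM_MIRROR: Cannot add disk ad0 to raid1 (error=22).
"""


def decode_errors(log):
    """Return (number, symbolic name) for every error=N found in the log.

    Hypothetical helper: assumes the numbers are plain errno values.
    """
    return [
        (int(n), errno.errorcode.get(int(n), "unknown"))
        for n in re.findall(r"error=(\d+)", log)
    ]


for number, name in decode_errors(LOG):
    print(f"error={number}: {name}")
```

Read this way, the failure goes from EIO (I/O error) to ENXIO (the device is no longer configured) as ad0 detaches, and the later attempt to re-add the disk is rejected with EINVAL. That is consistent with the "broken, skipping" message, though it does not by itself say why the component was rejected.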
I had to do a "gmirror forget raid1" before a "gmirror insert raid1 ad0" would allow me to re-insert it so I could perform a "gmirror rebuild raid1 ad0" to kick off synchronisation.

What is the definition of a "broken" component? What is the difference between a "stale" and a "broken" component? If I were to detach and remove a hot-plug geom_mirror component and subsequently re-attach it, would the component be considered "stale" or "broken"?

This is not a major inconvenience (well, the return of the "TIMEOUT - WRITE_DMA" errors is :), but I was just wondering why my failed providers disappear now as opposed to being marked as stale, as happened in the past.

BTW, my system is a fairly recent -CURRENT:

    FreeBSD 6.0-CURRENT #0: Mon Apr 18 12:25:24 EDT 2005

Cheers,

Paul.
-- 
e-mail: paul@gromit.dlib.vt.edu

"Without music to decorate it, time is just a bunch of boring production deadlines or dates by which bills must be paid."
        --- Frank Vincent Zappa