Date: Sat, 23 Apr 2005 22:13:21 -0400
From: Paul Mather <paul@gromit.dlib.vt.edu>
To: freebsd-geom@freebsd.org
Subject: Is there a "disconnected" state for geom_mirror providers?
Message-ID: <1114308801.71938.2.camel@zappa.Chelsea-Ct.Org>

Sadly, the "TIMEOUT - WRITE_DMA"-induced disk disconnections have returned on my -CURRENT system since I upgraded to ATA Mk.III. :-(

However, I've noticed that when a drive is marked as failed and the device is detached, the provider also disappears from the geom_mirror it is part of, instead of being marked as a "stale", "disconnected", or "missing" component of the mirror. Is this the correct behaviour?

In the latest failure to occur, ad0 was detached:

ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=49981679
ad0: FAILURE - device detached
subdisk0: detached
ad0: detached
GEOM_MIRROR: Cannot update metadata on disk ad0 (error=5).
GEOM_MIRROR: Cannot update metadata on disk ad0 (error=6).
GEOM_MIRROR: Device raid1: provider ad0 disconnected.
GEOM_MIRROR: Request failed (error=6). ad0[WRITE(offset=3847741440, length=16384)]

I performed an "atacontrol detach 0" followed by an "atacontrol attach 0" to "re-discover" the "failed" ad0 as part of the existing geom_mirror. This yielded the following:

acd0: detached
(cd0:ata0:0:1:0): lost device
(cd0:ata0:0:1:0): removing device entry
atapicam0: detached
stray irq14
ad0: 24405MB <IBM DJNA-352500 J51OA30K> at ata0-master UDMA33
GEOM_MIRROR: Component ad0 (device raid1) broken, skipping.
GEOM_MIRROR: Cannot add disk ad0 to raid1 (error=22).
acd0: DVDR <LITE-ON DVDRW SOHW-832S/VS08> at ata0-slave UDMA33
cd0 at ata0 bus 0 target 1 lun 0
cd0: <LITE-ON DVDRW SOHW-832S VS08> Removable CD-ROM SCSI-0 device
cd0: 33.000MB/s transfers
cd0: cd present [1 x 2048 byte records]

The provider ad0 did not show up as a "stale" provider of my "raid1" mirror (from which it had disappeared when it was detached due to the "TIMEOUT - WRITE_DMA" failure). I had to do a "gmirror forget raid1" before a "gmirror insert raid1 ad0" would allow me to re-insert it, so that I could then perform a "gmirror rebuild raid1 ad0" to kick off synchronisation.

What is the definition of a "broken" component? What is the difference between a "stale" and a "broken" component? If I were to detach and remove a hot-plug geom_mirror component and subsequently re-attach it, would the component be considered "stale" or "broken"?

This is not a major inconvenience (well, the return of the "TIMEOUT - WRITE_DMA" errors is :), but I was just wondering why my failed providers disappear now, as opposed to being marked as stale as happened in the past.

BTW, my system is a fairly recent -CURRENT: FreeBSD 6.0-CURRENT #0: Mon Apr 18 12:25:24 EDT 2005.

Cheers,

Paul.
-- 
e-mail: paul@gromit.dlib.vt.edu

"Without music to decorate it, time is just a bunch of boring
 production deadlines or dates by which bills must be paid."
        --- Frank Vincent Zappa
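
For reference, the recovery sequence reported in the message (forget, then insert, then rebuild) could be scripted roughly as follows. This is only a sketch of what the poster describes, not a recommended procedure: the helper names run and recover_mirror and the DRY_RUN variable are hypothetical, and only the three gmirror subcommands named in the message are used. By default it prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the manual recovery steps quoted in the message.
# DRY_RUN, run and recover_mirror are illustrative names (assumptions).

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# recover_mirror MIRROR DISK
# Repeats the poster's sequence: forget the missing component,
# re-insert the disk, then start the rebuild the poster ran by hand.
recover_mirror() {
    mirror=$1
    disk=$2
    run gmirror forget "$mirror" &&
    run gmirror insert "$mirror" "$disk" &&
    run gmirror rebuild "$mirror" "$disk"
}

# Example (dry run, using the names from the message):
#   recover_mirror raid1 ad0
```

With the default dry-run setting, "recover_mirror raid1 ad0" only echoes the three gmirror commands in order, so the sequence can be checked before anything touches the mirror.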