Date: Mon, 18 May 2009 13:49:44 +0300
From: Achilleas Mantzios <achill@matrix.gatewaynet.com>
To: Manolis Kiagias <sonicy@otenet.gr>
Cc: freebsd-questions@freebsd.org
Subject: Re: Weird problem with gmirror - cannot add the Good disk when previously failed SATA disk is online
Message-ID: <200905181349.44924.achill@matrix.gatewaynet.com>
In-Reply-To: <4A11382E.6040909@otenet.gr>
References: <200905181200.28732.achill@matrix.gatewaynet.com> <4A11382E.6040909@otenet.gr>

Hey Manoli! Glad to see you again.

On Monday 18 May 2009 13:27:58 Manolis Kiagias wrote:
> Achilleas Mantzios wrote:
> > Hello,
> > sorry in advance for the cross-posting; freebsd-geom didn't seem that populated.
> > I run 7.1-PRERELEASE on a home server.
> > This morning, after a power failure, the rebuild of my root mirror gm0 failed on disk ad4.
> > The messages were:
> >
> > May 18 08:02:02 panix kernel: ad4: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=268091264
> > May 18 08:02:08 panix kernel: drm0: <Intel i865G GMCH> on vgapci0
> > May 18 08:02:08 panix kernel: info: [drm] AGP at 0xf0000000 128MB
> > May 18 08:02:08 panix kernel: info: [drm] Initialized i915 1.5.0 20060119
> > May 18 08:02:08 panix kernel: drm0: [ITHREAD]
> > May 18 08:02:08 panix kernel: ad4: FAILURE - device detached
> > May 18 08:02:08 panix kernel: subdisk4: detached
> > May 18 08:02:08 panix kernel: ad4: detached
> > May 18 08:02:08 panix kernel: GEOM_MIRROR: Device gm0: provider ad4 disconnected.
> > May 18 08:02:08 panix kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad4 stopped.
> >
>
> It looks to me you got a bad disk now.
>

I certainly hope so, since there is nothing else I can do.

> > I read http://www.eztiger.org/2008/08/removing-and-re-adding-a-disk-in-gmirror/
> > hoping that the rebuild failure was temporary,
> > and so I tried to just run
> > # gmirror forget gm0
> > # gmirror insert gm0 ad4
> >
> > But the system responded (if I remember correctly)
> > Unknown provider ad4.
> > The system could no longer see ad4 as online.
> >
> > So I rebooted the system many times and had these results:
> > - With ad4 offline (physically disconnected), the system booted OK.
> > - With both disks online, the system responded consistently with:
> >   "GEOM_MIRROR: Cannot add disk ad6 to gm0 (error=22)."
> >   Which IMO is not OK, since gm0 should add ad6 without problem,
> >   no matter whether ad4 is online or not.
> > - With only ad4 online, it simply cannot find gm0 at all (kind of reasonable).
> >
> > So my only option is to have only ad6 online, with a current gmirror status:
> > panix# gmirror status
> >       Name    Status  Components
> > mirror/gm0  COMPLETE  ad6
> >
> > Does anyone have an idea of how I should proceed (besides buying a UPS unit!)?
> > Is it meaningful to go for a new disk to replace the current ad4?
> >
>
> I'd recommend attaching the bad disk on its own to a system and perform
> tests on it. Is the BIOS recognizing this properly? I would run hardware

Yes, the BIOS recognizes it OK, I suppose.

> tests on it - either manufacturer ones, or stuff like
> sysutils/smartmontools. You could also try installing FreeBSD on it and
> see if it works. And probably use dd to clean all the contents, esp.
> the partition table and the last sector where geom information is stored.
>

Thanks. Lacking time, I think I will try a brand new identical disk.

> > Why is the presence of the supposed bad disk ad4 affecting gm0,
> > when I have already told gm0 to forget about ad4?
> >
>
> The bad disk may be sending confusing signals to the bus / IDE
> interface. I've had this once (although it was due to a bad cable). The
> entire mirror would disappear suddenly.
>

-- 
Achilleas Mantzios
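
The dd suggestion quoted above (clearing the partition table and the last sector, where the geom information is stored) can be sketched as a dry run. This is a minimal sketch under stated assumptions, not a tested procedure: the function names last_sector and wipe_plan and the size values are hypothetical, and the script only prints the dd commands instead of running them. The real media size and sector size would have to be read from `diskinfo -v` for the disk in question. gmirror(8) also has a `clear` subcommand that erases mirror metadata from a provider, which may be simpler than using dd by hand.

```shell
#!/bin/sh
# Dry-run sketch: prints the dd commands instead of executing them.
# The disk name and sizes passed below are hypothetical placeholders;
# take the real values from `diskinfo -v` before wiping anything.

# Index of the last sector, given media size and sector size in bytes.
last_sector() {
    echo $(( $1 / $2 - 1 ))
}

# Print the commands that would zero the first and last sector of a disk.
wipe_plan() {
    disk=$1
    mediasize=$2
    sectorsize=$3
    last=$(last_sector "$mediasize" "$sectorsize")
    # First sector: holds the partition table.
    echo "dd if=/dev/zero of=/dev/$disk bs=$sectorsize count=1"
    # Last sector: where the geom metadata is stored.
    echo "dd if=/dev/zero of=/dev/$disk bs=$sectorsize seek=$last count=1"
}

# Hypothetical example: a 1 MiB device with 512-byte sectors.
wipe_plan ad4 1048576 512
```

Printing the commands first makes it possible to double-check the device name and the seek offset before anything destructive is run against a disk.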