Date: Wed, 31 Jan 2007 23:00:04 +0100
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: "Simon L. Nielsen" <simon@FreeBSD.org>
Cc: sos@FreeBSD.org, Oliver Fromme <olli@lurza.secnetix.de>, freebsd-geom@FreeBSD.ORG
Subject: Re: gmirror or ata problem
Message-ID: <20070131220004.GC487@garage.freebsd.pl>
In-Reply-To: <20070131201201.GB973@zaphod.nitro.dk>
References: <200701300851.l0U8pEkO005250@lurza.secnetix.de> <20070131201201.GB973@zaphod.nitro.dk>

On Wed, Jan 31, 2007 at 09:12:02PM +0100, Simon L. Nielsen wrote:
> On 2007.01.30 09:51:14 +0100, Oliver Fromme wrote:
>
> > This is strange. gmirror just detached one of its disks
> > for no apparent reason. I've built a mirror consisting of
> > the components ad0 and ad1 (both SATA drives). It has
> > been running fine. This is RELENG_6 from 2006-12-20.
> >
> > Yesterday evening ad1 was detached. There is no other
> > error message logged on console or in the logs (i.e. no
> > I/O error such as a bad sector or anything). There was
> > no particularly high load at that time. In fact, the
> > machine had been under much higher load before, without
> > anything bad happening.
> >
> > This is from the logs:
> >
> > Jan 29 19:10:13 pluto -- MARK --
> > Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
> > Jan 29 19:20:26 pluto kernel: subdisk1: detached
> > Jan 29 19:20:26 pluto kernel: ad1: detached
> > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
> > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
> > Jan 29 19:50:13 pluto -- MARK --
>
> I have seen similar problems on my graid3. I think it's simply the
> disk which stops responding to commands, or at least ata(4) can't talk
> to the disk anymore...
>
> I see it on:
>
> ad10: 305245MB <WDC WD3200SD-01KNB0 08.05J08> at ata5-master SATA150
> ad12: 305245MB <WDC WD3200SD-01KNB0 08.05J08> at ata6-master SATA150
> ad14: 305245MB <WDC WD3200YS-01PGB0 21.00M21> at ata7-master SATA150
>
> After a reboot everything seems fine again and my RAID is rebuilt.
>
> I don't know why it happens, but it sucks :-/. I'm running 7-CURRENT
> BTW.

It seems that when gmirror/graid3 writes to more than one disk at a time,
this puts too much load on ata channel or something and ata disconnects
the disk. I don't really know how it works exactly, but maybe some
timeout should be increased in the ata code?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!