Date: Thu, 1 Feb 2007 23:06:54 +0100
From: "Simon L. Nielsen" <simon@FreeBSD.org>
To: Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc: freebsd-geom@FreeBSD.ORG, Oliver Fromme <olli@lurza.secnetix.de>, sos@FreeBSD.org
Subject: Re: gmirror or ata problem
Message-ID: <20070201220653.GA974@zaphod.nitro.dk>
In-Reply-To: <20070131220004.GC487@garage.freebsd.pl>
References: <200701300851.l0U8pEkO005250@lurza.secnetix.de> <20070131201201.GB973@zaphod.nitro.dk> <20070131220004.GC487@garage.freebsd.pl>

On 2007.01.31 23:00:04 +0100, Pawel Jakub Dawidek wrote:
> On Wed, Jan 31, 2007 at 09:12:02PM +0100, Simon L. Nielsen wrote:
> > On 2007.01.30 09:51:14 +0100, Oliver Fromme wrote:
> > [...]
> > > Jan 29 19:10:13 pluto -- MARK --
> > > Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
> > > Jan 29 19:20:26 pluto kernel: subdisk1: detached
> > > Jan 29 19:20:26 pluto kernel: ad1: detached
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
> > > Jan 29 19:50:13 pluto -- MARK --
> >
> > I have seen similar problems on my graid3. I think it's simply the
> > disk which stops responding to commands, or at least ata(4) can't talk
> > to the disk anymore...
> >
> > I see it on:
> >
> > ad10: 305245MB <WDC WD3200SD-01KNB0 08.05J08> at ata5-master SATA150
> > ad12: 305245MB <WDC WD3200SD-01KNB0 08.05J08> at ata6-master SATA150
> > ad14: 305245MB <WDC WD3200YS-01PGB0 21.00M21> at ata7-master SATA150
> >
> > After a reboot everything seems fine again and my RAID is rebuilt.
> >
> > I don't know why it happens, but it sucks :-/. I'm running 7-CURRENT
> > BTW.
>
> It seems that when gmirror/graid3 writes to more than one disk at a
> time, this puts too much load on ata channel or something and ata
> disconnects the disk. I don't really know how it works exactly, but
> maybe some timeout should be increased in the ata code?

I mainly see problems when there is high IO load, e.g. if fsck or raid
rebuild is running I far more often see problems. I will try to play
with timeout values this weekend and see if I can provoke problems.

Just for the record, I don't use ataidle or similar to spin my disks
down, they should run all the time.

-- 
Simon L. Nielsen