Date:      Thu, 1 Feb 2007 23:06:54 +0100
From:      "Simon L. Nielsen" <simon@FreeBSD.org>
To:        Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc:        freebsd-geom@FreeBSD.ORG, Oliver Fromme <olli@lurza.secnetix.de>, sos@FreeBSD.org
Subject:   Re: gmirror or ata problem
Message-ID:  <20070201220653.GA974@zaphod.nitro.dk>
In-Reply-To: <20070131220004.GC487@garage.freebsd.pl>
References:  <200701300851.l0U8pEkO005250@lurza.secnetix.de> <20070131201201.GB973@zaphod.nitro.dk> <20070131220004.GC487@garage.freebsd.pl>

On 2007.01.31 23:00:04 +0100, Pawel Jakub Dawidek wrote:
> On Wed, Jan 31, 2007 at 09:12:02PM +0100, Simon L. Nielsen wrote:
> > On 2007.01.30 09:51:14 +0100, Oliver Fromme wrote:
> > 
[...]
> > > Jan 29 19:10:13 pluto -- MARK --
> > > Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
> > > Jan 29 19:20:26 pluto kernel: subdisk1: detached
> > > Jan 29 19:20:26 pluto kernel: ad1: detached
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
> > > Jan 29 19:50:13 pluto -- MARK --
> > 
> > I have seen similar problems on my graid3.  I think it's simply the
> > disk which stops responding to commands, or at least ata(4) can't talk
> > to the disk anymore...
> > 
> > I see it on:
> > 
> > ad10: 305245MB <WDC WD3200SD-01KNB0 08.05J08> at ata5-master SATA150
> > ad12: 305245MB <WDC WD3200SD-01KNB0 08.05J08> at ata6-master SATA150
> > ad14: 305245MB <WDC WD3200YS-01PGB0 21.00M21> at ata7-master SATA150
> > 
> > After a reboot everything seems fine again and my RAID is rebuilt.
> > 
> > I don't know why it happens, but it sucks :-/.  I'm running 7-CURRENT
> > BTW.
> 
> It seems that when gmirror/graid3 writes to more than one disk at a
> time, this puts too much load on ata channel or something and ata
> disconnects the disk. I don't really know how it works exactly, but
> maybe some timeout should be increased in the ata code?

I mainly see the problems under high I/O load; they show up far more
often while fsck or a RAID rebuild is running.  I will try playing with
the timeout values this weekend to see whether I can provoke the problem.
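
To line the detach events up against periods of heavy I/O, something
like the following could pull them out of the syslog.  This is only a
rough sketch: the two patterns are copied from the log excerpt quoted
above, and the function name find_detach_events and the labels
"ata-detach" / "geom-disconnect" are made up for illustration.  Other
GEOM classes or ata(4) messages may be worded differently.

```python
import re

# Common syslog prefix: "Jan 29 19:20:26 pluto kernel: "
PREFIX = r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"

# Both message shapes are taken from the quoted log excerpt.
ATA_DETACH_RE = re.compile(PREFIX + r"(?P<disk>ad\d+): FAILURE - device detached")
GEOM_DISCONNECT_RE = re.compile(
    PREFIX + r"GEOM_MIRROR: Device \S+: provider (?P<disk>\S+) disconnected"
)


def find_detach_events(lines):
    """Return (timestamp, disk, kind) tuples for detach-related log lines.

    Timestamps stay as strings because syslog lines carry no year.
    Leading "> " mail quoting is stripped so quoted logs can be fed in too.
    """
    events = []
    for raw in lines:
        line = raw.lstrip("> ").rstrip()
        m = ATA_DETACH_RE.match(line)
        if m:
            events.append((m.group("ts"), m.group("disk"), "ata-detach"))
            continue
        m = GEOM_DISCONNECT_RE.match(line)
        if m:
            events.append((m.group("ts"), m.group("disk"), "geom-disconnect"))
    return events


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python detach_events.py < /var/log/messages
    for ts, disk, kind in find_detach_events(sys.stdin):
        print(ts, disk, kind)
```

The timestamps can then be compared by hand with when fsck or a
rebuild was running.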

Just for the record, I don't use ataidle or anything similar to spin my
disks down; they should be running all the time.

-- 
Simon L. Nielsen