From owner-freebsd-geom@FreeBSD.ORG Thu Feb 1 22:07:01 2007
Date: Thu, 1 Feb 2007 23:06:54 +0100
From: "Simon L. Nielsen"
To: Pawel Jakub Dawidek
Cc: freebsd-geom@FreeBSD.ORG, Oliver Fromme, sos@FreeBSD.org
Subject: Re: gmirror or ata problem
Message-ID: <20070201220653.GA974@zaphod.nitro.dk>
References: <200701300851.l0U8pEkO005250@lurza.secnetix.de>
 <20070131201201.GB973@zaphod.nitro.dk>
 <20070131220004.GC487@garage.freebsd.pl>
In-Reply-To: <20070131220004.GC487@garage.freebsd.pl>
List-Id: GEOM-specific discussions and implementations

On 2007.01.31 23:00:04 +0100, Pawel Jakub Dawidek wrote:
> On Wed, Jan 31, 2007 at 09:12:02PM +0100, Simon L. Nielsen wrote:
> > On 2007.01.30 09:51:14 +0100, Oliver Fromme wrote:
> > [...]
> > > Jan 29 19:10:13 pluto -- MARK --
> > > Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
> > > Jan 29 19:20:26 pluto kernel: subdisk1: detached
> > > Jan 29 19:20:26 pluto kernel: ad1: detached
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
> > > Jan 29 19:50:13 pluto -- MARK --
> >
> > I have seen similar problems on my graid3. I think it's simply the
> > disk which stops responding to commands, or at least ata(4) can't talk
> > to the disk anymore...
> >
> > I see it on:
> >
> > ad10: 305245MB at ata5-master SATA150
> > ad12: 305245MB at ata6-master SATA150
> > ad14: 305245MB at ata7-master SATA150
> >
> > After a reboot everything seems fine again and my RAID is rebuilt.
> >
> > I don't know why it happens, but it sucks :-/. I'm running 7-CURRENT
> > BTW.
>
> It seems that when gmirror/graid3 writes to more than one disk at a
> time, this puts too much load on ata channel or something and ata
> disconnects the disk. I don't really know how it works exactly, but
> maybe some timeout should be increased in the ata code?

I mainly see problems under high I/O load: they occur far more often
while fsck or a RAID rebuild is running. I will try to play with the
timeout values this weekend and see if I can provoke the problem.

Just for the record, I don't use ataidle or similar to spin my disks
down; they should run all the time.

-- 
Simon L. Nielsen