Date: Wed, 31 Jan 2007 23:00:04 +0100
From: Pawel Jakub Dawidek
To: "Simon L. Nielsen"
Cc: sos@FreeBSD.org, Oliver Fromme, freebsd-geom@FreeBSD.ORG
Subject: Re: gmirror or ata problem
Message-ID: <20070131220004.GC487@garage.freebsd.pl>
References: <200701300851.l0U8pEkO005250@lurza.secnetix.de> <20070131201201.GB973@zaphod.nitro.dk>
In-Reply-To: <20070131201201.GB973@zaphod.nitro.dk>

On Wed, Jan 31, 2007 at 09:12:02PM +0100, Simon L. Nielsen wrote:
> On 2007.01.30 09:51:14 +0100, Oliver Fromme wrote:
>
> > This is strange.  gmirror just detached one of its disks
> > for no apparent reason.  I've built a mirror consisting of
> > the components ad0 and ad1 (both SATA drives).  It has
> > been running fine.  This is RELENG_6 from 2006-12-20.
> >
> > Yesterday evening ad1 was detached.  There is no other
> > error message logged on console or in the logs (i.e. no
> > I/O error such as a bad sector or anything).  There was
> > no particularly high load at that time.  In fact, the
> > machine had been under much higher load before, without
> > anything bad happening.
> >
> > This is from the logs:
> >
> > Jan 29 19:10:13 pluto -- MARK --
> > Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
> > Jan 29 19:20:26 pluto kernel: subdisk1: detached
> > Jan 29 19:20:26 pluto kernel: ad1: detached
> > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
> > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
> > Jan 29 19:50:13 pluto -- MARK --
>
> I have seen similar problems on my graid3.  I think it's simply the
> disk which stops responding to commands, or at least ata(4) can't talk
> to the disk anymore...
>
> I see it on:
>
> ad10: 305245MB at ata5-master SATA150
> ad12: 305245MB at ata6-master SATA150
> ad14: 305245MB at ata7-master SATA150
>
> After a reboot everything seems fine again and my RAID is rebuilt.
>
> I don't know why it happens, but it sucks :-/.  I'm running 7-CURRENT
> BTW.

It seems that when gmirror/graid3 writes to more than one disk at a
time, this puts too much load on ata channel or something and ata
disconnects the disk.  I don't really know how it works exactly, but
maybe some timeout should be increased in the ata code?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!