From: Harald Schmalzbauer
Organization: OmniLAN
Date: Wed, 05 May 2010 14:41:40 +0200
To: FreeBSD Stable
Subject: ZFS (zpool) doesn't detect failed drive
Message-ID: <4BE16784.8050400@omnilan.de>

Hello,

one drive of my mirror failed today, but 'zpool status' still shows it
as "online". Every process using a ZFS mount hangs. 'zpool offline
/dev/ad1' also hangs indefinitely.

Here's the dmesg output for the failing (and correctly detached) device:

ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ata3: port is not ready (timeout 10000ms) tfd = 00000080
ata3: hardware reset timeout
ad1: FAILURE - device detached

But:

zpool status
  pool: URUBAmirrorP1
 state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-JQ
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        URUBAmirrorP1  ONLINE       0    7K     0
          ad1          ONLINE       3 14,9K     0
          ad2          ONLINE       0     0     0

Reboot doesn't work; somebody had to reset the machine.

How should such an error event be handled? Isn't a mirror useless if
there's no way to continue with the one remaining good drive? If the OS
were on the same pool, would the complete machine be inaccessible with a
failing drive?

Thanks,

-Harry