Date: Wed, 05 May 2010 16:56:41 +0200
From: Harald Schmalzbauer <h.schmalzbauer@omnilan.de>
To: FreeBSD Stable <freebsd-stable@freebsd.org>
Subject: Re: ZFS (zpool) doesn't detect failed drive
Message-ID: <4BE18729.3050209@omnilan.de>
In-Reply-To: <4BE16784.8050400@omnilan.de>
Harald Schmalzbauer wrote on 05.05.2010 14:41 (localtime):
> Hello,
>
> one drive of my mirror failed today, but 'zpool status' shows it as "online".
> Every process using a ZFS mount hangs. Also 'zpool offline /dev/ad1'
> hangs indefinitely.
...
Sorry, I made a mistake with zpool create. Somehow the little word
"mirror" got lost, so the pool wasn't a mirror but a stripe.
In a stripe, of course, I can't take one vdev offline. Sorry for the noise.
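For illustration, the difference comes down to a single keyword on the
zpool create command line. The sketch below assumes the pool name and the
devices ad1 and ad2 from this report; the original command line is not
shown in this message, so the exact invocation is an assumption. The
helper name show_layouts is hypothetical. The -n flag asks zpool to print
the layout it would use without actually creating the pool.

```shell
#!/bin/sh
# Sketch only: pool and device names are taken from this report; the
# original 'zpool create' command line is not shown, so this is an assumption.
POOL=URUBAmirrorP1

# Hypothetical helper; -n is a dry run that prints the resulting layout
# without creating the pool.
show_layouts() {
    # Without a vdev type, ad1 and ad2 become two top-level vdevs (a stripe).
    zpool create -n "$POOL" ad1 ad2
    # With the "mirror" keyword, both disks form one redundant mirror vdev.
    zpool create -n "$POOL" mirror ad1 ad2
}
```

The function is only defined here, not called; calling show_layouts on a
machine with ZFS would print both layouts side by side.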
But I took the opportunity to do some tests with that failing drive and
created a _real_ mirror. That works without failures, but using the
mirror again leads to:
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ata3: port is not ready (timeout 10000ms) tfd = 00000080
ata3: hardware reset timeout
ad1: FAILURE - device detached
Now zpool still reports the vdev ad1 as online, although it has been
detached and 'atacontrol list' doesn't show it anymore:
zpool status
  pool: URUBAmirrorP1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are
        unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME               STATE     READ WRITE CKSUM
        URUBAmirrorP1      ONLINE       0     0     0
          mirror           ONLINE       0     0     0
            ad1            ONLINE       3  302K     0
            ad2            ONLINE       0     0     0

errors: No known data errors

atacontrol list
ATA channel 2:
    Master:  ad0 <TRANSCEND/20090520> SATA revision 1.x
    Slave:       no device present
ATA channel 3:
    Master:      no device present
    Slave:       no device present
ATA channel 4:
    Master:  ad2 <SAMSUNG HD154UI/1AG01118> SATA revision 2.x
    Slave:       no device present
ATA channel 5:
    Master:  ad3 <ST3750640NS/3.AEG> SATA revision 1.x
    Slave:       no device present

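The mismatch between the two listings above can be spotted mechanically.
The following is a minimal sketch, not a fix: it assumes both outputs have
been saved to files, and that pool members are plain adN devices as in
this report. The function name stale_vdevs and the file paths are
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print pool members that 'zpool status' reports as
# ONLINE but that are missing from the 'atacontrol list' output.
#   $1 = file holding 'zpool status' output
#   $2 = file holding 'atacontrol list' output
stale_vdevs() {
    # Collect device names (adN) whose STATE column is ONLINE.
    awk '$1 ~ /^ad[0-9]+$/ && $2 == "ONLINE" { print $1 }' "$1" |
    while read -r dev; do
        # atacontrol prints e.g. "Master: ad2 <model> SATA revision 2.x";
        # awk exits 0 only if the device appears as a Master or Slave entry.
        if ! awk -v d="$dev" '
            ($1 == "Master:" || $1 == "Slave:") && $2 == d { found = 1 }
            END { exit !found }' "$2"
        then
            echo "$dev"
        fi
    done
}

# Example use (assumed paths):
#   zpool status URUBAmirrorP1 > /tmp/zpool.out
#   atacontrol list > /tmp/ata.out
#   stale_vdevs /tmp/zpool.out /tmp/ata.out
```

Fed the two listings from this report, the helper would print ad1, the
device that zpool still considers online although the ATA layer has
detached it.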
How should such a failure be handled?
Do I have to manually mark the drive offline for zpool?
Thanks,
-Harry
