Date:      Mon, 21 Sep 2009 11:44:26 -0600
From:      Kurt Touet <ktouet@gmail.com>
To:        Aaron Hurt <aaron@goflexitllc.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS - Unable to offline drive in raidz1 based pool
Message-ID:  <2a5e326f0909211044k349d6bc1lb9bd9094e7216e41@mail.gmail.com>
In-Reply-To: <2a5e326f0909211021o431ef53bh3077589efb0bed6c@mail.gmail.com>
References:  <2a5e326f0909201500w1513aeb5ra644f1c748e22f34@mail.gmail.com> <4AB757E4.5060501@goflexitllc.com> <2a5e326f0909211021o431ef53bh3077589efb0bed6c@mail.gmail.com>

Apparently you were right Aaron:

monolith# zpool scrub storage
monolith# zpool status storage
  pool: storage
 state: ONLINE
 scrub: resilver completed after 0h1m with 0 errors on Mon Sep 21 11:37:24 2009
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad14    ONLINE       0     0     0  1.46M resilvered
            ad6     ONLINE       0     0     0  2K resilvered
            ad12    ONLINE       0     0     0  3K resilvered
            ad4     ONLINE       0     0     0  3K resilvered

errors: No known data errors
monolith# zpool offline storage ad6
monolith# zpool online storage ad6
monolith# zpool status storage
  pool: storage
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Mon Sep 21 11:40:12 2009
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad14    ONLINE       0     0     0  67.5K resilvered
            ad6     ONLINE       0     0     0  671K resilvered
            ad12    ONLINE       0     0     0  67.5K resilvered
            ad4     ONLINE       0     0     0  53K resilvered

errors: No known data errors
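
For anyone hitting the same "no valid replicas" error, the sequence that
worked above (scrub, wait for it to finish, then offline) can be sketched
as a small sh script. This is only a sketch under assumptions: the function
names, the ZPOOL override and the POLL_SECONDS variable are made up for
illustration, and the "in progress" match assumes the status wording of
this ZFS version.

```shell
#!/bin/sh
# Sketch only: scrub a pool, wait for the scrub/resilver to finish,
# then take one device offline. Pool and device names are arguments.
# ZPOOL is a hypothetical override so the command can be stubbed out.
ZPOOL="${ZPOOL:-zpool}"

# Exit 0 while the status text on stdin reports a scrub or resilver
# in progress (assumed wording; adjust for your ZFS version).
scrub_in_progress() {
    grep -q 'in progress'
}

scrub_then_offline() {
    pool="$1"
    dev="$2"
    "$ZPOOL" scrub "$pool" || return 1
    # Poll until the status output no longer reports work in progress.
    while "$ZPOOL" status "$pool" | scrub_in_progress; do
        sleep "${POLL_SECONDS:-30}"
    done
    "$ZPOOL" offline "$pool" "$dev"
}
```

With the names from this thread it would be called as
"scrub_then_offline storage ad6", followed by a "zpool status storage" to
confirm the result before swapping the drive.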


I wonder then, with the storage array reporting itself as healthy, how
did it know that one drive had desynced data, and why wouldn't that
have shown up as an error like DEGRADED?

Cheers,
-kurt


On Mon, Sep 21, 2009 at 11:21 AM, Kurt Touet <ktouet@gmail.com> wrote:
> I thought about that possibility as well.. but I had scrubbed the
> array within 10 days. I'll give it a shot again today and see if that
> brings up any other errors (or allows me to offline the drive
> afterwards).
>
> Cheers,
> -kurt
>
> On Mon, Sep 21, 2009 at 4:39 AM, Aaron Hurt <aaron@goflexitllc.com> wrote:
>> Kurt Touet wrote:
>>>
>>> I am using a ZFS pool based on a 4-drive raidz1 setup for storage.  I
>>> believe that one of the drives is failing, and I'd like to
>>> remove/replace it.  The drive has been causing some issues (such as
>>> becoming non-responsive and hanging the system with timeouts), so I'd
>>> like to offline it, and then run in degraded mode until I can grab a
>>> new drive (tomorrow).  However, when I disconnected the drive (pulled
>>> the plug, not using a zpool offline command), the following occurred:
>>>
>>>         NAME        STATE     READ WRITE CKSUM
>>>         storage     FAULTED      0     0     1
>>>           raidz1    DEGRADED     0     0     0
>>>             ad14    ONLINE       0     0     0
>>>             ad6     UNAVAIL      0     0     0
>>>             ad12    ONLINE       0     0     0
>>>             ad4     ONLINE       0     0     0
>>>
>>> Note: That's my recreation of the output... not the actual text.
>>>
>>> At this point, I was unable to do anything with the pool... and all
>>> data was inaccessible.  Fortunately, after it had sat pulled for a
>>> bit, I tried putting the failing drive back into the array, and it
>>> booted properly.  Of course, I still want to replace it, but this is
>>> what happens when I try to take it offline:
>>>
>>> monolith# zpool status storage
>>>   pool: storage
>>>  state: ONLINE
>>>  scrub: none requested
>>> config:
>>>
>>>         NAME        STATE     READ WRITE CKSUM
>>>         storage     ONLINE       0     0     0
>>>           raidz1    ONLINE       0     0     0
>>>             ad14    ONLINE       0     0     0
>>>             ad6     ONLINE       0     0     0
>>>             ad12    ONLINE       0     0     0
>>>             ad4     ONLINE       0     0     0
>>>
>>> errors: No known data errors
>>> monolith# zpool offline storage ad6
>>> cannot offline ad6: no valid replicas
>>> monolith# uname -a
>>> FreeBSD monolith 8.0-RC1 FreeBSD 8.0-RC1 #2 r197370: Sun Sep 20
>>> 15:32:08 CST 2009     k@monolith:/usr/obj/usr/src/sys/MONOLITH  amd64
>>>
>>> If the array is online and healthy, why can't I simply offline a drive
>>> and then replace it afterwards?  Any thoughts?  Also, how does a
>>> degraded raidz1 array end up faulting the entire pool?
>>>
>>> Thanks,
>>> -kurt
>>
>> I'm not sure why it would be giving you that message.  In a raidz1 you
>> should be able to sustain one failure.  The only thing that comes to mind
>> this early in the morning would be that somehow your data replication across
>> your discs isn't totally in sync.  I would suggest you try a scrub and then
>> see if you can remove the drive afterwards.
>>
>> Aaron Hurt
>> Managing Partner
>> Flex I.T., LLC
>> 611 Commerce Street
>> Suite 3117
>> Nashville, TN  37203
>> Phone: 615.438.7101
>> E-mail: aaron@goflexitllc.com
>>
>>
>


