From owner-freebsd-fs@FreeBSD.ORG Mon Sep 21 17:44:29 2009
From: Kurt Touet <ktouet@gmail.com>
To: Aaron Hurt
Date: Mon, 21 Sep 2009 11:44:26 -0600
Message-ID: <2a5e326f0909211044k349d6bc1lb9bd9094e7216e41@mail.gmail.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS - Unable to offline drive in raidz1 based pool

Apparently you were right, Aaron:

monolith# zpool scrub storage
monolith# zpool status storage
  pool: storage
 state: ONLINE
 scrub: resilver completed after 0h1m with 0 errors on Mon Sep 21 11:37:24 2009
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad14    ONLINE       0     0     0  1.46M resilvered
            ad6     ONLINE       0     0     0  2K resilvered
            ad12    ONLINE       0     0     0  3K resilvered
            ad4     ONLINE       0     0     0  3K resilvered

errors: No known data errors
monolith# zpool offline storage ad6
monolith# zpool online storage ad6
monolith# zpool status storage
  pool: storage
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Mon Sep 21 11:40:12 2009
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad14    ONLINE       0     0     0  67.5K resilvered
            ad6     ONLINE       0     0     0  671K resilvered
            ad12    ONLINE       0     0     0  67.5K resilvered
            ad4     ONLINE       0     0     0  53K resilvered

errors: No known data errors

I wonder then, with the storage array reporting itself as healthy, how it
knew that one drive had desynced data, and why that wouldn't have shown up
as an error state like DEGRADED?

Cheers,
-kurt

On Mon, Sep 21, 2009 at 11:21 AM, Kurt Touet wrote:
> I thought about that possibility as well... but I had scrubbed the
> array within the last 10 days. I'll give it a shot again today and see
> if that brings up any other errors (or allows me to offline the drive
> afterwards).
>
> Cheers,
> -kurt
>
> On Mon, Sep 21, 2009 at 4:39 AM, Aaron Hurt wrote:
>> Kurt Touet wrote:
>>>
>>> I am using a ZFS pool based on a 4-drive raidz1 setup for storage. I
>>> believe that one of the drives is failing, and I'd like to
>>> remove/replace it.
>>> The drive has been causing some issues (such as
>>> becoming non-responsive and hanging the system with timeouts), so I'd
>>> like to offline it and then run in degraded mode until I can grab a
>>> new drive (tomorrow). However, when I disconnected the drive (pulled
>>> the plug, not using a zpool offline command), the following occurred:
>>>
>>>        NAME        STATE     READ WRITE CKSUM
>>>        storage     FAULTED      0     0     1
>>>          raidz1    DEGRADED     0     0     0
>>>            ad14    ONLINE       0     0     0
>>>            ad6     UNAVAIL      0     0     0
>>>            ad12    ONLINE       0     0     0
>>>            ad4     ONLINE       0     0     0
>>>
>>> Note: that's my recreation of the output, not the actual text.
>>>
>>> At this point, I was unable to do anything with the pool, and all
>>> data was inaccessible. Fortunately, after sitting pulled for a
>>> bit, I tried putting the failing drive back into the array, and it
>>> booted properly.
>>> Of course, I still want to replace it, but this is
>>> what happens when I try to take it offline:
>>>
>>> monolith# zpool status storage
>>>   pool: storage
>>>  state: ONLINE
>>>  scrub: none requested
>>> config:
>>>
>>>        NAME        STATE     READ WRITE CKSUM
>>>        storage     ONLINE       0     0     0
>>>          raidz1    ONLINE       0     0     0
>>>            ad14    ONLINE       0     0     0
>>>            ad6     ONLINE       0     0     0
>>>            ad12    ONLINE       0     0     0
>>>            ad4     ONLINE       0     0     0
>>>
>>> errors: No known data errors
>>> monolith# zpool offline storage ad6
>>> cannot offline ad6: no valid replicas
>>> monolith# uname -a
>>> FreeBSD monolith 8.0-RC1 FreeBSD 8.0-RC1 #2 r197370: Sun Sep 20
>>> 15:32:08 CST 2009     k@monolith:/usr/obj/usr/src/sys/MONOLITH  amd64
>>>
>>> If the array is online and healthy, why can't I simply offline a drive
>>> and then replace it afterwards? Any thoughts? Also, how does a
>>> degraded raidz1 array end up faulting the entire pool?
>>>
>>> Thanks,
>>> -kurt
>>> _______________________________________________
>>> freebsd-fs@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>>>
>>
>> I'm not sure why it would be giving you that message. In a raidz1 you
>> should be able to sustain one failure. The only thing that comes to mind
>> this early in the morning would be that somehow your data replication
>> across your discs isn't totally in sync. I would suggest you try a scrub
>> and then see if you can remove the drive afterwards.
>>
>> Aaron Hurt
>> Managing Partner
>> Flex I.T., LLC
>> 611 Commerce Street
>> Suite 3117
>> Nashville, TN  37203
>> Phone: 615.438.7101
>> E-mail: aaron@goflexitllc.com
>>
>>
>
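P.P.S. An aside on Aaron's suggestion: if the scrub does bring the replicas
back in sync, the drive swap would presumably go something like the sequence
below (a sketch, not a tested procedure; device names are the ones from this
thread, and the physical swap at the same device node is assumed):

```
monolith# zpool scrub storage
monolith# zpool status -x            # expect "all pools are healthy"
monolith# zpool offline storage ad6  # should succeed once replicas are valid
  (power down and swap the failing drive at ad6)
monolith# zpool replace storage ad6  # resilver onto the new disk
monolith# zpool status storage       # watch the resilver complete
```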
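P.S. For anyone scripting around this: since the pool-level state stayed
ONLINE the whole time, the only visible hint of the desynced drive was the
per-device "resilvered" counters. A minimal sh/awk sketch (not from the
thread) of pulling those out; the sample input is copied from the status
listing above, and the field positions assume the default zpool status
layout. In real use you would pipe in `zpool status storage` instead:

```shell
#!/bin/sh
# Flag pool members that are not ONLINE, or that resilvered data.
# Sample input taken from the device lines of the status listing above.
zpool_output='            ad14    ONLINE       0     0     0  1.46M resilvered
            ad6     ONLINE       0     0     0  2K resilvered
            ad12    ONLINE       0     0     0  3K resilvered
            ad4     ONLINE       0     0     0  3K resilvered'

printf '%s\n' "$zpool_output" | awk '
    # $1 = device, $2 = state, $6 = amount repaired (when present)
    $2 != "ONLINE"  { print $1, "is", $2 }
    /resilvered$/   { print $1, "resilvered", $6 }
'
```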