From: Steven Schlansker
Date: Sat, 9 Jan 2010 15:35:03 -0800
To: freebsd-fs@freebsd.org
Subject: Re: ZFS: Can't repair raidz2 (Cannot replace a replacing device)

On Dec 27, 2009, at 8:59 PM, Wes Morgan wrote:

> On Sun, 27 Dec 2009, Steven Schlansker wrote:
>
>>
>> On Dec 24, 2009, at 5:17 AM, Wes Morgan wrote:
>>
>>> On Wed, 23 Dec 2009, Steven Schlansker wrote:
>>>>
>>>> Why has the replacing vdev not gone away? I still can't detach -
>>>> [steven@universe:~]% sudo zpool detach universe 6170688083648327969
>>>> cannot detach 6170688083648327969: no valid replicas
>>>> even though now there actually is a valid replica (ad26)
>>>
>>> Try detaching ad26. If it lets you do that it will abort the
>>> replacement and then you just do another replacement with the real
>>> device. If it won't let you do that, you may be stuck having to do
>>> some metadata tricks.
>>>
>>
>> errors: No known data errors
>> [steven@universe:~]% sudo zpool detach universe ad26
>> cannot detach ad26: no valid replicas
>> [steven@universe:~]% sudo zpool offline -t universe ad26
>> cannot offline ad26: no valid replicas
>>
>
> I just tried to re-create this scenario with some sparse files and I
> was able to detach it completely (below). There is one difference,
> however. Your array is returning checksum errors for the ad26 device.
> Perhaps this is making the system think that there is no sibling device
> in the replacement node that has all the data, so it denies the detach.
> Even though logically the data will be recovered by a scrub later..
> Interesting. If you can determine where the detach is failing, that will
> help paint the complete picture.
>

Interestingly enough, I found a solution! Somewhat roundabout, but what
I did was replace a different device and let it resilver completely.
Then the array looked like this:

        NAME                       STATE     READ WRITE CKSUM
        universe                   DEGRADED     0     0     0
          raidz2                   DEGRADED     0     0     0
            ad16                   ONLINE       0     0     0
            replacing              DEGRADED     0     0     0
              ad26                 ONLINE       0     0     0
              6170688083648327969  UNAVAIL      0 1.13M     0  was /dev/ad12
            ad8                    ONLINE       0     0     0
            da0                    ONLINE       0     0     0
            ad10                   ONLINE       0     0     0
            concat/ad4ex           ONLINE       0     0     0
            ad24                   ONLINE       0     0     0
            concat/ad6ex           ONLINE       0     0     0

Just for kicks, I then tried to detach -

[steven@universe:~]% sudo zpool detach universe 6170688083648327969
[steven@universe:~]% sudo zpool status
  pool: universe
 state: ONLINE
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        universe          ONLINE       0     0     0
          raidz2          ONLINE       0     0     0
            ad16          ONLINE       0     0     0
            ad26          ONLINE       0     0     0
            ad8           ONLINE       0     0     0
            da0           ONLINE       0     0     0
            ad10          ONLINE       0     0     0
            concat/ad4ex  ONLINE       0     0     0
            ad24          ONLINE       0     0     0
            concat/ad6ex  ONLINE       0     0     0

Ta-da! I have no idea why this helped, or how it fixed it, but if
anyone has this problem in the future, try replacing a different device,
letting it resilver, and then detaching the original problematic devices.
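
For anyone scripting a check for this state, here is a rough sketch in Python that scans `zpool status` text for UNAVAIL members sitting under a "replacing" vdev and prints the matching `zpool detach` command. The function names and the SAMPLE text are made up for illustration, and the parser assumes the indented layout `zpool status` normally prints; it does not run any command itself, and whether the detach succeeds still depends on the pool.

```python
# Hypothetical helper names; SAMPLE mirrors the status output quoted in this thread.
SAMPLE = """\
        NAME                       STATE     READ WRITE CKSUM
        universe                   DEGRADED     0     0     0
          raidz2                   DEGRADED     0     0     0
            ad16                   ONLINE       0     0     0
            replacing              DEGRADED     0     0     0
              ad26                 ONLINE       0     0     0
              6170688083648327969  UNAVAIL      0 1.13M     0  was /dev/ad12
            ad8                    ONLINE       0     0     0
"""


def find_unavailable_replacing_members(status_text):
    """Return names of UNAVAIL devices nested under a 'replacing' vdev.

    Nesting is inferred from indentation, so the text must keep the
    leading whitespace that `zpool status` prints.
    """
    stuck = []
    replacing_indent = None  # indent of the current 'replacing' row, if any
    for line in status_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip())
        fields = stripped.split()
        # A row indented no deeper than 'replacing' ends that vdev's children.
        if replacing_indent is not None and indent <= replacing_indent:
            replacing_indent = None
        if fields[0] == "replacing":
            replacing_indent = indent
        elif (replacing_indent is not None
              and len(fields) > 1
              and fields[1] == "UNAVAIL"):
            stuck.append(fields[0])
    return stuck


def detach_command(pool, member):
    """Build (but do not run) the detach command for one stuck member."""
    return ["zpool", "detach", pool, member]


if __name__ == "__main__":
    for member in find_unavailable_replacing_members(SAMPLE):
        print(" ".join(detach_command("universe", member)))
```

This only reports candidates. If the detach is refused with "no valid replicas" as it was above, the workaround from this thread (replace a different device, wait for the resilver to finish, then retry the detach) is what worked here.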