Date: Sun, 27 Dec 2009 22:59:34 -0600 (CST)
From: Wes Morgan <morganw@chemikals.org>
To: Steven Schlansker <stevenschlansker@gmail.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS: Can't repair raidz2 (Cannot replace a replacing device)
Message-ID: <alpine.BSF.2.00.0912272247410.64051@ibyngvyr>
In-Reply-To: <5565955F-482A-4628-A528-117C58046B1F@gmail.com>
References: <048AF210-8B9A-40EF-B970-E8794EC66B2F@gmail.com> <4B315320.5050504@quip.cz> <5da0588e0912221741r48395defnd11e34728d2b7b97@mail.gmail.com> <9CEE3EE5-2CF7-440E-B5F4-D2BD796EA55C@gmail.com> <alpine.BSF.2.00.0912240708020.1450@ibyngvyr> <5565955F-482A-4628-A528-117C58046B1F@gmail.com>
On Sun, 27 Dec 2009, Steven Schlansker wrote:

>
> On Dec 24, 2009, at 5:17 AM, Wes Morgan wrote:
>
>> On Wed, 23 Dec 2009, Steven Schlansker wrote:
>>>
>>> Why has the replacing vdev not gone away?  I still can't detach -
>>> [steven@universe:~]% sudo zpool detach universe 6170688083648327969
>>> cannot detach 6170688083648327969: no valid replicas
>>> even though now there actually is a valid replica (ad26)
>>
>> Try detaching ad26.  If it lets you do that, it will abort the
>> replacement and then you can just do another replacement with the real
>> device.  If it won't let you do that, you may be stuck having to do
>> some metadata tricks.
>>
>
> Sadly, no go:
>
>   pool: universe
>  state: DEGRADED
>  scrub: none requested
> config:
>
>         NAME                       STATE     READ WRITE CKSUM
>         universe                   DEGRADED     0     0     0
>           raidz2                   DEGRADED     0     0     0
>             ad16                   ONLINE       0     0     0
>             replacing              DEGRADED     0     0 5.04K
>               ad26                 ONLINE       0     0     0
>               6170688083648327969  UNAVAIL      0 1.08M     0  was /dev/ad12
>             ad8                    ONLINE       0     0     0
>             concat/back2           ONLINE       0     0     0
>             ad10                   ONLINE       0     0     0
>             concat/ad4ex           ONLINE       0     0     0
>             ad24                   ONLINE       0     0     0
>             concat/ad6ex           ONLINE       0     0     0
>
> errors: No known data errors
>
> [steven@universe:~]% sudo zpool detach universe ad26
> cannot detach ad26: no valid replicas
> [steven@universe:~]% sudo zpool offline -t universe ad26
> cannot offline ad26: no valid replicas
>

I just tried to re-create this scenario with some sparse files, and I was
able to detach it completely (below). There is one difference, however:
your array is returning checksum errors for the ad26 device. Perhaps this
is making the system think that there is no sibling device in the
replacement node that has all the data, so it denies the detach, even
though logically the data would be recovered by a later scrub.
Interesting. If you can determine where the detach is failing, that will
help paint the complete picture.

[root@catalyst:~#]: zpool status testz2
  pool: testz2
 state: DEGRADED
 scrub: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        testz2                     DEGRADED     0     0     0
          raidz2                   DEGRADED     0     0     0
            md1                    ONLINE       0     0     0
            md2                    ONLINE       0     0     0
            replacing              DEGRADED     0     0     0
              md3                  ONLINE       0     0     0
              8502561034916233095  UNAVAIL      0   323     0  was /dev/md7
            md4                    ONLINE       0     0     0
            md5                    ONLINE       0     0     0
            md6                    ONLINE       0     0     0

errors: No known data errors

[root@catalyst:~#]: zpool detach testz2 8502561034916233095
[root@catalyst:~#]: zpool status testz2
  pool: testz2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testz2      ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            md1     ONLINE       0     0     0
            md2     ONLINE       0     0     0
            md3     ONLINE       0     0     0
            md4     ONLINE       0     0     0
            md5     ONLINE       0     0     0
            md6     ONLINE       0     0     0

errors: No known data errors
[root@catalyst:~#]:
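
For reference, a test pool like the one above can be put together from
sparse files along these lines. This is a minimal sketch, not an exact
transcript of the session; the file paths, sizes, and md unit numbers are
illustrative:

    # create seven sparse 256 MB backing files and attach them as md(4) devices
    for i in 1 2 3 4 5 6 7; do
        truncate -s 256m /tmp/vdev$i
        mdconfig -a -t vnode -u $i -f /tmp/vdev$i
    done

    # build the raidz2 pool with md7 in the slot that will "fail"
    zpool create testz2 raidz2 md1 md2 md7 md4 md5 md6

    # make md7 disappear: export the pool, destroy the device, re-import
    zpool export testz2
    mdconfig -d -u 7
    zpool import testz2

    # start a replacement onto md3, producing the "replacing" vdev shown above
    zpool replace testz2 md7 md3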
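
As for pinning down where the detach is denied: one possibility, assuming
a DTrace-enabled kernel, is to watch the return value of the kernel detach
routine while running the failing command. The function name below is
taken from the OpenSolaris ZFS sources and is an assumption on my part; it
may differ in your tree:

    # print the error code spa_vdev_detach() hands back to zpool(8);
    # the module name may be "kernel" rather than "zfs" if ZFS is compiled in
    dtrace -n 'fbt:zfs:spa_vdev_detach:return { printf("detach returned %d", arg1); }'

A nonzero value there (EBUSY, ENXIO, ...) would at least confirm that the
kernel, not the userland tool, is refusing the operation.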
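
And if the detach ever does go through, the recovery suggested earlier in
the thread would look roughly like this (device names taken from the
status output above; treat it as a sketch, not a tested recipe):

    # abort the stuck replacement by detaching the new disk
    # from the "replacing" vdev...
    zpool detach universe ad26

    # ...then restart the replacement, naming the dead member by its guid
    zpool replace universe 6170688083648327969 ad26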