Date:      Fri, 06 Sep 2024 14:16:39 -0500
From:      Wes Morgan <morganw@gmail.com>
To:        Chris Ross <cross+freebsd@distal.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Unable to replace drive in raidz1
Message-ID:  <E85B00B1-7205-486D-800C-E6837780E819@gmail.com>
In-Reply-To: <E6C615C1-E9D2-4F0D-8DC2-710BAAF10954@distal.com>
References:  <5ED5CB56-2E2A-4D83-8CDA-6D6A0719ED19@distal.com> <AC67D073-D476-41F5-AC53-F671430BB493@distal.com> <CAOtMX2h52d0vtceuwcDk2dzkH-fZW32inhk-dfjLMJxetVXKYg@mail.gmail.com> <CB79EC2B-E793-4561-95E7-D1CEEEFC1D72@distal.com> <CAOtMX2i_zFYuOnEK_aVkpO_M8uJCvGYW%2BSzLn3OED4n5fKFoEA@mail.gmail.com> <6A20ABDA-9BEA-4526-94C1-5768AA564C13@distal.com> <CAOtMX2jfcd43sBpHraWA=5e_Ka=hMD654m-5=boguPPbYXE4yw@mail.gmail.com> <0CF1E2D7-6C82-4A8B-82C3-A5BF1ED939CF@distal.com> <CAOtMX2hRJvt9uhctKvXO4R2tUNq9zeCEx6NZmc7Vk7fH=HO8eA@mail.gmail.com> <29003A7C-745D-4A06-8558-AE64310813EA@distal.com> <42346193-AD06-4D26-B0C6-4392953D21A3@gmail.com> <E6C615C1-E9D2-4F0D-8DC2-710BAAF10954@distal.com>



On September 6, 2024 1:54:32 PM CDT, Chris Ross <cross+freebsd@distal.com> wrote:
>
>
>> On Sep 6, 2024, at 14:39, Wes Morgan <morganw@gmail.com> wrote:
>>
>>
>> You should make the changes to your /boot/loader.conf as suggested
>> earlier by Freddie Cash and reboot. This will eliminate all the
>> confusion with diskid. Then run "zpool clear"; if da3 is still online
>> and not completely dead, the pool should come out of the faulted
>> state. Check zpool status to look for this alleged replacement in
>> progress. If it is truly trying to replace a device, it should show up
>> in zpool status with the actual device, or the guid if it can't find
>> the device.
>
>I saw and appreciated that response, but didn’t respond on that thread
>because I don’t _want_ to turn all of those things off.  At least, I
>don’t want to refer to everything by the auto-numbered da# that I think
>that will cause.  And, Freddie, your comment about GPT partition labels
>I think doesn’t apply because I don’t have GPT on my disks.  Just all
>one big ZFS device.  This is why I’m looking at glabel’s generic
>labeling now.

You probably don't want that. You will have to use the glabel dev, which
will not be the same size as your other devices. IIRC you have no
control over what device node the system finds first for the pool. Even
if you use GPT labels, the daXpY device will still exist.
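
To put a number on the size difference: glabel(8) keeps its metadata in
the provider's last sector, so a manually labeled device comes out one
sector smaller than the raw disk. A minimal sketch of that arithmetic
follows; the function name and the byte counts are made up for
illustration and are not taken from your system.

```shell
# Hypothetical helper: size in bytes of the label/ provider that glabel
# exposes for a manually labeled disk, given the raw media size and the
# sector size. glabel reserves the provider's last sector for metadata.
glabel_usable_bytes() {
    echo $(( $1 - $2 ))
}

# Illustrative values only: a 4000787030016-byte disk, 512-byte sectors.
glabel_usable_bytes 4000787030016 512
```

On a real system both inputs can be read from "diskinfo -v" on the raw
device.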

>The former da3 is off-line, out of the chassis.  I replaced a disk in a
>full chassis, having them both online at the same time is not possible.
>That drive in ZFS’s mind is only faulted because I tried
>“zpool offline -f” on it to see if that helped.

It sounds like you have replaced the wrong device. Check the "zpool
history" to see what you did.
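
One way to cut the history down to the relevant entries is to filter it
for the subcommands that change device state. This is only a sketch: the
function name is my own, and "tank" in the usage comment is a
placeholder for your pool name.

```shell
# Hypothetical filter: keep only history lines for zpool subcommands that
# change which devices are in the pool or what state they are in.
history_device_ops() {
    grep -E 'zpool (replace|offline|online|attach|detach|clear) '
}

# Usage on a live system ("tank" is a placeholder pool name):
#   zpool history tank | history_device_ops
```

Reading the surviving lines in order should show which device each
replace or offline was actually issued against.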

In your earlier message, three devices were shown in each raidz, when
what you should be seeing is that one raidz has an offline device
identified by guid and maybe "was /dev/da3" that is being replaced,
along with the replacement device. I don't see any of that.

>> If you have initiated a replace, and the replacing disk has now been
>> "lost" or unlabeled, you are in a bind. I ran into this problem many
>> years ago, and I thought it was fixed, but the bug was called something
>> like "can't replace a replacing vdev". I ultimately solved my problem
>> by manually editing a fake vdev to have the same guid as the missing
>> device, restarting the replace and then canceling it before zfs
>> realized it was fake. But, I am almost certain that zpool cancel can do
>> this now, with the guid.
>
>I didn’t initiate a replace until after the disks were physically
>changed.  Although in this conversation I realize that things likely
>got confused by the replacement in the kernel’s mind of da3 with what
>used to be da4.  :-/

This is why your zpool history will be helpful. What did you actually
try to replace, and what did you mean to replace?


>> If da10 has a label that says it is in the pool, it is probably the
>> "replacing" vdev and should be picked up…
>
>Da10, now also /dev/label/drive03, seems to think it’s in the pool
>somewhere, according to zdb -l.
>But I’m not sure if this helps.  And, following your other message
>saying I shouldn’t put labels on disks that are to be used in their
>entirety as ZFS devices, I’ve deleted that label and zlabelclear’d this
>device now.  (since the zfs label still had the /dev/label/ path in it)




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E85B00B1-7205-486D-800C-E6837780E819>