Date: Fri, 6 Sep 2024 14:54:32 -0400
From: Chris Ross <cross+freebsd@distal.com>
To: Wes Morgan <morganw@gmail.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: Unable to replace drive in raidz1
Message-ID: <E6C615C1-E9D2-4F0D-8DC2-710BAAF10954@distal.com>
In-Reply-To: <42346193-AD06-4D26-B0C6-4392953D21A3@gmail.com>
References: <5ED5CB56-2E2A-4D83-8CDA-6D6A0719ED19@distal.com>
 <AC67D073-D476-41F5-AC53-F671430BB493@distal.com>
 <CAOtMX2h52d0vtceuwcDk2dzkH-fZW32inhk-dfjLMJxetVXKYg@mail.gmail.com>
 <CB79EC2B-E793-4561-95E7-D1CEEEFC1D72@distal.com>
 <CAOtMX2i_zFYuOnEK_aVkpO_M8uJCvGYW%2BSzLn3OED4n5fKFoEA@mail.gmail.com>
 <6A20ABDA-9BEA-4526-94C1-5768AA564C13@distal.com>
 <CAOtMX2jfcd43sBpHraWA=5e_Ka=hMD654m-5=boguPPbYXE4yw@mail.gmail.com>
 <0CF1E2D7-6C82-4A8B-82C3-A5BF1ED939CF@distal.com>
 <CAOtMX2hRJvt9uhctKvXO4R2tUNq9zeCEx6NZmc7Vk7fH=HO8eA@mail.gmail.com>
 <29003A7C-745D-4A06-8558-AE64310813EA@distal.com>
 <42346193-AD06-4D26-B0C6-4392953D21A3@gmail.com>
> On Sep 6, 2024, at 14:39, Wes Morgan <morganw@gmail.com> wrote:
>
> You should make the changes to your /boot/loader.conf as suggested earlier by Freddie Cash and reboot. This will eliminate all the confusion with diskid. Then run "zpool clear", which, if da3 is still online and not completely dead, the pool should come out of the faulted state. Check zpool status to look for this alleged replacement in progress. If it is truly trying to replace a device, it should show up in zpool status with the actual device, or the guid if it can't find the device.

I saw and appreciated that response, but didn’t respond on that thread because I don’t _want_ to turn all of those things off. At least, I don’t want to refer to everything by the auto-numbered da# names that I think that change will cause. And, Freddie, I think your comment about GPT partition labels doesn’t apply, because I don’t have GPT on my disks. Each is just one big ZFS device. This is why I’m looking at glabel’s generic labeling now.

The former da3 is off-line, out of the chassis. I replaced a disk in a full chassis, so having both online at the same time is not possible. That drive is only faulted in ZFS’s mind because I tried “zpool offline -f” on it to see if that helped.

> If you have initiated a replace, and the replacing disk has now been "lost" or unlabeled, you are in a bind. I ran into this problem many years ago, and I thought it was fixed, but the bug was called something like "can't replace a replacing vdev". I ultimately solved my problem by manually editing a fake vdev to have the same guid as the missing device, restarting the replace and then canceling it before zfs realized it was fake. But, I am almost certain that zpool cancel can do this now, with the guid.

I didn’t initiate a replace until after the disks were physically changed. Although, in this conversation, I realize that things likely got confused by the replacement, in the kernel’s mind, of da3 with what used to be da4. :-/

> If da10 has a label that says it is in the pool, it is probably the "replacing" vdev and should be picked up…

Da10, now also /dev/label/drive03, seems to think it’s in the pool somewhere, according to zdb -l.

But I’m not sure if this helps. And, following your other message saying I shouldn’t put labels on disks that are to be used in their entirety as ZFS devices, I’ve deleted that label and run “zpool labelclear” on this device now (since the ZFS label still had the /dev/label/ path in it).

- Chris