Date: Fri, 6 Sep 2024 15:34:36 -0400
From: Chris Ross <cross+freebsd@distal.com>
To: Wes Morgan <morganw@gmail.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: Unable to replace drive in raidz1
Message-ID: <E93A9CA8-6705-4C26-9F33-B620A365F4BD@distal.com>
In-Reply-To: <E85B00B1-7205-486D-800C-E6837780E819@gmail.com>
References: <5ED5CB56-2E2A-4D83-8CDA-6D6A0719ED19@distal.com> <AC67D073-D476-41F5-AC53-F671430BB493@distal.com> <CAOtMX2h52d0vtceuwcDk2dzkH-fZW32inhk-dfjLMJxetVXKYg@mail.gmail.com> <CB79EC2B-E793-4561-95E7-D1CEEEFC1D72@distal.com> <CAOtMX2i_zFYuOnEK_aVkpO_M8uJCvGYW%2BSzLn3OED4n5fKFoEA@mail.gmail.com> <6A20ABDA-9BEA-4526-94C1-5768AA564C13@distal.com> <CAOtMX2jfcd43sBpHraWA=5e_Ka=hMD654m-5=boguPPbYXE4yw@mail.gmail.com> <0CF1E2D7-6C82-4A8B-82C3-A5BF1ED939CF@distal.com> <CAOtMX2hRJvt9uhctKvXO4R2tUNq9zeCEx6NZmc7Vk7fH=HO8eA@mail.gmail.com> <29003A7C-745D-4A06-8558-AE64310813EA@distal.com> <42346193-AD06-4D26-B0C6-4392953D21A3@gmail.com> <E6C615C1-E9D2-4F0D-8DC2-710BAAF10954@distal.com> <E85B00B1-7205-486D-800C-E6837780E819@gmail.com>

> On Sep 6, 2024, at 15:16, Wes Morgan <morganw@gmail.com> wrote:
>
> You probably don't want that. You will have to use the glabel dev,
> which will not be the same size as your other devices. IIRC you have
> no control over what device node the system finds first for the pool.
> Even if you use GPT labels, the daXpY device will still exist.

Right. But if I don’t _use_ those device names, it won’t matter. If I
use /dev/label/foo, or /dev/gpt/foo, I’ll just always use those. I just
did that with the UFS disk I have, since it moved names; now it’s
“/dev/ufs/drive12” in /etc/fstab et al.

I want to have some sort of label. I’d rather not have to add a
partitioning scheme to the disk just to get a label when I know I’m
going to use the whole disk, but I suppose if I have to I can. Though
I’d have to do it one disk at a time. :-)

>> The former da3 is off-line, out of the chassis. I replaced a disk in
>> a full chassis, having them both online at the same time is not
>> possible. That drive in ZFS’s mind is only faulted because I tried
>> “zpool offline -f” on it to see if that helped.
>
> It sounds like you have replaced the wrong device. Check the "zpool
> history" to see what you did.
>
> In your earlier message, three devices were shown in each raidz, when
> what you should be seeing is that one raidz has an offline device
> identified by guid and maybe "was /dev/da3" that is being replaced,
> along with the replacement device. I don't see any of that.

History attached. There is no replacement device (sub-vdev) until after
the “zpool replace” starts, which it won’t.

>> I didn’t initiate a replace until after the disks were physically
>> changed. Although in this conversation realize that things likely
>> got confused by the replacement in the kernel’s mind of da3 with
>> what used to be da4. :-/
>
> This is why your zpool history will be helpful. What did you actually
> try to replace, and what did you mean to replace.

All of my history since the previous boot in May:

2024-09-05.09:40:14 zpool offline tank da3
2024-09-05.14:26:44 zpool import -c /etc/zfs/zpool.cache -a -N
2024-09-05.14:32:45 zpool import -c /etc/zfs/zpool.cache -a -N
2024-09-05.14:52:18 zpool offline tank da3
2024-09-05.14:53:51 zpool offline tank da3
2024-09-05.14:59:43 zpool offline -f tank da3
2024-09-05.15:02:53 zpool clear tank
2024-09-05.15:07:41 zpool online tank da3
2024-09-05.15:10:00 zpool add tank spare da10
2024-09-05.15:10:20 zpool offline -f tank da3
2024-09-05.15:35:23 zpool remove tank da10
2024-09-05.15:54:35 zpool scrub tank
2024-09-05.16:01:12 zpool set autoreplace=on tank
2024-09-05.16:01:24 zpool set autoexpand=on tank
2024-09-05.16:02:16 zpool add -o ashift=9 tank spare da10
2024-09-06.10:10:20 zpool remove tank da10

So, I offline’d the disk-to-be-replaced at 09:40 yesterday, then I shut
the system down, removed that physical device, replaced it with a larger
disk, and rebooted. I suspect the “offline”s after that are me
experimenting when it was telling me it couldn’t start the replace
action I was asking for.

I started the scrub yesterday only because the replace says something
about an operation in progress. It completed with no issues, but
nothing changed w.r.t. my current problem.

I’m pretty sure the problem here is that the old da3 went away, and a
new da3 came online as a member of raidz1-1. The new disk I added came
online as da10, for some reason. I had to resolve the issue of the UFS
disk which used to be da10 now being da9, but that was easy enough.
Just unexpected.

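For anyone who wants to check the history above without reading it line
by line, here is a rough sketch that counts how often each zpool
subcommand shows up. The HISTORY variable and the count_subcommand
helper are names made up for this illustration; the data is just the
history pasted in this message. As far as I know, "zpool history" only
records commands that actually changed the pool, so a count of zero for
"replace" would be consistent with the replace never having started.

```sh
#!/bin/sh
# History entries copied from this message (timestamp, then the command).
HISTORY='2024-09-05.09:40:14 zpool offline tank da3
2024-09-05.14:26:44 zpool import -c /etc/zfs/zpool.cache -a -N
2024-09-05.14:32:45 zpool import -c /etc/zfs/zpool.cache -a -N
2024-09-05.14:52:18 zpool offline tank da3
2024-09-05.14:53:51 zpool offline tank da3
2024-09-05.14:59:43 zpool offline -f tank da3
2024-09-05.15:02:53 zpool clear tank
2024-09-05.15:07:41 zpool online tank da3
2024-09-05.15:10:00 zpool add tank spare da10
2024-09-05.15:10:20 zpool offline -f tank da3
2024-09-05.15:35:23 zpool remove tank da10
2024-09-05.15:54:35 zpool scrub tank
2024-09-05.16:01:12 zpool set autoreplace=on tank
2024-09-05.16:01:24 zpool set autoexpand=on tank
2024-09-05.16:02:16 zpool add -o ashift=9 tank spare da10
2024-09-06.10:10:20 zpool remove tank da10'

# count_subcommand NAME (hypothetical helper): print how many logged
# entries ran "zpool NAME". Field 2 is "zpool", field 3 the subcommand.
count_subcommand() {
    printf '%s\n' "$HISTORY" |
        awk -v want="$1" '$2 == "zpool" && $3 == want { n++ } END { print n + 0 }'
}

# Summarize the subcommands relevant to the replace attempt.
for s in offline online replace; do
    printf '%s: %s\n' "$s" "$(count_subcommand "$s")"
done
```

On a live system the same filter could presumably be fed from
"zpool history tank" instead of the pasted text.
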
 - Chris
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E93A9CA8-6705-4C26-9F33-B620A365F4BD>