Date:      Fri, 6 Sep 2024 15:34:36 -0400
From:      Chris Ross <cross+freebsd@distal.com>
To:        Wes Morgan <morganw@gmail.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Unable to replace drive in raidz1
Message-ID:  <E93A9CA8-6705-4C26-9F33-B620A365F4BD@distal.com>
In-Reply-To: <E85B00B1-7205-486D-800C-E6837780E819@gmail.com>
References:  <5ED5CB56-2E2A-4D83-8CDA-6D6A0719ED19@distal.com> <AC67D073-D476-41F5-AC53-F671430BB493@distal.com> <CAOtMX2h52d0vtceuwcDk2dzkH-fZW32inhk-dfjLMJxetVXKYg@mail.gmail.com> <CB79EC2B-E793-4561-95E7-D1CEEEFC1D72@distal.com> <CAOtMX2i_zFYuOnEK_aVkpO_M8uJCvGYW%2BSzLn3OED4n5fKFoEA@mail.gmail.com> <6A20ABDA-9BEA-4526-94C1-5768AA564C13@distal.com> <CAOtMX2jfcd43sBpHraWA=5e_Ka=hMD654m-5=boguPPbYXE4yw@mail.gmail.com> <0CF1E2D7-6C82-4A8B-82C3-A5BF1ED939CF@distal.com> <CAOtMX2hRJvt9uhctKvXO4R2tUNq9zeCEx6NZmc7Vk7fH=HO8eA@mail.gmail.com> <29003A7C-745D-4A06-8558-AE64310813EA@distal.com> <42346193-AD06-4D26-B0C6-4392953D21A3@gmail.com> <E6C615C1-E9D2-4F0D-8DC2-710BAAF10954@distal.com> <E85B00B1-7205-486D-800C-E6837780E819@gmail.com>


> On Sep 6, 2024, at 15:16, Wes Morgan <morganw@gmail.com> wrote:
>
> You probably don't want that. You will have to use the glabel dev,
> which will not be the same size as your other devices. IIRC you have no
> control over what device node the system finds first for the pool. Even
> if you use GPT labels, the daXpY device will still exist.

Right.  But if I don’t _use_ those device names, it won’t matter.  If I
use /dev/label/foo or /dev/gpt/foo, I’ll just always use those.  I just
did that with the UFS disk I have, since its device name moved: it’s now
"/dev/ufs/drive12" in /etc/fstab et al.

I want to have some sort of label.  I’d rather not add a partitioning
scheme just to get a label when I know I’m going to use the whole disk,
but I suppose I can if I have to.  Though I’d have to do it one disk at
a time.  :-)
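
For reference, the two labelling routes would look roughly like this.
This is an untested sketch that only prints the commands rather than
running them; the label name "tank-d3" is made up, and da10 is simply
where the new disk shows up here right now.

```shell
#!/bin/sh
# Sketch only: prints the commands for each labelling route, runs nothing.
# The disk and label names passed in are illustrative assumptions.
plan_labels() {
    disk="$1"
    label="$2"

    # Route 1: glabel on the whole disk. The label metadata lives in the
    # last sector, so /dev/label/<label> is one sector smaller than the disk.
    echo "glabel label ${label} ${disk}"

    # Route 2: a GPT scheme with one labelled ZFS partition, which shows up
    # as /dev/gpt/<label> (the ${disk}p1 node still exists alongside it).
    echo "gpart create -s gpt ${disk}"
    echo "gpart add -t freebsd-zfs -l ${label} ${disk}"
}

plan_labels da10 tank-d3
```

Only one of the two routes would be used on any given disk.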

>
>> The former da3 is off-line, out of the chassis.  I replaced a disk in
>> a full chassis, having them both online at the same time is not
>> possible.  That drive in ZFS’s mind is only faulted because I
>> tried “zpool offline -f” on it to see if that helped.
>
> It sounds like you have replaced the wrong device. Check the "zpool
> history" to see what you did.
>
> In your earlier message, three devices were shown in each raidz, when
> what you should be seeing is that one raidz has an offline device
> identified by guid and maybe "was /dev/da3" that is being replaced,
> along with the replacement device. I don't see any of that.

History is included below.  A replacement device (sub-vdev) only appears
once the “zpool replace” starts, and it won’t start.

>> I didn’t initiate a replace until after the disks were
>> physically changed.  Although in this conversation realize that things
>> likely got confused by the replacement in the kernel’s mind of
>> da3 with what used to be da4.  :-/
>
> This is why your zpool history will be helpful. What did you actually
> try to replace, and what did you mean to replace.

Here is all of my history since the previous boot, in May.

2024-09-05.09:40:14 zpool offline tank da3
2024-09-05.14:26:44 zpool import -c /etc/zfs/zpool.cache -a -N
2024-09-05.14:32:45 zpool import -c /etc/zfs/zpool.cache -a -N
2024-09-05.14:52:18 zpool offline tank da3
2024-09-05.14:53:51 zpool offline tank da3
2024-09-05.14:59:43 zpool offline -f tank da3
2024-09-05.15:02:53 zpool clear tank
2024-09-05.15:07:41 zpool online tank da3
2024-09-05.15:10:00 zpool add tank spare da10
2024-09-05.15:10:20 zpool offline -f tank da3
2024-09-05.15:35:23 zpool remove tank da10
2024-09-05.15:54:35 zpool scrub tank
2024-09-05.16:01:12 zpool set autoreplace=on tank
2024-09-05.16:01:24 zpool set autoexpand=on tank
2024-09-05.16:02:16 zpool add -o ashift=9 tank spare da10
2024-09-06.10:10:20 zpool remove tank da10

So, I offlined the disk to be replaced at 09:40 yesterday, then shut the
system down, swapped that physical device for a larger disk, and
rebooted.  I suspect the “offline”s after that are me experimenting when
it was telling me it couldn’t start the replace action I was asking for.
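
As a sanity check that no replace ever got recorded, the history quoted
above can be tallied by subcommand.  A rough sketch, with the log pasted
in as a string (on the live system the input would come from "zpool
history tank" instead); as far as I know the history only records
commands that succeeded, so the failed replace attempts would not show up
in it either way.

```shell
#!/bin/sh
# Sketch: count history entries per zpool subcommand.
# history_text is a copy of the log quoted in this message.
history_text='2024-09-05.09:40:14 zpool offline tank da3
2024-09-05.14:26:44 zpool import -c /etc/zfs/zpool.cache -a -N
2024-09-05.14:32:45 zpool import -c /etc/zfs/zpool.cache -a -N
2024-09-05.14:52:18 zpool offline tank da3
2024-09-05.14:53:51 zpool offline tank da3
2024-09-05.14:59:43 zpool offline -f tank da3
2024-09-05.15:02:53 zpool clear tank
2024-09-05.15:07:41 zpool online tank da3
2024-09-05.15:10:00 zpool add tank spare da10
2024-09-05.15:10:20 zpool offline -f tank da3
2024-09-05.15:35:23 zpool remove tank da10
2024-09-05.15:54:35 zpool scrub tank
2024-09-05.16:01:12 zpool set autoreplace=on tank
2024-09-05.16:01:24 zpool set autoexpand=on tank
2024-09-05.16:02:16 zpool add -o ashift=9 tank spare da10
2024-09-06.10:10:20 zpool remove tank da10'

count_cmd() {
    # $1 = zpool subcommand; prints how many history lines use it.
    printf '%s\n' "$history_text" |
        awk -v want="$1" '$2 == "zpool" && $3 == want { n++ } END { print n + 0 }'
}

echo "replace: $(count_cmd replace)"   # prints "replace: 0"
echo "offline: $(count_cmd offline)"   # prints "offline: 5"
```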

I started the scrub yesterday only because the replace said something
about an operation in progress.  It completed with no issues, but
nothing changed w.r.t. my current problem.

I’m pretty sure the problem here is that the old da3 went away, and a
new da3 came online as a member of raidz1-1.  The new disk I added came
online as da10, for some reason.  I also had to sort out the UFS disk,
which used to be da10 and is now da9, but that was easy enough.  Just
unexpected.
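
If the name clash really is the issue, naming the missing member by its
GUID instead of "da3" might sidestep it.  An untested sketch that only
prints the commands; OLD_GUID is a placeholder, not a real value from my
pool, and whether this gets past the error I am seeing is an assumption.

```shell
#!/bin/sh
# Sketch only: prints the commands rather than running them.
replace_by_guid() {
    pool="$1"
    old_guid="$2"
    new_disk="$3"

    # Show vdev GUIDs in place of device names to find the missing member.
    echo "zpool status -g ${pool}"

    # Name the old member by GUID so the reused "da3" name is not involved.
    echo "zpool replace ${pool} ${old_guid} ${new_disk}"
}

# OLD_GUID is a placeholder; the real value would come from the status output.
replace_by_guid tank OLD_GUID da10
```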

      - Chris
