Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 17 Apr 2023 19:16:01 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Rick Macklem <rick.macklem@gmail.com>
Cc:        Cy Schubert <Cy.Schubert@cschubert.com>, Pawel Jakub Dawidek <pjd@freebsd.org>,  Mateusz Guzik <mjguzik@gmail.com>, FreeBSD Current <freebsd-current@freebsd.org>,  Glen Barber <gjb@freebsd.org>
Subject:   Re: another crash and going forward with zfs
Message-ID:  <CANCZdfqT14tJW5kcgp5yD3NuD3yDD7GWUMkNTBKJLqsrh0KfkA@mail.gmail.com>
In-Reply-To: <CAM5tNy7GoSuWZMKdmUeSWG241FbdEQXxSj6aW7qirk%2Bfk8AZKg@mail.gmail.com>
References:  <CAGudoHH8vurcn4ydavi-xkGHYA6DVfOQF1mEEXkwPvGUTjKZNA@mail.gmail.com> <48e02888-c49f-ab2b-fc2d-ad6db6f0e10b@dawidek.net> <CAGudoHEWFNcdrFcK30wLSN8%2B56%2BK4CfqwUDsvb1%2BZwS1Gt4NXg@mail.gmail.com> <b57b06bd-7e73-ae2d-2fba-bd226883ff34@dawidek.net> <20230417232859.18262E2@slippy.cwsent.com> <CAM5tNy7GoSuWZMKdmUeSWG241FbdEQXxSj6aW7qirk%2Bfk8AZKg@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
--000000000000a9a06505f9920d11
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Mon, Apr 17, 2023, 5:37 PM Rick Macklem <rick.macklem@gmail.com> wrote:

> On Mon, Apr 17, 2023 at 4:29=E2=80=AFPM Cy Schubert <Cy.Schubert@cschuber=
t.com>
> wrote:
> >
> > In message <b57b06bd-7e73-ae2d-2fba-bd226883ff34@dawidek.net>, Pawel
> Jakub
> > Dawi
> > dek writes:
> > > On 4/18/23 05:14, Mateusz Guzik wrote:
> > > > On 4/17/23, Pawel Jakub Dawidek <pjd@freebsd.org> wrote:
> > > >> Correct me if I'm wrong, but from my understanding there were zero
> > > >> problems with block cloning when it wasn't in use or now disabled.
> > > >>
> > > >> The reason I've introduced vfs.zfs.bclone_enabled sysctl, was to
> exactly
> > > >> avoid mess like this and give us more time to sort all the problem=
s
> out
> > > >> while making it easy for people to try it.
> > > >>
> > > >> If there is no plan to revert the whole import, I don't see what
> value
> > > >> removing just block cloning will bring if it is now disabled by
> default
> > > >> and didn't cause any problems when disabled.
> > > >>
> > > >
> > > > The feature definitely was not properly stress tested and what not
> and
> > > > trying to do it keeps running into panics. Given the complexity of
> the
> > > > feature I would expect there are many bug lurking, some of which
> > > > possibly related to the on disk format. Not having to deal with any
> of
> > > > this is can be arranged as described above and is imo the most
> > > > sensible route given the timeline for 14.0
> > >
> > > Block cloning doesn't create, remove or modify any on-disk data until
> it
> > > is in use.
> > >
> > > Again, if we are not going to revert the whole merge, I see no point =
in
> > > reverting block cloning as until it is enabled, its code is not
> > > executed. This allow people who upgraded the pools to do nothing
> special
> > > and it will allow people to test it easily.
> >
> > In this case zpool upgrade and zpool status should return no feature
> > upgrades are available instead of enticing users to zpool upgrade. The
> > userland zpool command should test for this sysctl and print nothing
> > regarding block_cloning. I can see a scenario when a user zpool upgrade=
s
> > their pools, notices the sysctl and does the unthinkable. Not only woul=
d
> > this fill the mailing lists with angry chatter but it would spawn a
> number
> > of PRs plus give us a lot of bad press for data loss.
> >
> > Should we keep the new ZFS in 14, we should:
> >
> > 1. Make sure that zpool(8) does not mention or offer block_cloning in a=
ny
> > way if the sysctl is disabled.
> >
> > 2. Print a cautionary note in release notes advising people not to enab=
le
> > this experimental sysctl. Maybe even have it print "(experimental)" to
> warn
> > users that it will hurt.
> >
> > 3. Update the man pages to caution that block_cloning is experimental a=
nd
> > unstable.
> I would suggest going a step further and making the sysctl RO for
> FreeBSD14.
> (This could be changed for FreeBSD14.n if/when block_cloning is believed =
to
>  be debugged.)
>
> I would apply all 3 of the above to "main", since some that install "main=
"
> will not know how "bleeding edge" this is unless the above is done.
> (Yes, I know "main" is "bleeding edge", but some still expect a stable
>  test system will result from installing it.)
>
> Thanks go to all that tracked this problem down, rick
>

Related question: what zfs branch is stable/14 going to track? With 13 it
was whatever the next stable branch was.

Warner


>
> > It's not enough to have a sysctl without hiding block_cloning completel=
y
> > from view. Only expose it in zpool(8) when the sysctl is enabled. Let's
> > avoid people mistakenly enabling it.
> >
> >
> > --
> > Cheers,
> > Cy Schubert <Cy.Schubert@cschubert.com>
> > FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
> > NTP:           <cy@nwtime.org>    Web:  https://nwtime.org
> >
> >                         e^(i*pi)+1=3D0
> >
> >
> >
>
>

--000000000000a9a06505f9920d11
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"auto"><div><br><br><div class=3D"gmail_quote"><div dir=3D"ltr" =
class=3D"gmail_attr">On Mon, Apr 17, 2023, 5:37 PM Rick Macklem &lt;<a href=
=3D"mailto:rick.macklem@gmail.com">rick.macklem@gmail.com</a>&gt; wrote:<br=
></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-=
left:1px #ccc solid;padding-left:1ex">On Mon, Apr 17, 2023 at 4:29=E2=80=AF=
PM Cy Schubert &lt;<a href=3D"mailto:Cy.Schubert@cschubert.com" target=3D"_=
blank" rel=3D"noreferrer">Cy.Schubert@cschubert.com</a>&gt; wrote:<br>
&gt;<br>
&gt; In message &lt;<a href=3D"mailto:b57b06bd-7e73-ae2d-2fba-bd226883ff34@=
dawidek.net" target=3D"_blank" rel=3D"noreferrer">b57b06bd-7e73-ae2d-2fba-b=
d226883ff34@dawidek.net</a>&gt;, Pawel Jakub<br>
&gt; Dawi<br>
&gt; dek writes:<br>
&gt; &gt; On 4/18/23 05:14, Mateusz Guzik wrote:<br>
&gt; &gt; &gt; On 4/17/23, Pawel Jakub Dawidek &lt;<a href=3D"mailto:pjd@fr=
eebsd.org" target=3D"_blank" rel=3D"noreferrer">pjd@freebsd.org</a>&gt; wro=
te:<br>
&gt; &gt; &gt;&gt; Correct me if I&#39;m wrong, but from my understanding t=
here were zero<br>
&gt; &gt; &gt;&gt; problems with block cloning when it wasn&#39;t in use or=
 now disabled.<br>
&gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt;&gt; The reason I&#39;ve introduced vfs.zfs.bclone_enabled sy=
sctl, was to exactly<br>
&gt; &gt; &gt;&gt; avoid mess like this and give us more time to sort all t=
he problems out<br>
&gt; &gt; &gt;&gt; while making it easy for people to try it.<br>
&gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt;&gt; If there is no plan to revert the whole import, I don&#3=
9;t see what value<br>
&gt; &gt; &gt;&gt; removing just block cloning will bring if it is now disa=
bled by default<br>
&gt; &gt; &gt;&gt; and didn&#39;t cause any problems when disabled.<br>
&gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; The feature definitely was not properly stress tested and wh=
at not and<br>
&gt; &gt; &gt; trying to do it keeps running into panics. Given the complex=
ity of the<br>
&gt; &gt; &gt; feature I would expect there are many bug lurking, some of w=
hich<br>
&gt; &gt; &gt; possibly related to the on disk format. Not having to deal w=
ith any of<br>
&gt; &gt; &gt; this is can be arranged as described above and is imo the mo=
st<br>
&gt; &gt; &gt; sensible route given the timeline for 14.0<br>
&gt; &gt;<br>
&gt; &gt; Block cloning doesn&#39;t create, remove or modify any on-disk da=
ta until it<br>
&gt; &gt; is in use.<br>
&gt; &gt;<br>
&gt; &gt; Again, if we are not going to revert the whole merge, I see no po=
int in<br>
&gt; &gt; reverting block cloning as until it is enabled, its code is not<b=
r>
&gt; &gt; executed. This allow people who upgraded the pools to do nothing =
special<br>
&gt; &gt; and it will allow people to test it easily.<br>
&gt;<br>
&gt; In this case zpool upgrade and zpool status should return no feature<b=
r>
&gt; upgrades are available instead of enticing users to zpool upgrade. The=
<br>
&gt; userland zpool command should test for this sysctl and print nothing<b=
r>
&gt; regarding block_cloning. I can see a scenario when a user zpool upgrad=
es<br>
&gt; their pools, notices the sysctl and does the unthinkable. Not only wou=
ld<br>
&gt; this fill the mailing lists with angry chatter but it would spawn a nu=
mber<br>
&gt; of PRs plus give us a lot of bad press for data loss.<br>
&gt;<br>
&gt; Should we keep the new ZFS in 14, we should:<br>
&gt;<br>
&gt; 1. Make sure that zpool(8) does not mention or offer block_cloning in =
any<br>
&gt; way if the sysctl is disabled.<br>
&gt;<br>
&gt; 2. Print a cautionary note in release notes advising people not to ena=
ble<br>
&gt; this experimental sysctl. Maybe even have it print &quot;(experimental=
)&quot; to warn<br>
&gt; users that it will hurt.<br>
&gt;<br>
&gt; 3. Update the man pages to caution that block_cloning is experimental =
and<br>
&gt; unstable.<br>
I would suggest going a step further and making the sysctl RO for FreeBSD14=
.<br>
(This could be changed for FreeBSD14.n if/when block_cloning is believed to=
<br>
=C2=A0be debugged.)<br>
<br>
I would apply all 3 of the above to &quot;main&quot;, since some that insta=
ll &quot;main&quot;<br>
will not know how &quot;bleeding edge&quot; this is unless the above is don=
e.<br>
(Yes, I know &quot;main&quot; is &quot;bleeding edge&quot;, but some still =
expect a stable<br>
=C2=A0test system will result from installing it.)<br>
<br>
Thanks go to all that tracked this problem down, rick<br></blockquote></div=
></div><div dir=3D"auto"><br></div><div dir=3D"auto">Related question: what=
 zfs branch is stable/14 going to track? With 13 it was whatever the next s=
table branch was.</div><div dir=3D"auto"><br></div><div dir=3D"auto">Warner=
</div><div dir=3D"auto"><br></div><div dir=3D"auto"><br></div><div dir=3D"a=
uto"><div class=3D"gmail_quote"><blockquote class=3D"gmail_quote" style=3D"=
margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
&gt;<br>
&gt; It&#39;s not enough to have a sysctl without hiding block_cloning comp=
letely<br>
&gt; from view. Only expose it in zpool(8) when the sysctl is enabled. Let&=
#39;s<br>
&gt; avoid people mistakenly enabling it.<br>
&gt;<br>
&gt;<br>
&gt; --<br>
&gt; Cheers,<br>
&gt; Cy Schubert &lt;<a href=3D"mailto:Cy.Schubert@cschubert.com" target=3D=
"_blank" rel=3D"noreferrer">Cy.Schubert@cschubert.com</a>&gt;<br>
&gt; FreeBSD UNIX:=C2=A0 &lt;cy@FreeBSD.org&gt;=C2=A0 =C2=A0Web:=C2=A0 <a h=
ref=3D"https://FreeBSD.org" rel=3D"noreferrer noreferrer" target=3D"_blank"=
>https://FreeBSD.org</a><br>;
&gt; NTP:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0&lt;<a href=3D"mailto:cy@=
nwtime.org" target=3D"_blank" rel=3D"noreferrer">cy@nwtime.org</a>&gt;=C2=
=A0 =C2=A0 Web:=C2=A0 <a href=3D"https://nwtime.org" rel=3D"noreferrer nore=
ferrer" target=3D"_blank">https://nwtime.org</a><br>;
&gt;<br>
&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0e^(i*pi)+1=3D0<br>
&gt;<br>
&gt;<br>
&gt;<br>
<br>
</blockquote></div></div></div>

--000000000000a9a06505f9920d11--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfqT14tJW5kcgp5yD3NuD3yDD7GWUMkNTBKJLqsrh0KfkA>