Date: Mon, 17 Apr 2023 19:16:01 -0600
From: Warner Losh <imp@bsdimp.com>
To: Rick Macklem <rick.macklem@gmail.com>
Cc: Cy Schubert <Cy.Schubert@cschubert.com>, Pawel Jakub Dawidek <pjd@freebsd.org>, Mateusz Guzik <mjguzik@gmail.com>, FreeBSD Current <freebsd-current@freebsd.org>, Glen Barber <gjb@freebsd.org>
Subject: Re: another crash and going forward with zfs
Message-ID: <CANCZdfqT14tJW5kcgp5yD3NuD3yDD7GWUMkNTBKJLqsrh0KfkA@mail.gmail.com>
In-Reply-To: <CAM5tNy7GoSuWZMKdmUeSWG241FbdEQXxSj6aW7qirk%2Bfk8AZKg@mail.gmail.com>
References: <CAGudoHH8vurcn4ydavi-xkGHYA6DVfOQF1mEEXkwPvGUTjKZNA@mail.gmail.com> <48e02888-c49f-ab2b-fc2d-ad6db6f0e10b@dawidek.net> <CAGudoHEWFNcdrFcK30wLSN8%2B56%2BK4CfqwUDsvb1%2BZwS1Gt4NXg@mail.gmail.com> <b57b06bd-7e73-ae2d-2fba-bd226883ff34@dawidek.net> <20230417232859.18262E2@slippy.cwsent.com> <CAM5tNy7GoSuWZMKdmUeSWG241FbdEQXxSj6aW7qirk%2Bfk8AZKg@mail.gmail.com>
[-- Attachment #1 --]
On Mon, Apr 17, 2023, 5:37 PM Rick Macklem <rick.macklem@gmail.com> wrote:

> On Mon, Apr 17, 2023 at 4:29 PM Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> >
> > In message <b57b06bd-7e73-ae2d-2fba-bd226883ff34@dawidek.net>, Pawel Jakub Dawidek writes:
> > > On 4/18/23 05:14, Mateusz Guzik wrote:
> > > > On 4/17/23, Pawel Jakub Dawidek <pjd@freebsd.org> wrote:
> > > >> Correct me if I'm wrong, but from my understanding there were zero
> > > >> problems with block cloning when it wasn't in use or now disabled.
> > > >>
> > > >> The reason I've introduced vfs.zfs.bclone_enabled sysctl, was to exactly
> > > >> avoid mess like this and give us more time to sort all the problems out
> > > >> while making it easy for people to try it.
> > > >>
> > > >> If there is no plan to revert the whole import, I don't see what value
> > > >> removing just block cloning will bring if it is now disabled by default
> > > >> and didn't cause any problems when disabled.
> > > >>
> > > >
> > > > The feature definitely was not properly stress tested and what not and
> > > > trying to do it keeps running into panics. Given the complexity of the
> > > > feature I would expect there are many bug lurking, some of which
> > > > possibly related to the on disk format. Not having to deal with any of
> > > > this is can be arranged as described above and is imo the most
> > > > sensible route given the timeline for 14.0
> > >
> > > Block cloning doesn't create, remove or modify any on-disk data until it
> > > is in use.
> > >
> > > Again, if we are not going to revert the whole merge, I see no point in
> > > reverting block cloning as until it is enabled, its code is not
> > > executed. This allow people who upgraded the pools to do nothing special
> > > and it will allow people to test it easily.
> >
> > In this case zpool upgrade and zpool status should return no feature
> > upgrades are available instead of enticing users to zpool upgrade. The
> > userland zpool command should test for this sysctl and print nothing
> > regarding block_cloning. I can see a scenario when a user zpool upgrades
> > their pools, notices the sysctl and does the unthinkable. Not only would
> > this fill the mailing lists with angry chatter but it would spawn a number
> > of PRs plus give us a lot of bad press for data loss.
> >
> > Should we keep the new ZFS in 14, we should:
> >
> > 1. Make sure that zpool(8) does not mention or offer block_cloning in any
> > way if the sysctl is disabled.
> >
> > 2. Print a cautionary note in release notes advising people not to enable
> > this experimental sysctl. Maybe even have it print "(experimental)" to warn
> > users that it will hurt.
> >
> > 3. Update the man pages to caution that block_cloning is experimental and
> > unstable.
> I would suggest going a step further and making the sysctl RO for FreeBSD14.
> (This could be changed for FreeBSD14.n if/when block_cloning is believed to
> be debugged.)
>
> I would apply all 3 of the above to "main", since some that install "main"
> will not know how "bleeding edge" this is unless the above is done.
> (Yes, I know "main" is "bleeding edge", but some still expect a stable
> test system will result from installing it.)
>
> Thanks go to all that tracked this problem down, rick
>

Related question: what zfs branch is stable/14 going to track? With 13 it
was whatever the next stable branch was.

Warner

> >
> > It's not enough to have a sysctl without hiding block_cloning completely
> > from view. Only expose it in zpool(8) when the sysctl is enabled. Let's
> > avoid people mistakenly enabling it.
> >
> >
> > --
> > Cheers,
> > Cy Schubert <Cy.Schubert@cschubert.com>
> > FreeBSD UNIX: <cy@FreeBSD.org> Web: https://FreeBSD.org
> > NTP: <cy@nwtime.org> Web: https://nwtime.org
> >
> > e^(i*pi)+1=0
> >
> >
> >
>
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfqT14tJW5kcgp5yD3NuD3yDD7GWUMkNTBKJLqsrh0KfkA>
