Date:      Thu, 8 Feb 2024 11:20:23 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        Andriy Gapon <avg@freebsd.org>, src-committers@freebsd.org,  dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject:   Re: git: e4ab361e5394 - main - fix poweroff regression from 9cdf326b4f by delaying shutdown_halt
Message-ID:  <CANCZdfpzVn=Ho4dXwcu2qcyZie1FHmRMNLCFGuFPYwMPT2zUeA@mail.gmail.com>
In-Reply-To: <175dce9b-ee44-434c-b6b2-20717a04f6aa@FreeBSD.org>
References:  <72def5a9-ffcc-4dcc-9b85-875ba7f46539@FreeBSD.org> <175dce9b-ee44-434c-b6b2-20717a04f6aa@FreeBSD.org>

Hey John,

On Thu, Feb 8, 2024 at 10:52 AM John Baldwin <jhb@freebsd.org> wrote:

> On 2/6/24 2:13 AM, Andriy Gapon wrote:
> > On 06/02/2024 11:41, Andriy Gapon wrote:
> >> The branch main has been updated by avg:
> >>
> >> URL: https://cgit.FreeBSD.org/src/commit/?id=e4ab361e53945a6c3e9d68c5e5ffc11de40a35f2
> >>
> >> commit e4ab361e53945a6c3e9d68c5e5ffc11de40a35f2
> >> Author:     Andriy Gapon <avg@FreeBSD.org>
> >> AuthorDate: 2024-02-06 08:55:13 +0000
> >> Commit:     Andriy Gapon <avg@FreeBSD.org>
> >> CommitDate: 2024-02-06 08:55:13 +0000
> >>
> >>       fix poweroff regression from 9cdf326b4f by delaying shutdown_halt
> >>
> >>       The regression affected ACPI-based systems without EFI poweroff
> >>       support (including VMs).
> >>
> >>       The key reason for the regression is that I overlooked that
> >>       poweroff is requested by RB_POWEROFF | RB_HALT combination of
> >>       flags.  In my opinion, that command is a bit bipolar, but since
> >>       we've been doing that forever, then so be it.  Because of that
> >>       flag combination, the order of shutdown_final handlers that
> >>       check for either flag does matter.
> >>
> >>       Some additional complexity comes from platform-specific
> >>       shutdown_final handlers that aim to handle multiple reboot
> >>       options at once.  E.g., acpi_shutdown_final handles both
> >>       poweroff and reboot / reset.  As explained in 9cdf326b4f, such
> >>       a handler must run after shutdown_panic to give it a chance.
> >>       But as the change revealed, the handler must also run before
> >>       shutdown_halt, so that the system can actually power off
> >>       before entering the halt limbo.
> >>
> >>       Previously, shutdown_panic and shutdown_halt had the same
> >>       priority which appears to be incompatible with handlers that
> >>       can do both poweroff and reset.
> >
> > I want to add that having many handlers with priorities expressed like
> > SHUTDOWN_PRI_LAST ± N while some of those handlers have implicit
> > inter-dependencies (interactions, interference) also does not help to
> > see a clear picture.
> >
> > Perhaps it would be better to handle all (reasonable) RB flag
> > combinations centrally in kern_reboot and then dispatch events like
> > shutdown_reset, shutdown_poweroff, etc.  Handlers for those events would
> > have a single and simple job of performing that one action (perhaps
> > failing and letting another handler try).
>
> I think having separate eventhandlers for shutdown, reset, and poweroff
> seems sensible.  It also permits a given driver to use different
> priorities (maybe it wants to be first for poweroff but last for reset,
> etc.)
>

I'd come to this conclusion as well. The handlers shouldn't even look at
the flags, IMHO. We can easily create a hierarchy of power cycle > reset >
power off > halt with power unchanged, and call the handlers in that
order, letting individual drivers duke it out.
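
A minimal userland sketch of that shape, not kernel code. Every name here
(the DEMO_* bits, demo_register, demo_classify, demo_dispatch, and the
sample handlers) is hypothetical, and the bit values are placeholders
rather than a copy of <sys/reboot.h>. The precedence in demo_classify is
an assumption: reset has no bit of its own (RB_AUTOBOOT is 0), so the
sketch treats it as the default when no other action bit is set, and it
lets the poweroff bit win over the halt bit so that the
RB_POWEROFF | RB_HALT combination maps to a single action.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bits for this sketch only (assumption, not <sys/reboot.h>). */
#define DEMO_RB_HALT        0x0008
#define DEMO_RB_POWEROFF    0x4000
#define DEMO_RB_POWERCYCLE  0x400000

/* One event per action; handlers register per event and never see howto. */
enum demo_action {
	DEMO_POWERCYCLE, DEMO_RESET, DEMO_POWEROFF, DEMO_HALT, DEMO_NACTIONS
};

typedef int (*demo_handler_t)(void);	/* returns 0 if the action was done */

#define DEMO_MAX_HANDLERS 8

struct demo_entry {
	demo_handler_t fn;
	int pri;			/* lower value runs first */
};

static struct demo_entry demo_tab[DEMO_NACTIONS][DEMO_MAX_HANDLERS];
static size_t demo_cnt[DEMO_NACTIONS];
static demo_handler_t demo_last_ok;	/* handler that last succeeded */

/* Insert a handler, keeping each per-action list sorted by priority. */
static void
demo_register(enum demo_action a, demo_handler_t fn, int pri)
{
	size_t i = demo_cnt[a];

	assert(i < DEMO_MAX_HANDLERS);
	while (i > 0 && demo_tab[a][i - 1].pri > pri) {
		demo_tab[a][i] = demo_tab[a][i - 1];
		i--;
	}
	demo_tab[a][i].fn = fn;
	demo_tab[a][i].pri = pri;
	demo_cnt[a]++;
}

/* Decode howto exactly once, in one place. */
static enum demo_action
demo_classify(int howto)
{
	if (howto & DEMO_RB_POWERCYCLE)
		return (DEMO_POWERCYCLE);
	if (howto & DEMO_RB_POWEROFF)	/* also covers POWEROFF | HALT */
		return (DEMO_POWEROFF);
	if (howto & DEMO_RB_HALT)
		return (DEMO_HALT);
	return (DEMO_RESET);		/* no action bit set */
}

/* Try the handlers for the chosen action in order until one succeeds. */
static int
demo_dispatch(int howto)
{
	enum demo_action a = demo_classify(howto);

	for (size_t i = 0; i < demo_cnt[a]; i++) {
		if (demo_tab[a][i].fn() == 0) {
			demo_last_ok = demo_tab[a][i].fn;
			return (0);
		}
	}
	return (-1);			/* nobody could perform the action */
}

/* Sample handlers: one that fails, two that succeed. */
static int demo_efi_poweroff(void) { return (-1); }
static int demo_acpi_poweroff(void) { return (0); }
static int demo_acpi_reset(void) { return (0); }
```

With this layout a driver can register demo_acpi_poweroff late for
DEMO_POWEROFF and demo_acpi_reset early for DEMO_RESET, which is the
"different priorities per event" property John mentions, and a handler
that fails simply lets the next one try.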


> > Also, I would split reboot howto into command and flag portions, so that
> > only one command can be specified at a time.  E.g., I would consider
> > RB_AUTOBOOT ("RB_REBOOT"), RB_POWEROFF, RB_HALT to be distinct commands.
> > Then, flags like RB_NOSYNC or RB_DUMP could be optional flags.
> >
> > As an aside, some flags documented for reboot(2) do not seem to have
> > much to do with reboot.  E.g., RB_DFLTROOT affects how a system boots up,
> > but not how the system goes for a reboot.  Not surprisingly, that option
> > is not handled by anything kicked off with reboot(2).
> > Maybe, it would make more sense if we had fast reboot support and the
> > running kernel could instruct the next kernel directly.  But, it's still
> > a bit weird that flags like RB_POWEROFF and RB_DFLTROOT belong in the
> > same domain and can be set together.
>
> I would suggest deprecating flags that are no-ops.  In modern systems if
> you want to control the next boot you do it via other means (nextboot,
> efibootmgr, etc.) and reboot(2) is not a good API for that.
>

Part of the problem is that they aren't no-ops. We use the same howto
flags in early boot that we use for reboot. There the flags mean
something: the boot loader passes them in, and in that case they still do
something. This dates, as near as I can tell, to the VAX and other early
Unix machines being able to pass a word (and maybe a little more) from
one kernel to the next, a feature that's fallen out of fashion.


> It might be hard to fully cleanup some of the hackiness here, but if you
> can at least isolate the flag weirdness handling in kern_reboot by having
> the more specific eventhandlers then that might fix most of the ugliness.
>

Yea, I think we should isolate the drivers from looking at 'howto' and
have separate handlers for the following cases: power cycle, power off,
reset, and halt. I agree that some of the features that were hung on this
word should be torn down and only done via boot next or possibly from the
boot loader -> kernel handoff.

Now, what do we do with the 'reboot' system call? It seems like we should
maybe rework it in some way?
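
One possible direction, sketched in userland C under Andriy's suggestion
of splitting howto into a command plus optional flags. Nothing below is
an existing interface: demo_cmd, the DEMO_F_* flags, demo_request, and
demo_check_request are all hypothetical names, and the set of commands
and flags is just the subset mentioned in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Exactly one command per request (hypothetical names). */
enum demo_cmd {
	DEMO_CMD_REBOOT,
	DEMO_CMD_POWEROFF,
	DEMO_CMD_HALT,
	DEMO_CMD_POWERCYCLE,
	DEMO_CMD_MAX
};

/* Optional modifiers, independent of the command (hypothetical values). */
#define DEMO_F_NOSYNC	0x1u
#define DEMO_F_DUMP	0x2u
#define DEMO_F_ALL	(DEMO_F_NOSYNC | DEMO_F_DUMP)

struct demo_request {
	enum demo_cmd cmd;
	unsigned int flags;
};

/*
 * Validate a (command, flags) pair.  Because the command is a single
 * enumerated value, combinations such as "poweroff and halt" cannot be
 * expressed at all; unknown flag bits are rejected rather than ignored.
 * Returns 0 on success or EINVAL.
 */
static int
demo_check_request(int cmd, unsigned int flags, struct demo_request *out)
{
	assert(out != NULL);
	if (cmd < 0 || cmd >= DEMO_CMD_MAX)
		return (EINVAL);
	if ((flags & ~DEMO_F_ALL) != 0)
		return (EINVAL);
	out->cmd = (enum demo_cmd)cmd;
	out->flags = flags;
	return (0);
}
```

Boot-time-only bits such as RB_DFLTROOT have no place in this request at
all, which matches the idea of leaving them to the loader -> kernel
handoff.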

Warner
