Date:      Tue, 6 Dec 2022 09:09:36 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Hans Petter Selasky <hps@selasky.org>
Cc:        Warner Losh <imp@freebsd.org>, src-committers@freebsd.org,  dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject:   Re: git: 3cf97e91fac5 - main - Revert "newbus: Change attach failure behavior"
Message-ID:  <CANCZdfo_UK31uRoZ4CPBFqSujsc6SruWwiixVTdC3PzQ1yD2sA@mail.gmail.com>
In-Reply-To: <facb65c0-8a01-6f15-2c51-0e8bb425fd4f@selasky.org>
References:  <202212060209.2B629pnu053879@gitrepo.freebsd.org> <facb65c0-8a01-6f15-2c51-0e8bb425fd4f@selasky.org>

--00000000000064aa5a05ef2b0876
Content-Type: text/plain; charset="UTF-8"

On Tue, Dec 6, 2022 at 3:57 AM Hans Petter Selasky <hps@selasky.org> wrote:

> On 12/6/22 03:09, Warner Losh wrote:
> > The branch main has been updated by imp:
> >
> > URL: https://cgit.FreeBSD.org/src/commit/?id=3cf97e91fac5f53fc0375bc816cc541a8864ffc4
> >
> > commit 3cf97e91fac5f53fc0375bc816cc541a8864ffc4
> > Author:     Warner Losh <imp@FreeBSD.org>
> > AuthorDate: 2022-12-05 23:57:58 +0000
> > Commit:     Warner Losh <imp@FreeBSD.org>
> > CommitDate: 2022-12-06 00:00:26 +0000
> >
> >      Revert "newbus: Change attach failure behavior"
> >
> >      This reverts commit 68c3f0302106643207dcdfe3b414810e245228e5. There are
> >      some weird crashes when KVMs switch caused by this, so revert this
> >      commit until they are sorted out.
> >
> >      Reported by:            cy@
> >      Sponsored by:           Netflix
> > ---
> >   UPDATING            | 2 ++
> >   sys/kern/subr_bus.c | 2 +-
> >   2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/UPDATING b/UPDATING
> > index 099066031b8e..001ec9f6de3a 100644
> > --- a/UPDATING
> > +++ b/UPDATING
> > @@ -43,6 +43,8 @@ NOTE TO PEOPLE WHO THINK THAT FreeBSD 14.x IS SLOW:
> >       needs to use devctl to re-enable the device, and reprobe it (or set
> >       the sysctl/tunable hw.bus.disable_failed_devices=false).
> >
> > +     NOTE: This was reverted 20221205 due to unexpected compatibility issues
> > +
> >   20221122:
> >       pf no longer accepts 'scrub fragment crop' or 'scrub fragment drop-ovl'.
> >       These configurations are no longer automatically reinterpreted as
> > diff --git a/sys/kern/subr_bus.c b/sys/kern/subr_bus.c
> > index 6a5ec4efc38d..b9615b033007 100644
> > --- a/sys/kern/subr_bus.c
> > +++ b/sys/kern/subr_bus.c
> > @@ -69,7 +69,7 @@ SYSCTL_NODE(_hw, OID_AUTO, bus, CTLFLAG_RW | CTLFLAG_MPSAFE, NULL,
> >   SYSCTL_ROOT_NODE(OID_AUTO, dev, CTLFLAG_RW | CTLFLAG_MPSAFE, NULL,
> >       NULL);
> >
> > -static bool disable_failed_devs = true;
> > +static bool disable_failed_devs = false;
> >   SYSCTL_BOOL(_hw_bus, OID_AUTO, disable_failed_devices, CTLFLAG_RWTUN, &disable_failed_devs,
> >       0, "Do not retry attaching devices that return an error from DEVICE_ATTACH the first time");
> >
>
> Thinking about it, this flag shouldn't be set for USB devices and HUBS
> and such. Probably only makes sense for PCI devices, though there is
> something called thunderbolt too, which may fail during probe/attach,
> because the user yanked the device.
>

I think it makes perfect sense for all devices everywhere. When a device goes
away like you say, its device_t will be gone soonish, and this flag will clear
if the device is reinserted in the future. The bus will get a signal for that
yanking and will remove the device_t (though maybe we have a bug in device
deletion when that happens, which is what I suspected when I saw this and a
couple of other tracebacks).
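
The lifecycle described above can be sketched as a toy model in C. This is
an illustration only, not the code in sys/kern/subr_bus.c: struct toy_dev,
toy_add_child(), toy_delete_child(), toy_probe_and_attach() and
toy_scenario() are hypothetical names, and the only things taken from the
commit are the disable_failed_devs knob and its description. The model
assumes the failure flag lives in the per-device object, so it disappears
when the bus deletes that object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a device_t; NOT the real newbus structure. */
struct toy_dev {
	bool attach_failed;	/* set once an attach attempt has failed */
	bool attached;		/* set once an attach attempt has succeeded */
};

/* Models hw.bus.disable_failed_devices (false after the revert). */
static bool disable_failed_devs = false;

/* The bus sees a new device and allocates a fresh, zeroed object. */
static struct toy_dev *
toy_add_child(void)
{
	return (calloc(1, sizeof(struct toy_dev)));
}

/* The bus sees the device go away; the failure flag dies with the object. */
static void
toy_delete_child(struct toy_dev *dev)
{
	free(dev);
}

/*
 * Try to attach a driver. attach_ok simulates what the driver's attach
 * routine would report. Returns true if the device ends up attached.
 */
static bool
toy_probe_and_attach(struct toy_dev *dev, bool attach_ok)
{
	if (disable_failed_devs && dev->attach_failed)
		return (false);	/* failed before: do not retry */
	if (!attach_ok) {
		dev->attach_failed = true;
		return (false);
	}
	dev->attached = true;
	return (true);
}

/* Fail once, retry, then yank and reinsert the device. */
static void
toy_scenario(bool results[3])
{
	struct toy_dev *dev;

	disable_failed_devs = true;	/* the pre-revert default */

	dev = toy_add_child();
	assert(dev != NULL);
	results[0] = toy_probe_and_attach(dev, false);	/* attach fails */
	results[1] = toy_probe_and_attach(dev, true);	/* retry is skipped */
	toy_delete_child(dev);				/* device yanked */

	dev = toy_add_child();				/* device reinserted */
	assert(dev != NULL);
	results[2] = toy_probe_and_attach(dev, true);	/* fresh object */
	toy_delete_child(dev);
}
```

In this model, toy_scenario() fills results with false, false, true: the
retry on the same object is refused, while the reinserted device gets a new
object with a clear flag and attaches normally.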


> Regarding the assert in the USB stack, maybe the state was not correctly
> set on the device_t ?
>

It's unclear to me. Newbus doesn't guarantee certain states to the bus
drivers, so maybe the assert in the USB stack is too strict about which states
it assumes the device is in? I'm unsure. I haven't looked deeply enough to know
exactly what is going on. Since there were problems and I didn't have time to
do a proper deep dive, I just reverted for now and will revisit when I have
the time.

Warner

--00000000000064aa5a05ef2b0876--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfo_UK31uRoZ4CPBFqSujsc6SruWwiixVTdC3PzQ1yD2sA>