Date:      Wed, 25 May 2022 09:49:56 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Matteo Riondato <matteo@freebsd.org>
Cc:        matti k <mattik@gwsit.com.au>, Alexander Motin <mav@freebsd.org>,  FreeBSD Current <freebsd-current@freebsd.org>, Jim Harris <jimharris@freebsd.org>
Subject:   Re: nvme INVALID_FIELD in dmesg.boot
Message-ID:  <CANCZdfqf%2Bqy6%2B9wWu65g4JRtTc6Gx4wJFhAq%2BN91o--Zqoziow@mail.gmail.com>
In-Reply-To: <20220525153920.sxzi7fhsfzv6yidv@ubertino.local>
References:  <20220525122529.t2kwfg2q65dfiyyt@host-ubertino-mac-88e9fe7361f5.eduroam.ssid.10net.amherst.edu> <d8462935-2874-2e5c-a7aa-d5352bd0a3c2@FreeBSD.org> <20220526001715.4ffee96a@ws1.wobblyboot.net> <CANCZdfrYP-Wz7a-%2B_WEKbT=Yb=mrk0YYifDkzekV6H2q865sDg@mail.gmail.com> <20220525153920.sxzi7fhsfzv6yidv@ubertino.local>


On Wed, May 25, 2022 at 9:39 AM Matteo Riondato <matteo@freebsd.org> wrote:

> On 2022-05-25 at 11:29 EDT, Warner Losh <imp@bsdimp.com> wrote:
> >
> >SET FEATURES (opcode 9) feature 0xb is indeed async event
> >configuration.
> >0x31f is:
> >SMART WARNING for available spares (0x1)
> >SMART WARNING for temperature (0x2)
> >SMART WARNING for device reliability (0x4)
> >SMART WARNING for being read only (0x8)
> >SMART WARNING for volatile memory backup (0x10)
> >Namespace attribute change events (0x100)
> >Firmware activation events (0x200)
> >
> >I wonder which one of those it doesn't like. My reading of the standard
> >suggests that those should always be supported for a 1.2 and later
> >drive... Though maybe with the possible exception of the volatile
> >memory backup, so let me do some digging here...
> >
> >We can get the last two items from the OAES field of the controller
> >identification data. This is bytes 95:92, which if I'm counting right
> >is the last word on the 040: line in the nvmecontrol identify -x nvmeX
> >command:
> >
> >040: 4e474e4b 30303150 000cca07 00230000 00010200 005b8d80 0030d400 00000100
> >--------------------------------------------------------------------^^^^^^^^
>
> On my system:
>
> 040: 31564456 30373130 5cd2e400 00000500 00010200 001e8480 002dc6c0 00000200
>

Yeah, your drive reports 0x200 (firmware activation events only) and we send
0x300 (both of the last two items), so maybe that's the cause of the message...
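
To make the comparison concrete, here is a minimal sketch in Python of the check discussed above. It is not the driver change itself. The function and constant names are made up for illustration, the bit values are the ones listed earlier in this thread, and it assumes the last word of the 040: row really is OAES (bytes 95:92).

```python
# Async Event Configuration bits, as listed earlier in this thread.
SMART_BITS = 0x1F             # the five SMART warning bits (0x1 .. 0x10)
NS_ATTR_NOTICE = 0x100        # namespace attribute change events
FW_ACTIVATION_NOTICE = 0x200  # firmware activation events


def oaes_from_identify_line(line):
    """Return the last 32-bit word of the 040: row of an identify hex dump.

    Assumes a row of the form '040: w0 w1 ... w7', where the eighth word
    holds identify bytes 95:92 (hypothetical helper, for illustration).
    """
    offset, _, words = line.partition(":")
    if offset.strip() != "040":
        raise ValueError("expected the 040: row")
    fields = words.split()
    if len(fields) != 8:
        raise ValueError("expected eight 32-bit words")
    return int(fields[-1], 16)


def async_event_config(oaes):
    """SMART bits plus only the notice bits the controller advertises."""
    return SMART_BITS | (oaes & (NS_ATTR_NOTICE | FW_ACTIVATION_NOTICE))


row = ("040: 31564456 30373130 5cd2e400 00000500 "
       "00010200 001e8480 002dc6c0 00000200")
oaes = oaes_from_identify_line(row)
print(hex(oaes))                      # 0x200
print(hex(async_event_config(oaes)))  # 0x21f, rather than the 0x31f we send
```

With the row from the first dump in this thread (last word 00000100) the same function gives 0x11f.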


> (same for all nvmeX, as far as I can tell)
>
> >It looks like we don't currently test these bits before we add the last
> >two (we do it unconditionally for >= 1.2, and maybe we should check
> >these bits for >= 1.2).
> >
> >Would you be able to test a fix for this?
>
> Yes, I would be happy to, but I cannot do it for a couple of weeks
> (running simulations for a deadline).
>

There's no real rush... Your system will be fine without these events given
what I think you are doing with it. You might want to check the SMART log page
to see if any of the drives have indicators of trouble... but most trouble
you'd care about would likely torpedo your simulation very shortly after it
happens, so even that likely isn't strictly required.
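
If you do look, the low five bits listed earlier correspond to the critical warning byte in the SMART / Health Information log page. A small Python sketch of decoding that byte follows. The names are hypothetical, and reading the byte off the drive is left to whatever tool you use.

```python
# Critical warning bits of the SMART / Health Information log page.
# These line up with the 0x1 .. 0x10 warnings listed earlier in the thread.
CRITICAL_WARNING_BITS = {
    0x01: "available spare below threshold",
    0x02: "temperature outside threshold",
    0x04: "device reliability degraded",
    0x08: "media placed in read-only mode",
    0x10: "volatile memory backup failed",
}


def decode_critical_warning(value):
    """Return the names of the warning bits set in value (hypothetical helper)."""
    return [name for bit, name in CRITICAL_WARNING_BITS.items() if value & bit]


print(decode_critical_warning(0x00))  # [] means no warnings are set
print(decode_critical_warning(0x0A))
```

An empty list means the drive is not flagging any of these conditions.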

Warner



