Date:      Mon, 20 Jun 2022 17:19:05 -0700
From:      Ultima <ultima1252@gmail.com>
To:        Larry Rosenman <ler@lerctr.org>
Cc:        Freebsd current <freebsd-current@freebsd.org>
Subject:   Re: MCE: Does this look possibly like a slot issue?
Message-ID:  <CANJ8om774CyUB4VBdAztEhipPFDW1PAMZQsXbk8+oro-3Tg8gA@mail.gmail.com>
In-Reply-To: <c9d183a8a8083056a08946321694b70d@lerctr.org>
References:  <c9d183a8a8083056a08946321694b70d@lerctr.org>

Hey Larry,

 It is possible that it's the motherboard itself, but that is rare. The
way I would determine this is to swap the DIMM with one from another
populated slot on the motherboard and see whether the error follows
the DIMM to its new slot or stays with the original slot. Also, this
error doesn't necessarily mean there is a problem that needs to be
addressed. If you have been running the system for many months and
have seen ECC errors only a handful of times, they can probably be
safely ignored.
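
To make the swap test less of a judgment call, the mcelog records can be tallied per slot before and after the swap. The following is only a minimal sketch: the function names `errors_by_slot` and `swap_verdict` are hypothetical, and it assumes the records look like the quoted mcelog text, with a banner line opening each record and a "Device Locator:" line naming the slot.

```python
import re
from collections import Counter

# Banner line that opens each record in the quoted mcelog output.
BANNER = "Hardware event. This is not a software error."


def errors_by_slot(log_text):
    """Count mcelog records per 'Device Locator' string (hypothetical helper)."""
    counts = Counter()
    # Everything before the first banner is not a record, so skip it.
    for record in log_text.split(BANNER)[1:]:
        match = re.search(r"Device Locator:\s*(\S+)", record)
        if match:
            counts[match.group(1)] += 1
    return counts


def swap_verdict(after_swap, suspect_slot, other_slot):
    """Interpret counts gathered AFTER swapping the DIMMs in the two slots."""
    in_suspect = after_swap.get(suspect_slot, 0)
    in_other = after_swap.get(other_slot, 0)
    if in_other > 0 and in_suspect == 0:
        return "errors followed the DIMM: suspect the module"
    if in_suspect > 0 and in_other == 0:
        return "errors stayed with the slot: suspect the slot, channel, or board"
    return "inconclusive: keep collecting"
```

Feed `errors_by_slot` the mcelog text collected after the swap, then pass the result to `swap_verdict` with the original slot name (P2-DIMM2C here) and the name of the slot the DIMM was moved to.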

Best regards,
Richard Gallamore

On Mon, Jun 20, 2022 at 3:14 PM Larry Rosenman <ler@lerctr.org> wrote:

> I've gotten a BUNCH of these on my TrueNAS server.  I've replaced this
> DIMM a couple of times, and still the MCEs continue.
> Is it possible it's a motherboard slot issue?
>
> Hardware event. This is not a software error.
> MCE 8
> CPU 22 BANK 8 TSC 5aa4ecdd795a
> MISC ac29890200046646 ADDR ee2f6e800
> TIME 1655762472 Mon Jun 20 17:01:12 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 46
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
>
>
>
> --
> Larry Rosenman                     http://www.lerctr.org/~ler
> Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
> US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
>
>
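
The STATUS word in the quoted record can be decoded by hand to confirm that this was a corrected error rather than an uncorrected one. The sketch below assumes the architectural IA32_MCi_STATUS layout as I understand it from the Intel SDM (VAL bit 63, OVER 62, UC 61, EN 60, MISCV 59, ADDRV 58, PCC 57, corrected error count in bits 52:38, model-specific code in 31:16, MCA error code in 15:0). The function name `decode_mci_status` is hypothetical, and the layout should be checked against the SDM for the CPU in question.

```python
def decode_mci_status(status):
    """Split a 64-bit IA32_MCi_STATUS value into its main fields.

    Bit positions are an assumption based on the architectural layout
    in the Intel SDM; verify them for the specific processor.
    """
    def bit(n):
        return bool((status >> n) & 1)

    return {
        "valid": bit(63),              # VAL: register holds a valid error
        "overflow": bit(62),           # OVER: an earlier error was overwritten
        "uncorrected": bit(61),        # UC: error was NOT corrected by hardware
        "enabled": bit(60),            # EN: reporting was enabled for this error
        "misc_valid": bit(59),         # MISCV: the MISC register is valid
        "addr_valid": bit(58),         # ADDRV: the ADDR register is valid
        "context_corrupt": bit(57),    # PCC: processor context corrupted
        "corrected_count": (status >> 38) & 0x7FFF,     # bits 52:38
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "mca_error_code": status & 0xFFFF,               # bits 15:0
    }


# STATUS value copied from the quoted mcelog record.
fields = decode_mci_status(0x8C0000400001009F)
for name, value in fields.items():
    print(f"{name}: {value}")
```

For the quoted value the UC bit is clear and the corrected error count field is 1, which agrees with the "Memory corrected error count" line in the mcelog output.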
