Date: Mon, 20 Jun 2022 19:59:52 -0500
From: Larry Rosenman <ler@lerctr.org>
To: Ultima <ultima1252@gmail.com>
Cc: Freebsd current <freebsd-current@freebsd.org>
Subject: Re: MCE: Does this look possibly like a slot issue?
Message-ID: <983660c80cc6717e3b49821f7957ee80@lerctr.org>
In-Reply-To: <CANJ8om7_=O95K3MOPZU8KgYWQAUkJTuTeyQRrgRr_S-=BHkN3A@mail.gmail.com>
References: <c9d183a8a8083056a08946321694b70d@lerctr.org> <CANJ8om774CyUB4VBdAztEhipPFDW1PAMZQsXbk8%2Boro-3Tg8gA@mail.gmail.com> <c29f59fbb209874549f5f68efd14a3c2@lerctr.org> <CANJ8om7t339RDA0JtoeNT1SfBdgkGM4-bn6JZXG0ZqOTgt82SA@mail.gmail.com> <49c1dfe9bbe9912282b0f0339a0c077b@lerctr.org> <CANJ8om7_=O95K3MOPZU8KgYWQAUkJTuTeyQRrgRr_S-=BHkN3A@mail.gmail.com>
[-- Attachment #1 --]
SuperMicro X8DTN+
2 Processors, 6-core/12-Thread. CPU: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz (2400.20-MHz K8-class CPU)

I'll bring it down and swap DIMMS around

On 06/20/2022 7:57 pm, Ultima wrote:
> Hey Larry,
>
> One red flag I am seeing is that the error is being produced on
> the same CPU/bank with each error you have provided so far.
>
> Can you try and follow my original recommendation and swap
> currently installed DIMM with the problem DIMM slot and see
> if anything changes?
>
> Can you also provide the motherboard model? Also, do you
> have multiple CPUs installed in this system?
>
> Best regards,
> Richard Gallamore
>
> On Mon, Jun 20, 2022 at 5:41 PM Larry Rosenman <ler@lerctr.org> wrote:
>
> Yes and Yes.
>
> On 06/20/2022 7:37 pm, Ultima wrote:
>
> Are you sure that the module you replaced it with was good?
> Are you sure you replaced the correct module?
>
> Best regards,
> Richard Gallamore
>
> On Mon, Jun 20, 2022 at 5:23 PM Larry Rosenman <ler@lerctr.org> wrote:
>
> I'm seeing them constantly:
>
> root@freenas[~]# mcelog --dmi
> Hardware event. This is not a software error.
> MCE 0
> CPU 22 BANK 8 TSC 20aab486464a
> MISC ac29890200046444 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 44
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> WARNING: SMBIOS data is often unreliable. Take with a grain of salt!
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 1
> CPU 22 BANK 8 TSC 296dfcc82582
> MISC ac29890200041381 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 81
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 2
> CPU 22 BANK 8 TSC 2a5604a6a070
> MISC ac29890200044281
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory ECC error occurred during scrub
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 81
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 88000040000200cf MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> Hardware event. This is not a software error.
> MCE 3
> CPU 22 BANK 8 TSC 31e141418eb8
> MISC ac29890200046a4a ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 4a
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 4
> CPU 22 BANK 8 TSC 3a014afee106
> MISC ac29890200046646 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 46
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 5
> CPU 22 BANK 8 TSC 41d1dbef1a6a
> MISC ac29890200046141 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 41
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 6
> CPU 22 BANK 8 TSC 4a1b1ecef446
> MISC ac29890200046a4a ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 4a
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 7
> CPU 22 BANK 8 TSC 527bc27db776
> MISC ac29890200040386 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 86
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 8
> CPU 22 BANK 8 TSC 5aa4ecdd795a
> MISC ac29890200046646 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 46
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> root@freenas[~]#
>
> and I replaced the DIMM yesterday :(
>
> On 06/20/2022 7:19 pm, Ultima wrote:
>
> Hey Larry,
>
> It is possible it's the motherboard itself, but it's rare. The way I
> would determine this is to swap the DIMM module with another
> populated slot on the motherboard and see if the error migrated
> to the new slot or not. Also, this error doesn't necessarily mean
> there is a problem that needs to be addressed. If you have been
> running the system for many months and you see ECC errors a
> handful of times, it can probably be safely ignored.
>
> Best regards,
> Richard Gallamore
>
> On Mon, Jun 20, 2022 at 3:14 PM Larry Rosenman <ler@lerctr.org> wrote:
> I've gotten a BUNCH of these on my TrueNAS server. I've replaced this
> DIMM a couple of times, and still the MCE's continue.
> Is it possible it's Motherboard slot issue?
>
> Hardware event. This is not a software error.
> MCE 8
> CPU 22 BANK 8 TSC 5aa4ecdd795a
> MISC ac29890200046646 ADDR ee2f6e800
> TIME 1655762472 Mon Jun 20 17:01:12 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 46
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
>
> --
> Larry Rosenman http://www.lerctr.org/~ler
> Phone: +1 214-642-9640 E-Mail: ler@lerctr.org
> US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
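
For readers following the thread, the raw STATUS values in the mcelog output can be unpacked by hand to see why mcelog labels these events as corrected memory read and scrub errors. The sketch below is not part of the original messages. It assumes the architectural IA32_MCi_STATUS bit layout and the "0000 0000 1MMM CCCC" memory-controller error code encoding described in Intel's Software Developer's Manual; the function name decode_mci_status, the MEMORY_OPS table, and the dictionary keys are made up for illustration, and whether the corrected-count field is meaningful depends on the processor.

```python
# Hypothetical helper, not from the thread: unpack an IA32_MCi_STATUS value.
# Bit positions assume the architectural layout in the Intel SDM.

# MMM field of a memory-controller error code (0000 0000 1MMM CCCC)
MEMORY_OPS = {
    0: "generic undefined request",
    1: "memory read",
    2: "memory write",
    3: "address/command",
    4: "memory scrub",
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit machine-check STATUS value into named fields."""
    mca_code = status & 0xFFFF
    fields = {
        "valid": bool((status >> 63) & 1),            # VAL
        "overflow": bool((status >> 62) & 1),         # OVER
        "uncorrected": bool((status >> 61) & 1),      # UC
        "enabled": bool((status >> 60) & 1),          # EN
        "misc_valid": bool((status >> 59) & 1),       # MISCV
        "addr_valid": bool((status >> 58) & 1),       # ADDRV
        "context_corrupt": bool((status >> 57) & 1),  # PCC
        "corrected_count": (status >> 38) & 0x7FFF,   # bits 52:38
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": mca_code,
    }
    # Memory-controller errors match the pattern 0000 0000 1MMM CCCC.
    if mca_code & 0xFF80 == 0x0080:
        fields["memory_op"] = MEMORY_OPS.get((mca_code >> 4) & 0x7, "reserved")
        channel = mca_code & 0xF
        # CCCC == 1111 means the channel is not specified in this field.
        fields["channel"] = None if channel == 0xF else channel
    return fields


if __name__ == "__main__":
    # The two distinct STATUS values that appear in the mcelog output above.
    for raw in (0x8C0000400001009F, 0x88000040000200CF):
        print(hex(raw), decode_mci_status(raw))
```

Under those assumptions, both values decode as valid, corrected (UC clear) memory-controller events. The scrub event has ADDRV clear, which is consistent with MCE 2 being the only record printed without an ADDR field.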
