Date:      Tue, 21 Jun 2022 11:23:24 -0700
From:      Chris <bsd-lists@bsdforge.com>
To:        Larry Rosenman <ler@lerctr.org>
Cc:        Ultima <ultima1252@gmail.com>, Freebsd current <freebsd-current@freebsd.org>
Subject:   Re: MCE: Does this look possibly like a slot issue?
Message-ID:  <696947ca2bfea895d0062a88c06673f7@bsdforge.com>
In-Reply-To: <c29f59fbb209874549f5f68efd14a3c2@lerctr.org>
References:  <c9d183a8a8083056a08946321694b70d@lerctr.org> <CANJ8om774CyUB4VBdAztEhipPFDW1PAMZQsXbk8%2Boro-3Tg8gA@mail.gmail.com> <c29f59fbb209874549f5f68efd14a3c2@lerctr.org>

On 2022-06-20 17:23, Larry Rosenman wrote:
> I'm seeing them constantly:
FWIW it looks like a sync(ing) problem between your
RAM && CPU cache. Are your clocks set correctly
for your CPU && RAM? Is your CPU too hot? Is the CPU
cache ECC?
> 
> root@freenas[~]# mcelog --dmi
> Hardware event. This is not a software error.
> MCE 0
> CPU 22 BANK 8 TSC 20aab486464a
> MISC ac29890200046444 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 44
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> WARNING: SMBIOS data is often unreliable. Take with a grain of salt!
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 1
> CPU 22 BANK 8 TSC 296dfcc82582
> MISC ac29890200041381 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 81
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 2
> CPU 22 BANK 8 TSC 2a5604a6a070
> MISC ac29890200044281
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory ECC error occurred during scrub
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 81
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 88000040000200cf MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> Hardware event. This is not a software error.
> MCE 3
> CPU 22 BANK 8 TSC 31e141418eb8
> MISC ac29890200046a4a ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 4a
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 4
> CPU 22 BANK 8 TSC 3a014afee106
> MISC ac29890200046646 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 46
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 5
> CPU 22 BANK 8 TSC 41d1dbef1a6a
> MISC ac29890200046141 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 41
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 6
> CPU 22 BANK 8 TSC 4a1b1ecef446
> MISC ac29890200046a4a ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 4a
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 7
> CPU 22 BANK 8 TSC 527bc27db776
> MISC ac29890200040386 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 86
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> Hardware event. This is not a software error.
> MCE 8
> CPU 22 BANK 8 TSC 5aa4ecdd795a
> MISC ac29890200046646 ADDR ee2f6e800
> TIME 1655770989 Mon Jun 20 19:23:09 2022
> MCG status:
> Memory read ECC error
> Memory corrected error count (CORE_ERR_CNT): 1
> Memory transaction Tracker ID (RTId): 46
> Memory DIMM ID of error: 0
> Memory channel ID of error: 1
> Memory ECC syndrome: ac298902
> STATUS 8c0000400001009f MCGSTATUS 0
> MCGCAP 1c09 APICID 34 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 44 Step 2
> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
> Device Locator: P2-DIMM2C
> Bank Locator: BANK14
> Manufacturer: Hyundai
> Serial Number: 40F3C20F
> Asset Tag:
> Part Number: HMT151R7BFR4C-H9
> root@freenas[~]#
> 
> and I replaced the DIMM yesterday :(
> 
> On 06/20/2022 7:19 pm, Ultima wrote:
> 
>> Hey Larry,
>> 
>> It is possible it's the motherboard itself, but it's rare. The way I
>> would determine this is to swap the DIMM module with another
>> populated slot on the motherboard and see if the error migrated
>> to the new slot or not. Also, this error doesn't necessarily mean
>> there is a problem that needs to be addressed. If you have been
>> running the system for many months and you see ECC errors a
>> handful of times, it can probably be safely ignored.
>> 
>> Best regards,
>> Richard Gallamore
>> 
>> On Mon, Jun 20, 2022 at 3:14 PM Larry Rosenman <ler@lerctr.org> wrote:
>> 
>>> I've gotten a BUNCH of these on my TrueNAS server.  I've replaced this
>>> DIMM a couple of times, and still the MCEs continue.
>>> Is it possible it's a motherboard slot issue?
>>> 
>>> Hardware event. This is not a software error.
>>> MCE 8
>>> CPU 22 BANK 8 TSC 5aa4ecdd795a
>>> MISC ac29890200046646 ADDR ee2f6e800
>>> TIME 1655762472 Mon Jun 20 17:01:12 2022
>>> MCG status:
>>> Memory read ECC error
>>> Memory corrected error count (CORE_ERR_CNT): 1
>>> Memory transaction Tracker ID (RTId): 46
>>> Memory DIMM ID of error: 0
>>> Memory channel ID of error: 1
>>> Memory ECC syndrome: ac298902
>>> STATUS 8c0000400001009f MCGSTATUS 0
>>> MCGCAP 1c09 APICID 34 SOCKETID 0
>>> CPUID Vendor Intel Family 6 Model 44 Step 2
>>> DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
>>> Device Locator: P2-DIMM2C
>>> Bank Locator: BANK14
>>> Manufacturer: Hyundai
>>> Serial Number: 40F3C20F
>>> Asset Tag:
>>> Part Number: HMT151R7BFR4C-H9
>>> 
>>> --
>>> Larry Rosenman                     http://www.lerctr.org/~ler
>>> Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
>>> US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
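
For anyone who wants to cross-check the STATUS words in the mcelog
output above by hand, here is a rough sketch in Python. It only covers
the architectural IA32_MCi_STATUS fields as I understand them from the
Intel SDM machine-check chapter. The function and table names are my
own and are not part of mcelog, and the corrected-count field is only
meaningful when the CPU advertises it in MCG_CAP, so treat the result
as a sanity check rather than an authoritative decode.

```python
# Sketch: decode the architectural fields of an IA32_MCi_STATUS value.
# Bit positions are taken from the Intel SDM machine-check chapter.
# decode_mci_status and MEM_OPS are illustrative names, not mcelog APIs.

MEM_OPS = {
    0b000: "generic undefined request",
    0b001: "memory read error",
    0b010: "memory write error",
    0b011: "address/command error",
    0b100: "memory scrubbing error",
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit MCi_STATUS word into its architectural fields."""
    mca_code = status & 0xFFFF
    fields = {
        "valid": bool((status >> 63) & 1),            # VAL
        "overflow": bool((status >> 62) & 1),         # OVER
        "uncorrected": bool((status >> 61) & 1),      # UC
        "enabled": bool((status >> 60) & 1),          # EN
        "misc_valid": bool((status >> 59) & 1),       # MISCV
        "addr_valid": bool((status >> 58) & 1),       # ADDRV
        "context_corrupt": bool((status >> 57) & 1),  # PCC
        "corrected_count": (status >> 38) & 0x7FFF,   # bits 52:38
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": mca_code,
    }
    # Memory-controller errors use the pattern 0000 0000 1MMM CCCC.
    if (mca_code & 0xFF80) == 0x0080:
        fields["memory_op"] = MEM_OPS.get((mca_code >> 4) & 0x7, "reserved")
        channel = mca_code & 0xF
        # CCCC == 1111 means "channel not specified" in the MCA code.
        fields["channel"] = None if channel == 0xF else channel
    return fields


if __name__ == "__main__":
    # The two distinct STATUS values from the quoted mcelog output.
    for raw in (0x8C0000400001009F, 0x88000040000200CF):
        print(hex(raw), decode_mci_status(raw))
```

Both values decode as valid, corrected (UC clear) memory-controller
errors with a corrected count of 1. The first is a read error with
ADDRV set, the second a scrub error without ADDRV, which lines up with
MCE 2 being the only record with no ADDR. The channel field in the MCA
code itself reads "not specified", so the channel mcelog reports is
presumably taken from MISC.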
