From nobody Tue Jan 18 08:25:46 2022
X-Original-To: stable@mlmmj.nyi.freebsd.org
Date: Tue, 18 Jan 2022 11:25:46 +0300
From: Dima Panov
To: Willem Jan Withagen, Tomoaki AOKI
Cc: Eugene Grosbein, stable@freebsd.org
In-Reply-To: <8096cd7e-bc11-5fa7-cc96-6bcdf1278ffc@digiware.nl>
References: <8ac447b6-eaaf-0a8f-da69-27db15dd6f55@digiware.nl>
 <2ec39eef-d2e2-c55e-b032-43de86e71a57@digiware.nl>
 <3d87a0b3-7bed-453b-df23-4a258ea46fbb@grosbein.net>
 <802cf542-979d-b8e1-3f71-616b026eb852@grosbein.net>
 <48f57581-1f39-9f57-0e44-19c2c2bb3aeb@digiware.nl>
 <78a47e83-a339-0c79-0ee0-9e55be80c78b@grosbein.net>
 <2f49fd20-cb5a-5ccc-7f9b-0229bc8e14b1@grosbein.net>
 <86766549-be58-1125-867e-ae4c415e1bb4@digiware.nl>
 <7903a41f-94ba-2caf-9270-a1bd9582c600@grosbein.net>
 <229c3042-3297-7903-9778-9b55d5c3f998@digiware.nl>
 <71d1e25c-f1f6-2371-486e-2382d67a3fc5@grosbein.net>
 <9d73e9ba-af23-ea90-e5fa-cf3a04a8513b@grosbein.net>
 <7c5d9cc0-be85-c855-a294-71a93f2c5440@digiware.nl>
 <20220118021432.197aa1241d53b1cba6e8c562@dec.sakura.ne.jp>
 <8096cd7e-bc11-5fa7-cc96-6bcdf1278ffc@digiware.nl>
Subject: Re: Trying to boot a supermicro H8DMT board
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable
Sender: owner-freebsd-stable@freebsd.org
X-BeenThere: freebsd-stable@freebsd.org
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="61e6798a_327b23c6_93a"; protocol="application/pgp-signature"

--61e6798a_327b23c6_93a
Content-Type: multipart/alternative; boundary="61e6798a_6b8b4567_93a"

--61e6798a_6b8b4567_93a
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline

Moin!

As mobo manual says, u could use cpu1/1b, cpu1/1a, cpu2/1b, cpu2/1a due to hard dependency on pair interleaving support.
Slots fills from far to close order from each cpu by pairs only

--
Dima. (desktop, kde, x11, office, ports-secteam)@FreeBSD team
(fluffy@FreeBSD.org, https://t.me/dima_panov)

> On Monday, Jan 17, 2022 at 9:47 PM, Willem Jan Withagen wrote:
> On 17-1-2022 18:14, Tomoaki AOKI wrote:
> > On Mon, 17 Jan 2022 15:04:16 +0100
> > Willem Jan Withagen wrote:
> >
> > > On 17-1-2022 14:46, Eugene Grosbein wrote:
> > > > 17.01.2022 20:24, Willem Jan Withagen wrote:
> > > >
> > > > > > Well, perform independent hardware (memory) testing with something like memtest86+
> > > > > > and if it is all right, you show ask someone more knowledgeable. Maybe CC: arch@freebsd.org
> > > > > Perhaps should have done that when I started, but supplier assured me that
> > > > > the they just retired the boards with out any issues.
> > > > > Memtest86 found the faulty DIMM in 30 secs...
> > > > >
> > > > > Not sure if we could/want educate vm_mem_init() to actually detect this.
> > > > > It is still in the part where everthing is still running on the first CPU.
> > > > > Making things a bit easier to understand what is going on.
> > > > >
> > > > > Lets see if the box will run on 3 DIMMs for the rime being.
> > > > > Then figure out with DMIdecode what we need expand again.
> > > > Is it ECC memory or non-ECC?
> > > > The kernel already have full memory testing performed at boot time
> > > > unless disabled with another loader knob:
> > > >
> > > > hw.memtest.tests=0
> > > >
> > > > Try booting it with memory testing disabled and without hw.physmem limitation.
> > > > Maybe it will boot.
> > > >
> > > > With ECC, it could be hardware interrupt while kernel runs that test
> > > > and wrong in-kernel processing of the interrupt.
> > > Swapped the DIMM with 3 others, but still the same errors.
> > > Then I changed DIMM slot, and the errors went away.
> > > So definitely a hardware issue
> > >
> > > when booted FreeBSD reported already only 12Gb in system ( there are 4
> > > 4GB dimms)
> > > Using 8Gb. DIMMs are ECC.
> > > But then still it would only boot when mem set to 8G.
> > >
> > > Waiting for memtest to finish at least one pass.
> > > Usually that will take quite some time.
> > >
> > > --WjW
> > >
> > >
> > Not sure this is the case, but some motherboards have severe limitation
> > about DIMM slot usage, if not fully used.
> >
> > For example, assuming slot No. are B0-0, 1, 2, 3 and B1-0, 1, 2, 3,
> >
> > *Must use "interleaved. If 4 in 8 slots are to be used,
> > B0-0, B0-2, B1-0, B1-2 shall be used.
> > (Some forced B0-1, B0-3, B1-1, B1-3, IIRC)
> >
> > *Must NOT use "interleaved.
> > B0-0, B0-1, B1-0, B1-1 shall be used.
> >
> > *Must NOT use B1 unless B0 is full of DIMs.
> > B0-0. B0-1, B0-2, B0-3 shall be used.
> >
> > and so on, depending on motherboard vendor (at worst, per model.)
>
> Yup, I know... I used the board in the configuration I got it.
> And its a DUAL processor board with 2 opterons.
> The config works correct for the first Opteron (Called CPU1)
> using slots: CPU1/DIMM1A and CPU1/DIMM1B
> But on the second CPU I have to use the third slot....
> so using slots: CPU2/DIMM1B and CPU2/DIMM2B
>
> And my memtest86 has complete 1 full pass over 16G without errors.
> So I'm guessing that the order is not majorly picky.
>
> But you are correct in noting this, so I will read up ont this in the
> manual.
>
> Thanx,
> --WjW
>
>

--61e6798a_6b8b4567_93a
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

--61e6798a_6b8b4567_93a--

--61e6798a_327b23c6_93a
Content-Type: application/pgp-signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: Canary PGP V3

iQJVBAABCgA/OBxEaW1hIFBhbm92IChGcmVlQlNELk9SRyBDb21taXR0ZXIpIDxm
bHVmZnlARnJlZUJTRC5PUkc+BQJh5nmKAAoJEPuLoJ3VOY8p4p0QANZ8ZB2cnK2J
YC5yOE2A/T6Mw3L+efPTzg83KYTDxeW4+MpDUzJv4ubmQDUF2R8qs4GPZXQmdtPW
dvQ9Gjr909LQV2DiJo6jn9LXx+9yLuwDFZwp3P10p/hATakwQiwZxp2qrqQvVgBb
7oXgWe7Y379dnNc1LGg/GS4rbNQ3WCKLM6BMyAyKYXqWFvD5PDG5w4vIR9g7bszh
L0q2oR7y3F2w4Icn7We7OmG0saZl7YQZ8GFdIiiKLUhc7R+kxJvRZErPUwECHei7
IDU5TXijPQg9Bc5JbJXcrFSxBL9XwGHKy/m9xw/OGFe0Gj4Qo9zsT0W46VT1smJS
rF04EI5gowgB9XYI9YAaeGFEq7I8ywRBHQMABXKfJIq3ltyTY7m5JevDnSHKdPUG
ccEdAQnHcM59z/J1fvwoebq5tKvMDre613fTIv4x3aFNi8O+sfpjIyME1EfcQ8p3
DEVdm7L94YBActUgQ6do3gs5MqV2rIhcnWrfXtsGhZWIx/53SIL4q7kRqsfJHdso
9qEu//nRT/sA6Cd1dj7xCJKtw53aB5UpZ86xhpgYUd2SkB/scV12A90UVY3DekBp
Kj1WSllgXkVs3FVu3S5CQ4Qqb1XQS9sSvWLEdqLrI0PhYV8Iby25Ze2wTDcp8NAt
yCjqHabd8AiYDyHs82KuAFdae91aIgvk
=73Jv
-----END PGP SIGNATURE-----

--61e6798a_327b23c6_93a--