Date:      Tue, 18 Jan 2022 02:14:32 +0900
From:      Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
To:        Willem Jan Withagen <wjw@digiware.nl>
Cc:        Eugene Grosbein <eugen@grosbein.net>, stable@freebsd.org
Subject:   Re: Trying to boot a supermicro H8DMT board
Message-ID:  <20220118021432.197aa1241d53b1cba6e8c562@dec.sakura.ne.jp>
In-Reply-To: <7c5d9cc0-be85-c855-a294-71a93f2c5440@digiware.nl>
References:  <8ac447b6-eaaf-0a8f-da69-27db15dd6f55@digiware.nl> <YeFd9mdywqEuIJ0H@in-addr.com> <2ec39eef-d2e2-c55e-b032-43de86e71a57@digiware.nl> <3d87a0b3-7bed-453b-df23-4a258ea46fbb@grosbein.net> <d8e6c746-3ec1-9c21-d5e7-44dc9600bb0b@digiware.nl> <802cf542-979d-b8e1-3f71-616b026eb852@grosbein.net> <48f57581-1f39-9f57-0e44-19c2c2bb3aeb@digiware.nl> <a0315a54-aefa-a3a3-2ac3-94d6e9410961@grosbein.net> <eac3dcef-9183-8fa5-b0de-b70650235960@digiware.nl> <78a47e83-a339-0c79-0ee0-9e55be80c78b@grosbein.net> <d0dbce19-f5e5-3ff0-99d6-55a9c94a4b48@digiware.nl> <2f49fd20-cb5a-5ccc-7f9b-0229bc8e14b1@grosbein.net> <86766549-be58-1125-867e-ae4c415e1bb4@digiware.nl> <7903a41f-94ba-2caf-9270-a1bd9582c600@grosbein.net> <229c3042-3297-7903-9778-9b55d5c3f998@digiware.nl> <71d1e25c-f1f6-2371-486e-2382d67a3fc5@grosbein.net> <c6588210-ac68-f081-f0a4-85669aa84eb3@digiware.nl> <9d73e9ba-af23-ea90-e5fa-cf3a04a8513b@grosbein.net> <7c5d9cc0-be85-c855-a294-71a93f2c5440@digiware.nl>

On Mon, 17 Jan 2022 15:04:16 +0100
Willem Jan Withagen <wjw@digiware.nl> wrote:

> On 17-1-2022 14:46, Eugene Grosbein wrote:
> > 17.01.2022 20:24, Willem Jan Withagen wrote:
> >
> >>> Well, perform independent hardware (memory) testing with something like memtest86+
> >>> and if it is all right, you should ask someone more knowledgeable. Maybe CC: arch@freebsd.org
> >> Perhaps I should have done that when I started, but the supplier assured me that
> >> they had just retired the boards without any issues.
> >> Memtest86 found the faulty DIMM in 30 secs...
> >>
> >> Not sure if we could/want to educate vm_mem_init() to actually detect this.
> >> It is still in the part where everything is still running on the first CPU,
> >> which makes it a bit easier to understand what is going on.
> >>
> >> Let's see if the box will run on 3 DIMMs for the time being.
> >> Then figure out with dmidecode what we need to expand again.
> > Is it ECC memory or non-ECC?
> > The kernel already has full memory testing performed at boot time
> > unless disabled with another loader knob:
> >
> > hw.memtest.tests=0
> >
> > Try booting it with memory testing disabled and without hw.physmem limitation.
> > Maybe it will boot.
> >
> > With ECC, it could be a hardware interrupt raised while the kernel runs that test,
> > combined with wrong in-kernel processing of that interrupt.
> 
> Swapped the DIMM with 3 others, but still the same errors.
> Then I changed DIMM slot, and the errors went away.
> So definitely a hardware issue.
> 
> When booted, FreeBSD already reported only 12GB in the system (there are 4
> 4GB DIMMs).
> Using 8GB. DIMMs are ECC.
> But even then it would only boot with memory limited to 8G.
> 
> Waiting for memtest to finish at least one pass.
> Usually that will take quite some time.
> 
> --WjW
> 
> 

Not sure this is the case, but some motherboards have strict limitations
on DIMM slot usage when the slots are not fully populated.

For example, assuming the slots are numbered B0-0, 1, 2, 3 and B1-0, 1, 2, 3:

 *Must use "interleaved". If 4 of 8 slots are to be used,
  B0-0, B0-2, B1-0, B1-2 shall be used.
  (Some boards force B0-1, B0-3, B1-1, B1-3 instead, IIRC)

 *Must NOT use "interleaved".
  B0-0, B0-1, B1-0, B1-1 shall be used.

 *Must NOT use B1 unless B0 is full of DIMMs.
  B0-0, B0-1, B0-2, B0-3 shall be used.

and so on, depending on motherboard vendor (at worst, per model.)
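
To make the three example rules above concrete, here is a minimal Python
sketch. The slot names are the made-up ones from my example, not those of
the H8DMT. The policy names, the fill orders beyond the 4-DIMM case, and
both function names are assumptions for illustration only; the real rule
has to come from the motherboard manual.

```python
# Hypothetical fill orders for an 8-slot board with banks B0 and B1.
# Only the first four entries of each list come from the examples above;
# the rest of each order is an assumption.
FILL_ORDER = {
    # Even-numbered slots first, alternating between the banks.
    "interleaved": ["B0-0", "B1-0", "B0-2", "B1-2",
                    "B0-1", "B1-1", "B0-3", "B1-3"],
    # Lowest-numbered slots first, alternating between the banks.
    "adjacent": ["B0-0", "B1-0", "B0-1", "B1-1",
                 "B0-2", "B1-2", "B0-3", "B1-3"],
    # Bank B0 must be full before B1 is touched.
    "bank_first": ["B0-0", "B0-1", "B0-2", "B0-3",
                   "B1-0", "B1-1", "B1-2", "B1-3"],
}


def expected_slots(policy, dimm_count):
    """Return the set of slots the given policy expects for dimm_count DIMMs."""
    order = FILL_ORDER[policy]
    if not 0 <= dimm_count <= len(order):
        raise ValueError("dimm_count must be between 0 and %d" % len(order))
    return set(order[:dimm_count])


def misplaced(policy, populated):
    """Return the populated slots that the policy would not expect, sorted."""
    populated = set(populated)
    return sorted(populated - expected_slots(policy, len(populated)))


if __name__ == "__main__":
    # Four DIMMs in the lowest-numbered slots of each bank, checked
    # against a board that demands the "interleaved" layout.
    print(misplaced("interleaved", ["B0-0", "B0-1", "B1-0", "B1-1"]))
```

With these assumed orders, the example at the bottom reports B0-1 and B1-1
as sitting in the wrong slots, which is the kind of mismatch that could
make part of the memory disappear or fail.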


-- 
Tomoaki AOKI    <junchoon@dec.sakura.ne.jp>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20220118021432.197aa1241d53b1cba6e8c562>