Date:      Thu, 6 Jun 2019 10:50:16 -0400
From:      Mark Johnston <markj@freebsd.org>
To:        Mark Saad <nonesuch@longcount.org>
Cc:        FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: Kernel panic on 12-STABLE-r348203 amd64 E5-2690v4 with Cluster on die mode enabled
Message-ID:  <20190606145016.GA4116@raichu>
In-Reply-To: <CAMXt9NYg5rK%2BjdAJKVwCaWGaE4GZ5W6Np3=0_RZQoz=%2B00uQxw@mail.gmail.com>
References:  <CAMXt9NYg5rK%2BjdAJKVwCaWGaE4GZ5W6Np3=0_RZQoz=%2B00uQxw@mail.gmail.com>

On Thu, Jun 06, 2019 at 10:43:26AM -0400, Mark Saad wrote:
> All,
>  I posted this yesterday, but I am not sure what happened. Here is the
> short version.
> I received two new Dell R630s, each with an E5-2690 v4. The E5-2690 v4
> has 14 cores in two packages on the one chip. I don't remember the
> exact topology; however, as a result the BIOS supports the NUMA / memory
> mode known as Cluster on Die, where each package on the one chip shows up
> as its own NUMA domain. The issue is this: when it is enabled, the box
> boots 12-RELEASE a-ok. When I rebuilt 12.0-STABLE-r348203 it would panic
> early in the boot process.
> Here is a dump of the console
> ===============================
> Loading kernel...
> /boot/kernel/kernel text=0x168d811 data=0x1cf968+0x768c80
> syms=[0x8+0x1778e8+0x8   /
> +0x194f1d]
> Loading configured modules...
> /boot/kernel/ipmi.ko size 0x11e10 at 0x2645000
> loading required module 'smbus'
> /boot/kernel/smbus.ko size 0x2ef0 at 0x2657000
> /boot/entropy size=0x1000
> /boot/kernel/cc_httcp.ko size 0x2330 at 0x265b000
> ---<<BOOT>>---c_hmodule 'smbus'
> Copyright (c) 1992-2019 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 12.0-STABLE r348693 GENERIC amd64
> FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on
> LLVM 8.0.0)
> panic: UMA zone "UMA Zones": Increase vm.boot_pages
> cpuid = 0
> time = 1
> KDB: stack backtrace:
> #0 0xffffffff80c16df7 at ??+0
> #1 0xffffffff80bcaccd at ??+0
> #2 0xffffffff80bcab23 at ??+0
> #3 0xffffffff80f0b03c at ??+0
> #4 0xffffffff80f08d8d at ??+0
> #5 0xffffffff80f0bb3d at ??+0
> #6 0xffffffff80f0b301 at ??+0
> #7 0xffffffff80f0b3d1 at ??+0
> #8 0xffffffff80f066c4 at ??+0
> #9 0xffffffff80f0543f at ??+0
> #10 0xffffffff80f23aef at ??+0
> #11 0xffffffff80f1133b at ??+0
> #12 0xffffffff80b619c8 at ??+0
> #13 0xffffffff8036a02c at ??+0
> Uptime: 1s
> 
> ===============================
> 
> The only solution was to mess with vm.boot_pages. I got it booted
> with 128 as the value.
> Also, to be clear, if I switch back to Home Snoop or Early Snoop the box
> is fine. It's only unhappy with Cluster on Die and 12.0-STABLE.
> 
> Anyone know what's going on?

Could you build a kernel with "options DIAGNOSTIC" configured and boot
in verbose mode?  The kernel should print its boot page allocations to
the console.  Then, compare the output with a boot with cluster on die
disabled.
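
For reference, a minimal sketch of that setup, assuming a source tree in
/usr/src and a custom kernel config named DIAG (a hypothetical name, not
from this thread). The config file simply extends GENERIC, and verbose
boot is requested from loader.conf:

```
# /usr/src/sys/amd64/conf/DIAG  (hypothetical file name)
include GENERIC
ident   DIAG
options DIAGNOSTIC

# /boot/loader.conf
boot_verbose="YES"        # same effect as "boot -v" at the loader prompt
#vm.boot_pages="128"      # optional: the workaround value reported above
```

The kernel would then be built and installed from /usr/src with
"make buildkernel KERNCONF=DIAG" followed by "make installkernel
KERNCONF=DIAG". Capturing the console from one boot with Cluster on Die
enabled and one with it disabled gives the two outputs to compare.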


