Date: Thu, 6 Jun 2019 10:50:16 -0400
From: Mark Johnston <markj@freebsd.org>
To: Mark Saad <nonesuch@longcount.org>
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: Kernel panic on 12-STABLE-r348203 amd64 E5-2690v4 with Cluster on die mode enabled
Message-ID: <20190606145016.GA4116@raichu>
In-Reply-To: <CAMXt9NYg5rK%2BjdAJKVwCaWGaE4GZ5W6Np3=0_RZQoz=%2B00uQxw@mail.gmail.com>
References: <CAMXt9NYg5rK%2BjdAJKVwCaWGaE4GZ5W6Np3=0_RZQoz=%2B00uQxw@mail.gmail.com>

On Thu, Jun 06, 2019 at 10:43:26AM -0400, Mark Saad wrote:
> All
> I posed this yesterday but; I am not sure what happened. Here is the
> short version.
> I received two new Dell r630's each with a E5-2690 v4 . The E5-2690 v4
> has 14 Cores in two packages on the one chip. I don't remember the
> exact topology however as a result the BIOS supports the NUMA / Memory
> mode know as Cluster on Die were each package on the one chip shows up
> as its own NUMA domain. The issue is this when enabled the box boots
> 12-RELEASE a-ok. When I rebuilt 12.0-STABLE-r348203 it would panic
> early in the boot process.
> Here is a dump of the console
> ===============================
> Loading kernel...
> /boot/kernel/kernel text=0x168d811 data=0x1cf968+0x768c80
> syms=[0x8+0x1778e8+0x8 /
> +0x194f1d]
> Loading configured modules...
> /boot/kernel/ipmi.ko size 0x11e10 at 0x2645000
> loading required module 'smbus'
> /boot/kernel/smbus.ko size 0x2ef0 at 0x2657000
> /boot/entropy size=0x1000
> /boot/kernel/cc_httcp.ko size 0x2330 at 0x265b000
> ---<<BOOT>>---c_hmodule 'smbus'
> Copyright (c) 1992-2019 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 12.0-STABLE r348693 GENERIC amd64
> FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on
> LLVM 8.0.0)
> panic: UMA zone "UMA Zones": Increase vm.boot_pages
> cpuid = 0
> time = 1
> KDB: stack backtrace:
> #0 0xffffffff80c16df7 at ??+0
> #1 0xffffffff80bcaccd at ??+0
> #2 0xffffffff80bcab23 at ??+0
> #3 0xffffffff80f0b03c at ??+0
> #4 0xffffffff80f08d8d at ??+0
> #5 0xffffffff80f0bb3d at ??+0
> #6 0xffffffff80f0b301 at ??+0
> #7 0xffffffff80f0b3d1 at ??+0
> #8 0xffffffff80f066c4 at ??+0
> #9 0xffffffff80f0543f at ??+0
> #10 0xffffffff80f23aef at ??+0
> #11 0xffffffff80f1133b at ??+0
> #12 0xffffffff80b619c8 at ??+0
> #13 0xffffffff8036a02c at ??+0
> Uptime: 1s
>
> ===============================
>
> The only solution was to mess with vm.boot_pages . I got it booted
> with 128 as the value.
> Also to be clear if I switched back to Home Snoop, Early Snoop the box
> is fine. Its only
> unhappy whit Cluster on Die and 12.0-STABLE .
>
> Anyone know whats going on ?

Could you build a kernel with "options DIAGNOSTIC" configured and boot in
verbose mode?  The kernel should print its boot page allocations to the
console.  Then, compare the output with a boot with cluster on die
disabled.
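
One way to do the comparison, once both verbose boots have been captured to text files, is a plain line-by-line diff. The sketch below is only an illustration: the function name and the "cod-enabled"/"cod-disabled" labels are made up here, and because the exact wording of the DIAGNOSTIC boot page messages is not shown in this thread, it does not try to filter for them.

```python
import difflib


def diff_boot_logs(cod_enabled, cod_disabled):
    """Return unified-diff lines between two captured verbose boot logs.

    Both arguments are iterables of text lines (for example, the result of
    open(path).readlines()). Lines prefixed with "+" appear only in the
    cluster-on-die boot; lines prefixed with "-" appear only in the boot
    with cluster on die disabled.
    """
    # Strip trailing whitespace and newlines so they do not show up as
    # spurious differences.
    enabled = [line.rstrip() for line in cod_enabled]
    disabled = [line.rstrip() for line in cod_disabled]
    return list(
        difflib.unified_diff(
            disabled,
            enabled,
            fromfile="cod-disabled",  # hypothetical label, not a real path
            tofile="cod-enabled",     # hypothetical label, not a real path
            lineterm="",
        )
    )
```

To use it, read the two saved console logs into lists of lines, pass them in, and print the returned lines; the boot page allocation messages that differ between the two boots should then stand out.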