Date: Fri, 05 Feb 2021 16:23:11 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 253272] Page fault in _mca_init during boot
Message-ID: <bug-253272-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253272

            Bug ID: 253272
           Summary: Page fault in _mca_init during boot
           Product: Base System
           Version: 12.2-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: asomers@FreeBSD.org

I saw the following panic during boot on a system running something close to
12.2-RELEASE. It doesn't happen every time. However, I suspect I've hit the
same bug a few other times without knowing it, because the kernel normally
reboots immediately since swap is not configured by this point.

Fatal trap 12: page fault while in kernel mode
cpuid = 26; apic id = 34
fault virtual address = 0xd0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8125a009
stack pointer = 0x28:0xfffffe0000b65f20
frame pointer = 0x28:0xfffffe0000b65f50
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu26)
trap number = 12
panic: page fault
cpuid = 26
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0000b65be0
vpanic() at vpanic+0x17b/frame 0xfffffe0000b65c30
panic() at panic+0x43/frame 0xfffffe0000b65c90
trap_fatal() at trap_fatal+0x391/frame 0xfffffe0000b65cf0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0000b65d40
trap() at trap+0x286/frame 0xfffffe0000b65e50
calltrap() at calltrap+0x8/frame 0xfffffe0000b65e50
--- trap 0xc, rip = 0xffffffff8125a009, rsp = 0xfffffe0000b65f20, rbp = 0xfffffe0000b65f50 ---
_mca_init() at _mca_init+0x5d9/frame 0xfffffe0000b65f50
init_secondary_tail() at init_secondary_tail+0xfd/frame 0xfffffe0000b65f80
init_secondary() at init_secondary+0x2d1/frame 0xfffffe0000b65ff0
KDB: enter: panic
[ thread pid 11 tid 100029 ]
Stopped at      kdb_enter+0x37: movq    $0,0x12bc1f6(%rip)

The bug occurs because only one of my two CPUs reports support for the
MCG_CMCI_P bit. On boot, it's random which CPU the kernel queries for support.
If it queries the wrong one, then it doesn't allocate memory for the cmc state,
but later calls cmci_setup() for the CPU that does support that bit. The
following command shows the asymmetry between the CPUs:

$ for x in $(jot $(sysctl -n hw.ncpu) 0) ; do sudo cpucontrol -m 0x179 /dev/cpuctl$x; done | uniq -c
  16 MSR 0x179: 0x00000000 0x0f000c14
  16 MSR 0x179: 0x00000000 0x0f000814

-- 
You are receiving this mail because:
You are the assignee for the bug.
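
For readers who want to see the asymmetry in the two MSR 0x179 (IA32_MCG_CAP) values from the report, here is a minimal sketch that decodes them. It assumes the Intel SDM layout, where bits 7:0 hold the bank count and bit 10 is the CMCI-present flag. The constant and function names are illustrative and are not taken from the FreeBSD sources.

```python
# Bit layout of IA32_MCG_CAP (MSR 0x179), assumed from the Intel SDM.
# Names below are illustrative, not the kernel's own identifiers.
MCG_CAP_COUNT_MASK = 0xFF   # bits 7:0, number of machine-check banks
MCG_CMCI_P = 1 << 10        # bit 10, corrected machine-check interrupt supported


def decode_mcg_cap(low32: int) -> dict:
    """Return the bank count and CMCI capability from the low 32 bits."""
    return {
        "banks": low32 & MCG_CAP_COUNT_MASK,
        "cmci": bool(low32 & MCG_CMCI_P),
    }


# The two low-word values printed by cpucontrol in the report.
for value in (0x0F000C14, 0x0F000814):
    info = decode_mcg_cap(value)
    print(f"{value:#010x}: banks={info['banks']} cmci={info['cmci']}")
```

The two values differ only in bit 10 (0x400), which matches the report's statement that only one of the two CPU packages advertises MCG_CMCI_P.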
