Date:      Fri, 05 Feb 2021 16:23:11 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 253272] Page fault in _mca_init during boot
Message-ID:  <bug-253272-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253272

            Bug ID: 253272
           Summary: Page fault in _mca_init during boot
           Product: Base System
           Version: 12.2-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: asomers@FreeBSD.org

I saw the following panic during boot on a system running something close to
12.2-RELEASE. It doesn't happen every time. However, I suspect I've hit the
same bug a few other times without noticing, because the kernel normally
reboots immediately, since swap is not configured by this point.

Fatal trap 12: page fault while in kernel mode
cpuid = 26; apic id = 34
fault virtual address = 0xd0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8125a009
stack pointer = 0x28:0xfffffe0000b65f20
frame pointer = 0x28:0xfffffe0000b65f50
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu26)
trap number = 12
panic: page fault
cpuid = 26
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0000b65be0
vpanic() at vpanic+0x17b/frame 0xfffffe0000b65c30
panic() at panic+0x43/frame 0xfffffe0000b65c90
trap_fatal() at trap_fatal+0x391/frame 0xfffffe0000b65cf0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0000b65d40
trap() at trap+0x286/frame 0xfffffe0000b65e50
calltrap() at calltrap+0x8/frame 0xfffffe0000b65e50
--- trap 0xc, rip = 0xffffffff8125a009, rsp = 0xfffffe0000b65f20, rbp = 0xfffffe0000b65f50 ---
_mca_init() at _mca_init+0x5d9/frame 0xfffffe0000b65f50
init_secondary_tail() at init_secondary_tail+0xfd/frame 0xfffffe0000b65f80
init_secondary() at init_secondary+0x2d1/frame 0xfffffe0000b65ff0
KDB: enter: panic
[ thread pid 11 tid 100029 ]
Stopped at kdb_enter+0x37: movq $0,0x12bc1f6(%rip)

The bug occurs because only one of my two CPUs reports support for the
MCG_CMCI_P bit. On boot, it's random which CPU the kernel queries for support.
If it queries the wrong one, then it doesn't allocate memory for the cmc
state, but later calls cmci_setup() for the CPU that does support that bit.
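
The failure mode described above can be modeled in a few lines. This is a toy
sketch, not the FreeBSD code: allocate_state(), per_cpu_init(),
per_cpu_init_guarded() and the CAPS table are hypothetical stand-ins, the CPU
indices are illustrative, and Python's None plays the role of the unallocated
pointer. It assumes the allocation decision is taken from a single CPU while
each CPU later consults its own capability bit. The guarded variant is only
one possible defensive check, not necessarily how the bug should be fixed.

```python
# Toy model of the asymmetry described above; all names are hypothetical.
MCG_CAP_CMCI_P = 1 << 10  # assumed position of the CMCI-present bit

# One capability value per CPU package, copied from the cpucontrol output
# shown later in this report. The keys 0 and 1 are illustrative indices.
CAPS = {0: 0x0F000C14, 1: 0x0F000814}


def allocate_state(queried_cpu):
    """Allocate per-CPU state only if the queried CPU reports CMCI_P."""
    if CAPS[queried_cpu] & MCG_CAP_CMCI_P:
        return {cpu: [] for cpu in CAPS}
    return None  # nothing allocated


def per_cpu_init(cpu, state):
    """Unguarded: trusts this CPU's own bit and touches the shared state."""
    if CAPS[cpu] & MCG_CAP_CMCI_P:
        state[cpu].append("cmci enabled")  # fails when state is None


def per_cpu_init_guarded(cpu, state):
    """Guarded: also checks that the state was actually allocated."""
    if state is not None and CAPS[cpu] & MCG_CAP_CMCI_P:
        state[cpu].append("cmci enabled")
        return True
    return False


if __name__ == "__main__":
    state = allocate_state(queried_cpu=1)  # the CPU without CMCI_P is asked
    try:
        per_cpu_init(0, state)  # CPU 0 has the bit, but state is None
    except TypeError as exc:
        print("unguarded init failed:", exc)
    print("guarded init ran:", per_cpu_init_guarded(0, state))
```

In this model the failure only appears when the CPU lacking the bit is the one
queried at allocation time, which matches the intermittent nature of the panic.
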
The following command shows the asymmetry between the CPUs:

$ for x in $(jot $(sysctl -n hw.ncpu) 0) ; do sudo cpucontrol -m 0x179 /dev/cpuctl$x; done | uniq -c
16 MSR 0x179: 0x00000000 0x0f000c14
16 MSR 0x179: 0x00000000 0x0f000814
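
The two values differ in a single bit. As a minimal sketch, assuming the Intel
SDM layout of IA32_MCG_CAP (MSR 0x179), where bits 7:0 hold the bank count and
bit 10 is MCG_CMCI_P, and assuming cpucontrol prints the high 32 bits first,
the output above can be decoded as follows. The helper names
parse_cpucontrol_line() and decode_mcg_cap() are made up for illustration.

```python
# Decode the IA32_MCG_CAP (MSR 0x179) values printed by cpucontrol above.
# Assumed layout (Intel SDM): bits 7:0 = bank count, bit 10 = MCG_CMCI_P.

MCG_CAP_COUNT_MASK = 0xFF   # bits 7:0: number of MCA banks
MCG_CAP_CMCI_P = 1 << 10    # bit 10: CMCI supported


def parse_cpucontrol_line(line):
    """Turn '[count] MSR 0x179: <high32> <low32>' into one 64-bit integer."""
    # Everything after the first colon is the two 32-bit halves, so a
    # leading "uniq -c" count in front of "MSR" is ignored.
    _, values = line.split(":", 1)
    high, low = (int(v, 16) for v in values.split())
    return (high << 32) | low


def decode_mcg_cap(value):
    """Return (bank_count, cmci_supported) for an MCG_CAP value."""
    return value & MCG_CAP_COUNT_MASK, bool(value & MCG_CAP_CMCI_P)


if __name__ == "__main__":
    for line in ("16 MSR 0x179: 0x00000000 0x0f000c14",
                 "16 MSR 0x179: 0x00000000 0x0f000814"):
        value = parse_cpucontrol_line(line)
        banks, cmci = decode_mcg_cap(value)
        print(f"{value:#018x}: banks={banks} CMCI_P={cmci}")
```

Under these assumptions, the first group of 16 logical CPUs reports CMCI
support and the second group does not, while both report the same bank count.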
