Date:      Thu, 4 Feb 2021 13:34:13 -0800
From:      Matthew Macy <mmacy@freebsd.org>
To:        Alan Somers <asomers@freebsd.org>
Cc:        FreeBSD Stable ML <stable@freebsd.org>
Subject:   Re: Page fault in _mca_init during startup
Message-ID:  <CAPrugNofKuCZmdkb41j%2Bu%2BX0BPV-cK8WjgrBu7akuD=XezseMw@mail.gmail.com>
In-Reply-To: <CAOtMX2imwP3x-8LBKGFvMJ%2BjuD%2BsH_02yzs9XvMcCHY=jJs86A@mail.gmail.com>
References:  <CAOtMX2imwP3x-8LBKGFvMJ%2BjuD%2BsH_02yzs9XvMcCHY=jJs86A@mail.gmail.com>

On Thu, Feb 4, 2021 at 1:31 PM Alan Somers <asomers@freebsd.org> wrote:
>
> After upgrading a machine to FreeBSD 12.2, it hit the following panic on
> its first reboot.  I suspect that a few other servers have hit this too,
> but since it happens before swap is mounted there are no core dumps, and
> they usually reboot immediately.  The code in question hasn't changed since
> 2018.  The panic happened in cmci_monitor at line 930.  Does anybody have
> any suggestions for how I could debug further?  I can't readily reproduce
> it, and I can't dump core, but I'd like to investigate it any way I can.
> The server in question has dual Xeon Gold 6142 CPUs.
>

I can't actually help :( but I can add a +1 with similar hardware or
equivalent specs. It's not frequent, but it's often enough to be
annoying.
-M

>     if (!(ctl & MC_CTL2_CMCI_EN))
>         /* This bank does not support CMCI. */
>         return;
>
>     cc = &cmc_state[PCPU_GET(cpuid)][i];    // <- panic here
>
>     /* Determine maximum threshold. */
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 26; apic id = 34
> fault virtual address = 0xd0
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff8125a009
> stack pointer        = 0x28:0xfffffe0000b65f20
> frame pointer        = 0x28:0xfffffe0000b65f50
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = resume, IOPL = 0
> current process = 11 (idle: cpu26)
> trap number = 12
> panic: page fault
> cpuid = 26
> time = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0000b65be0
> vpanic() at vpanic+0x17b/frame 0xfffffe0000b65c30
> panic() at panic+0x43/frame 0xfffffe0000b65c90
> trap_fatal() at trap_fatal+0x391/frame 0xfffffe0000b65cf0
> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0000b65d40
> trap() at trap+0x286/frame 0xfffffe0000b65e50
> calltrap() at calltrap+0x8/frame 0xfffffe0000b65e50
> --- trap 0xc, rip = 0xffffffff8125a009, rsp = 0xfffffe0000b65f20, rbp = 0xfffffe0000b65f50 ---
> _mca_init() at _mca_init+0x5d9/frame 0xfffffe0000b65f50
> init_secondary_tail() at init_secondary_tail+0xfd/frame 0xfffffe0000b65f80
> init_secondary() at init_secondary+0x2d1/frame 0xfffffe0000b65ff0
> KDB: enter: panic
> [ thread pid 11 tid 100029 ]
> Stopped at      kdb_enter+0x37: movq    $0,0x12bc1f6(%rip)
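
One observation on the trap output above, offered as a guess rather than a diagnosis: the fault virtual address is 0xd0, the faulting CPU is 26, and 26 * 8 = 208 = 0xd0. If cmc_state is a pointer to an array of per-CPU pointers (an assumption; its declaration is not quoted in this thread) and that pointer was still NULL when this CPU reached the marked line, then loading cmc_state[26] with 8-byte pointers would read from exactly that address. The sketch below only reproduces the arithmetic; row_slot_address is a hypothetical helper, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: address that is read when loading table[cpu],
 * the first step of evaluating table[cpu][bank], ASSUMING table is a
 * pointer to an array of row pointers.  "base" is the value of the
 * table pointer itself and "ptr_size" is the size of one row pointer
 * (8 on amd64).
 */
static uintptr_t
row_slot_address(uintptr_t base, unsigned int cpu, size_t ptr_size)
{
	assert(ptr_size > 0);
	return (base + (uintptr_t)cpu * ptr_size);
}
```

Calling row_slot_address(0, 26, 8) models a NULL table pointer on CPU 26 with 8-byte pointers, which is the combination that would match the reported fault address. It does not show why the pointer would be NULL at that point.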


