From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 247432] panic: general protection fault in ucp_start_pmc for uncore on E5504 processor
Date: Fri, 19 Jun 2020 20:15:39 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247432

            Bug ID: 247432
           Summary: panic: general protection fault in ucp_start_pmc for
                    uncore on E5504 processor
           Product: Base System
           Version: 12.1-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: dgmorris@earthlink.net

Internal Dell FreeBSD-based product testing
includes a pmc test that among other things does:

    for i in $(pmccontrol -L | grep -v -e "IAF" -e "IAP" -e "TSC" -e "UNC" \
        -e "UCF" -e "UCP" -e "SOFT"); do
        pmcstat -p $i ls
        process_cnt=`echo $?`
        # Error 71 is returned if counter is system specific and
        # not process specific so skip then
        if [ $process_cnt -ne 0 ] && [ $process_cnt -ne 71 ]; then
            atf_fail "PMC counter not working"
        fi
    done

This produces a panic on E5504 processor systems.

Reproducing locally to narrow it down, it became apparent that the uncore
options are triggering the panic:

    mem_uncore_retired.local_dram
    mem_uncore_retired.other_core_l2_hitm
    mem_uncore_retired.remote_cache_local_home_hit
    mem_uncore_retired.remote_dram
    mem_uncore_retired.uncacheable

Panic information:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff82c30604
stack pointer           = 0x28:0xfffffe0044204640
frame pointer           = 0x28:0xfffffe0044204640
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 1115 (pmcstat)
trap number             = 9
panic: general protection fault
cpuid = 0
time = 1592596633
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0044204350
vpanic() at vpanic+0x19d/frame 0xfffffe00442043a0
panic() at panic+0x43/frame 0xfffffe0044204400
trap_fatal() at trap_fatal+0x39c/frame 0xfffffe0044204460
trap() at trap+0x6c/frame 0xfffffe0044204570
calltrap() at calltrap+0x8/frame 0xfffffe0044204570
--- trap 0x9, rip = 0xffffffff82c30604, rsp = 0xfffffe0044204640, rbp = 0xfffffe0044204640 ---
ucp_start_pmc() at ucp_start_pmc+0xa4/frame 0xfffffe0044204640
pmc_hook_handler() at pmc_hook_handler+0xfda/frame 0xfffffe0044204700
sched_switch() at sched_switch+0x691/frame 0xfffffe00442047d0
mi_switch() at mi_switch+0xe2/frame 0xfffffe0044204800
sleepq_catch_signals() at sleepq_catch_signals+0x425/frame 0xfffffe0044204850
sleepq_wait_sig() at sleepq_wait_sig+0xf/frame 0xfffffe0044204880
_sleep() at _sleep+0x23a/frame 0xfffffe00442048f0
sbwait() at sbwait+0x4c/frame 0xfffffe0044204910
soreceive_generic() at soreceive_generic+0x286/frame 0xfffffe00442049e0
soreceive() at soreceive+0x44/frame 0xfffffe0044204a00
dofileread() at dofileread+0x95/frame 0xfffffe0044204a40
sys_read() at sys_read+0xc1/frame 0xfffffe0044204ab0
amd64_syscall() at amd64_syscall+0x364/frame 0xfffffe0044204bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0044204bf0
--- syscall (3, FreeBSD ELF64, sys_read), rip = 0x80095adfa, rsp = 0x7fffffffe3f8, rbp = 0x7fffffffe470 ---
Uptime: 1m4s
Dumping 435 out of 6085 MB:..4%..12%..23%..34%..41%..52%..63%..74%..81%..92%

__curthread () at /usr/src/sys/amd64/include/pcpu.h:234
234             __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD));
(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu.h:234
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:371
#2  0xffffffff80bdf95d in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:451
#3  0xffffffff80bdfde9 in vpanic (fmt=, ap=)
    at /usr/src/sys/kern/kern_shutdown.c:877
#4  0xffffffff80bdfbe3 in panic (fmt=)
    at /usr/src/sys/kern/kern_shutdown.c:804
#5  0xffffffff810c93cc in trap_fatal (frame=0xfffffe0044204580, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:943
#6  0xffffffff810c87dc in trap (frame=0xfffffe0044204580)
    at /usr/src/sys/amd64/amd64/trap.c:221
#7
#8  0xffffffff82c30604 in wrmsr (msr=960, newval=)
    at /usr/src/sys/amd64/include/cpufunc.h:433
#9  ucp_start_pmc (cpu=, ri=0)
    at /usr/src/sys/dev/hwpmc/hwpmc_uncore.c:707
#10 0xffffffff82c2556a in pmc_process_csw_in (td=)
    at /usr/src/sys/dev/hwpmc/hwpmc_mod.c:1492
#11 pmc_hook_handler (td=0xfffff80009bf75e0, function=, arg=)
    at /usr/src/sys/dev/hwpmc/hwpmc_mod.c:2210
#12 0xffffffff80c119f1 in sched_switch (td=0xfffff80009bf75e0, newtd=, flags=) at
    /usr/src/sys/kern/sched_ule.c:2120
#13 0xffffffff80beb922 in mi_switch (flags=260, newtd=0x0)
    at /usr/src/sys/kern/kern_synch.c:452
#14 0xffffffff80c3c265 in sleepq_catch_signals (wchan=0xfffff800097c053c, pri=-1)
    at /usr/src/sys/kern/subr_sleepqueue.c:528
#15 0xffffffff80c3bd9f in sleepq_wait_sig (wchan=0xfffff8000fbaf500, pri=0)
    at /usr/src/sys/kern/subr_sleepqueue.c:719
#16 0xffffffff80beb34a in _sleep (ident=0xfffff800097c053c,
    lock=0xfffff800097c04c0, priority=360, wmesg=0xffffffff81258462 "sbwait",
    sbt=0, pr=0, flags=0)
    at /usr/src/sys/kern/kern_synch.c:215
#17 0xffffffff80c77cec in sbwait (sb=0x100000000)
    at /usr/src/sys/kern/uipc_sockbuf.c:267
#18 0xffffffff80c7d176 in soreceive_generic (so=, psa=0x0,
    uio=0xfffffe0044204a50, mp0=0x0, controlp=0x0, flagsp=0x0)
    at /usr/src/sys/kern/uipc_socket.c:1813
#19 0xffffffff80c7ef94 in soreceive (so=0xfffff8000fbaf500, psa=0x100000000,
    uio=0x0, mp0=0x3c0, controlp=0x43200f, flagsp=0x0)
    at /usr/src/sys/kern/uipc_socket.c:2563
#20 0xffffffff80c4c505 in fo_read (fp=, uio=, active_cred=0x0, flags=, td=)
    at /usr/src/sys/sys/file.h:313
#21 dofileread (td=, fd=5, fp=, auio=0xfffffe0044204a50, offset=5, flags=)
    at /usr/src/sys/kern/sys_generic.c:368
#22 0xffffffff80c4c081 in kern_readv (td=, fd=5, auio=)
    at /usr/src/sys/kern/sys_generic.c:289
#23 sys_read (td=0xfffff80009bf75e0, uap=)
    at /usr/src/sys/kern/sys_generic.c:205
#24 0xffffffff810c9f84 in syscallenter (td=0xfffff80009bf75e0)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#25 amd64_syscall (td=0xfffff80009bf75e0, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1186
#26
#27 0x000000080095adfa in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffe3f8
(kgdb) frame 9
#9  ucp_start_pmc (cpu=, ri=0)
    at /usr/src/sys/dev/hwpmc/hwpmc_uncore.c:707
707             wrmsr(SELECTSEL(uncore_cputype) + ri, evsel);
(kgdb) p ri
$1 = 0
(kgdb) p uncore_cputype
$2 = PMC_CPU_INTEL_COREI7
(kgdb) p evsel
$3 = 4399119
(kgdb) p/x evsel
$4 = 0x43200f

Note that the 960 passed to wrmsr correctly corresponds to 0x3c0
(UCP_EVSEL0), which is what SELECTSEL(PMC_CPU_INTEL_COREI7) should return.

This reproduces 100% of the time for me on a Z600 Workstation with:

CPU: Intel(R) Xeon(R) CPU E5504 @ 2.00GHz (1995.04-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x106a5  Family=0x6  Model=0x1a  Stepping=5
  Features=0xbfebfbff
  Features2=0x9ce3bd
  AMD Features=0x28100800
  AMD Features2=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,VPID
  TSC: P-state invariant, performance statistics

I suspect it reproduces on any other E5504 system as well. This is a dual
socket motherboard with a single socket populated, but based on the Intel
Software Manuals, the uncore logic should be within the package - so I don't
think that should matter (just reporting it in case it rings a bell).

Older hardware, I know - but figured it was worth reporting.
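
For anyone who wants to double-check the kgdb values in this report, here is a small shell sketch that redoes the arithmetic. The split into an event-select byte (bits 0-7) and a unit-mask byte (bits 8-15) is an assumption based on the usual Intel event-select register layout; the bits above 15 are only listed, not interpreted, since their meaning depends on which register class the value was meant for. The variable names are made up for this example and do not come from hwpmc.

```shell
#!/bin/sh
# Values copied from the kgdb session in this report.
msr=960          # first argument to wrmsr() in frame #8
evsel=4399119    # result of "p evsel" in frame #9

# 960 decimal should print as 0x3c0, the UCP_EVSEL0 address named above.
msr_hex=$(printf '0x%x' "$msr")

# ASSUMPTION: event select lives in bits 0-7 and unit mask in bits 8-15.
event=$(printf '0x%02x' $(( evsel & 0xff )))
umask=$(printf '0x%02x' $(( (evsel >> 8) & 0xff )))

# Everything above bit 15, deliberately left undecoded.
upper=$(printf '0x%x' $(( evsel >> 16 )))

# Collect the absolute positions of the bits set in the range 16-31.
set_bits=""
bit=16
while [ "$bit" -lt 32 ]; do
    if [ $(( (evsel >> bit) & 1 )) -eq 1 ]; then
        set_bits="${set_bits}${set_bits:+ }${bit}"
    fi
    bit=$((bit + 1))
done

echo "msr=$msr_hex event=$event umask=$umask upper=$upper set_bits=$set_bits"
```

This only does integer arithmetic, so it is safe to run anywhere. It does not touch any MSR or load hwpmc. Comparing the listed upper bits against the register definition in the Intel manuals for this CPU might be a reasonable next step in narrowing down why the write faults.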