Date: Wed, 12 Mar 2008 03:07:51 GMT
From: Adrian Chadd <adrian@FreeBSD.org>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: kern/121660: hwpmc(4) incorrectly handles PMC sampling events from AMD
Message-ID: <200803120307.m2C37pO7006394@jacinta.home.cacheboy.net>
Resent-Message-ID: <200803130400.m2D407SR023747@freefall.freebsd.org>
>Number: 121660
>Category: kern
>Synopsis: hwpmc(4) incorrectly handles PMC sampling events from AMD
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Thu Mar 13 04:00:07 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator: Adrian Chadd
>Release: FreeBSD 8.0-CURRENT i386
>Organization: FreeBSD
>Environment:
System: FreeBSD jacinta.home.cacheboy.net 8.0-CURRENT FreeBSD 8.0-CURRENT #5: Sun Mar 9 19:34:11 UTC 2008 adrian@jacinta.home.cacheboy.net:/data/1/obj/usr/src/sys/JACINTA i386

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-CURRENT #5: Sun Mar 9 19:34:11 UTC 2008
adrian@jacinta.home.cacheboy.net:/data/1/obj/usr/src/sys/JACINTA
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) XP 1800+ (1540.49-MHz 686-class CPU)
Origin = "AuthenticAMD" Id = 0x681 Stepping = 1
Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
AMD Features=0xc0400800<SYSCALL,MMX+,3DNow!+,3DNow!>
real memory = 2147418112 (2047 MB)
avail memory = 2093957120 (1996 MB)
MPTable: <OEM00000 PROD00000000>
ioapic0: Assuming intbase of 0
ioapic0 <Version 0.3> irqs 0-23 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
cpu0 on motherboard
pcib0: <MPTable Host-PCI bridge> pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
agp0: <VIA 8377 (Apollo KT400/KT400A/KT600) host to PCI bridge> on hostb0
agp0: aperture size is 256M
pcib1: <MPTable PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
vgapci0: <VGA-compatible display> mem 0xd8000000-0xd9ffffff,0xda000000-0xda003fff,0xdb000000-0xdb7fffff irq 16 at device 0.0 on pci1
em0: <Intel(R) PRO/1000 Network Connection 6.8.4> port 0xd000-0xd03f mem 0xde020000-0xde03ffff,0xde000000-0xde01ffff irq 19 at device 11.0 on pci0
em0: [FILTER]
em0: Ethernet address: 00:0e:0c:b9:4c:f9
uhci0: <VIA 83C572 USB controller> port 0xd400-0xd41f irq 21 at device 16.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: <VIA 83C572 USB controller> on uhci0
usb0: USB revision 1.0
uhub0: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: <VIA 83C572 USB controller> port 0xd800-0xd81f irq 21 at device 16.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: <VIA 83C572 USB controller> on uhci1
usb1: USB revision 1.0
uhub1: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: <VIA 83C572 USB controller> port 0xdc00-0xdc1f irq 21 at device 16.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: <VIA 83C572 USB controller> on uhci2
usb2: USB revision 1.0
uhub2: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: <VIA VT6202 USB 2.0 controller> mem 0xde040000-0xde0400ff irq 19 at device 16.3 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: <VIA VT6202 USB 2.0 controller> on ehci0
usb3: USB revision 2.0
uhub3: <VIA EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb3
uhub3: 6 ports with 6 removable, self powered
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 8235 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 17.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
pci0: <multimedia, audio> at device 17.5 (no driver attached)
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff pnpid ORM0000 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: [FILTER]
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: <Parallel port bus> on ppc0
ppbus0: [ITHREAD]
plip0: <PLIP network interface> on ppbus0
plip0: WARNING: using obsoleted IFF_NEEDSGIANT flag
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
ppc0: [GIANT-LOCKED]
ppc0: [ITHREAD]
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio0: [FILTER]
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
sio1: [FILTER]
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
unknown: <PNP0303> can't assign resources (port)
unknown: <PNP0c01> can't assign resources (memory)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0700> can't assign resources (port)
unknown: <PNP0400> can't assign resources (port)
unknown: <PNP0501> can't assign resources (port)
Timecounter "TSC" frequency 1540490006 Hz quality 800
Timecounters tick every 1.000 msec
ad0: 38166MB <Seagate ST340016A 3.10> at ata0-master UDMA100
Trying to mount root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
WARNING: /data/1 was not properly dismounted
WARNING: /home was not properly dismounted
WARNING: /var was not properly dismounted
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, rule-based forwarding disabled, default to deny, logging disabled
>Description:
hwpmc returns nothing from a user-mode sample (pmcstat -P instructions -O sample.out -t <pid>)

The hwpmc
registers are 48 bits wide on at least the Athlon XP platform (and the code sets the width value in the class for all AMD CPUs to 48, not 64). Sampling is done by counting upwards until the counter wraps and generates an interrupt. The code uses a two's complement trick to turn the sample period into a counter value useful for generating an NMI. It does this on a 64 bit value, but as the counters are 48 bit, the value reads back as 48 bits with the high 16 bits set to 0, and the virtual PMC code quickly loses track.

My email to -current has more info:

Between my Athlon XP box giving me no useful pmc stats and my new Core 2 Duo box not even working with pmc, I decided to poke at the Athlon XP support a bit to see if I could figure out what was going on.

It seems that at least my revision of the Athlon XP has 48 bit performance counters (AMD Athlon Processor x86 Code Optimisation Guide, page 235, "Performance-Monitoring Counters: Overview") and the top 16 bits read back 0x0000. Since the code takes the two's complement of the stored PMC value (so that the value is incremented to 0xffffffffffffffff and wraps over, generating an NMI, as mentioned on page 240), negating the value read back gives humorous results:

(Note: some of these are my own debugging information.)

Mar 9 16:09:43 jacinta kernel: hwpmc: TSC/1/0x20<REA> K7/4/0x1ff<INT,USR,SYS,EDG,THR,REA,WRI,INV,QUA>
Mar 9 16:10:02 jacinta kernel: MDP:SWO:1: pc=0xc5814180 pp=0 enable-msr=0
Mar 9 16:10:02 jacinta kernel: local initial: ri 1: 65536
Mar 9 16:10:02 jacinta kernel: MDP:SWO:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0
Mar 9 16:10:02 jacinta kernel: csw_in: ri 1; pmcval 65536
Mar 9 16:10:02 jacinta kernel: MDP:WRI:1: amd-write cpu=0 ri=1 v=ffffffffffff0000
Mar 9 16:10:02 jacinta kernel: MDP:SWI:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0
Mar 9 16:10:02 jacinta kernel: MDP:REA:1: amd-read id=1 class=1
Mar 9 16:10:02 jacinta kernel: MDP:REA:2: amd-read id=1 -> ffff00000000ff01
Mar 9 16:10:02 jacinta kernel: read: ffff00000000ff01; saved 10000; diff -281474976710911
Mar 9 16:10:02 jacinta kernel: csw_out: ri 1: pp_pmcval 65536..
Mar 9 16:10:02 jacinta kernel: csw_out: ... ri 1: pp_pmcval now 281474976710911..
Mar 9 16:10:02 jacinta kernel: MDP:SWO:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0
Mar 9 16:10:02 jacinta kernel: csw_in: ri 1; pmcval 281474976710911
Mar 9 16:10:02 jacinta kernel: MDP:WRI:1: amd-write cpu=0 ri=1 v=fffeffffffffff01
Mar 9 16:10:02 jacinta kernel: MDP:SWI:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0
Mar 9 16:10:02 jacinta kernel: MDP:REA:1: amd-read id=1 class=1
Mar 9 16:10:02 jacinta kernel: MDP:REA:2: amd-read id=1 -> ffff00000000f47f
Mar 9 16:10:02 jacinta kernel: read: ffff00000000f47f; saved 10000000000ff; diff -562949953358976
Mar 9 16:10:02 jacinta kernel: csw_out: ri 1: pp_pmcval 281474976710911..
Mar 9 16:10:02 jacinta kernel: csw_out: ... ri 1: pp_pmcval now 844424930004351..
>How-To-Repeat:
pmcstat -P instructions -O sample.out -t pid
>Fix:
This attempts to "pretend" to be the expected value, and it begins recording sample events in the above test, but I don't believe it's correct. If the value rolls over somehow then we'll be OR'ing in high bits inappropriately. I think it should be a sign-extend rather than my OR operation.

Index: hwpmc_amd.c
===================================================================
RCS file: /share/FreeBSD/cvsrepo/src/sys/dev/hwpmc/hwpmc_amd.c,v
retrieving revision 1.14
diff -u -r1.14 hwpmc_amd.c
--- hwpmc_amd.c	7 Dec 2007 08:20:15 -0000	1.14
+++ hwpmc_amd.c	10 Mar 2008 12:06:49 -0000
@@ -303,7 +303,12 @@
 	tmp = rdmsr(pd->pm_perfctr); /* RDMSR serializes */
 	if (PMC_IS_SAMPLING_MODE(mode))
-		*v = AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(tmp);
+		/*
+		 * The counters are 48 bit - so we need to "pretend" the 48 bit value
+		 * is 64 bit for the 2s compliment conversion to convert correctly.
+		 * I don't think this is "correct" answer!
+		 */
+		*v = AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(tmp | 0xffff000000000000);
 	else
 		*v = tmp;
>Release-Note:
>Audit-Trail:
>Unformatted: