Date: Thu, 07 May 2015 18:10:21 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 198149] [hwpmc] pmcstat -P -t (top mode, process sampling) stops after a while
Message-ID: <bug-198149-8-9DSbrQme3L@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-198149-8@https.bugs.freebsd.org/bugzilla/>
References: <bug-198149-8@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198149

--- Comment #10 from John Baldwin <jhb@FreeBSD.org> ---
So I think I somewhat understand what is wrong. I'm not yet sure how to fix it.

What seems to happen is that on a context switch out, the read_pmc operation returns a very large value. As a result, the PMC gets set to a value large enough that it won't expire during the next slice. The error is compounded on every switch out/in, and the PMC stops firing.

Some snippets of KTR traces show the error in action:

238280 1 268934513552654 MDP:REA:1: iaf-read cpu=1 ri=2 msr=0x40000002 -> v=5dd31
238271 1 268934513429846 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=4fffffff855b6 iafctrl=0 pmc=fffffff855b6
238262 1 268934513342102 MDP:REA:1: iaf-read cpu=1 ri=2 msr=0x40000002 -> v=2f2c
238247 1 268934510388202 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=2fffffffc9345 iafctrl=0 pmc=fffffffc9345
238238 1 268934510294742 MDP:REA:1: iaf-read cpu=1 ri=2 msr=0x40000002 -> v=4125
238229 1 268934510220562 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=1fffffffea910 iafctrl=0 pmc=fffffffea910
238220 1 268934510132922 MDP:REA:1: iaf-read cpu=1 ri=2 msr=0x40000002 -> v=fffffffffffe
238211 1 268934510048862 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=ffffffffd489 iafctrl=0 pmc=ffffffffd489
238202 1 268934509967030 MDP:REA:1: iaf-read cpu=1 ri=2 msr=0x40000002 -> v=b34d
238193 1 268934509880238 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=ffffffff109e iafctrl=0 pmc=ffffffff109e
238184 1 268934509789534 MDP:REA:1: iaf-read cpu=1 ri=2 msr=0x40000002 -> v=e848
238175 1 268934509749902 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=ffffffff942b iafctrl=0 pmc=ffffffff942b
238166 1 268934509673986 MDP:REA:1: iaf-read cpu=1 ri=2 msr=0x40000002 -> v=62dd
238157 1 268934508267090 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=ffffffff18a7 iafctrl=0 pmc=ffffffff18a7
238148 1 268934508103386 MDP:REA:1: iaf-read cpu=1 ri=2 msr=0x40000002 -> v=6d25

The error occurs at event 238220, when "-2" is converted to a large unsigned value. After this point, the PMC is programmed with progressively larger values on each switch in and never fires again. By the end of the trace, when I killed my test program, it was quite far off:

116541 1 268945752955406 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=ffed2e3cfd33 iafctrl=0 pmc=ffed2e3cfd33
116448 1 268945715324794 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=ffef3dd2e030 iafctrl=0 pmc=ffef3dd2e030
116337 1 268945421271906 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=fff2cff21be4 iafctrl=0 pmc=fff2cff21be4
116321 1 268945168850926 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=fff2defc2fec iafctrl=0 pmc=fff2defc2fec
116276 1 268944964260070 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=fff3210cd42b iafctrl=0 pmc=fff3210cd42b
116241 1 268944442945530 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=fff353fadde0 iafctrl=0 pmc=fff353fadde0
116207 1 268944442823210 MDP:WRI:1: iaf-write cpu=1 ri=2 msr=0x40000002 v=fff3fa4fc2e3 iafctrl=0 pmc=fff3fa4fc2e3
...

I'm not really sure where the error is. I think it might be that iap_perfctr_value_to_reload_count needs to sign extend its return value so that it can return -2 as the value of the PMC in this case, instead of what it returned.

Note that this seems specific to hwpmc_core.c. hwpmc_amd.c uses a different approach: it sign extends the value it reads from the PMC first and then negates it (which would have returned -2 in this case).

-- 
You are receiving this mail because:
You are the assignee for the bug.
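
To make the difference between the two conversions concrete, here is a minimal standalone sketch. It is not the hwpmc source. The function names are hypothetical, the 48-bit counter width is an assumption inferred from the 12-hex-digit values in the trace, and the "masked" variant is only a guess at a conversion that would turn a small overshoot into v=fffffffffffe. The "signed" variant follows the description of hwpmc_amd.c above: sign extend the raw value, then negate.

```c
#include <stdint.h>

/* Assumed counter width: trace values are 12 hex digits, i.e. 48 bits. */
#define CTR_WIDTH	48
#define CTR_MASK	((UINT64_C(1) << CTR_WIDTH) - 1)

/*
 * Hypothetical model of an unsigned conversion: the remaining count is
 * taken to be 2^W - raw.  A counter that has already run past zero
 * comes back as a huge positive value.
 */
static uint64_t
reload_count_masked(uint64_t raw)
{
	return ((UINT64_C(1) << CTR_WIDTH) - (raw & CTR_MASK));
}

/*
 * Sign extend the W-bit raw value to 64 bits, then negate.  A counter
 * that has run N events past zero comes back as -N.
 */
static int64_t
reload_count_signed(uint64_t raw)
{
	uint64_t v = raw & CTR_MASK;

	if (v & (UINT64_C(1) << (CTR_WIDTH - 1)))
		v |= ~CTR_MASK;		/* replicate the sign bit upward */
	return (-(int64_t)v);
}
```

Under these assumptions, a raw counter value of 2 (two events past overflow) gives 0xfffffffffffe from reload_count_masked() but -2 from reload_count_signed(). For a counter that has not yet overflowed, such as 0xffffffffd489, both return 0x2b77.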