From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 198149] [hwpmc] pmcstat -P -t (top mode, process sampling) stops after a while
Date: Fri, 01 May 2015 15:34:46 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 11.0-CURRENT
X-Bugzilla-Who: jhb@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Changed-Fields: cc

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198149

John Baldwin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jhb@FreeBSD.org

--- Comment #8 from John Baldwin ---
I can reproduce this trivially on real hardware. I use a dummy test app
(threadspin.c). On my x220 laptop, this usually breaks in a few seconds. Same
on a server (fox2 in the netperf lab). If I run it in a reduced cpuset to limit
the number of CPUs then pmcstat runs longer before it stops working. Pinning
the threads to specific CPUs does not fix the issue either.

I looked via kgdb at the software state of hwpmc when pmcstat stops reporting
samples and it seems to think that the PMC is enabled and should be working.
However, it seems that the PMC is no longer generating interrupts.

I've ported all of the PMC debug stuff over to KTR and added a hack so that KTR
auto-disables once a polling sample request fails. When run with all CPUs
there is simply too much noise in the ktrdump and I can't find the last sample
in my dump (it seems that the dump only contains events logged after interrupts
stopped working). I tried restricting to a smaller set of CPUs but that failed
to break. My plan is to try to find the smallest set of CPUs that do break
while KTR is active and then see if I can tease anything out from the ktrdump.

The changes to pmc to use KTR can be found at
https://github.com/freebsd/freebsd/compare/master...bsdjhb:pr_198149
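
The threadspin.c mentioned above is not attached to this message. For anyone
trying to reproduce the problem without it, the following is only a minimal
sketch of what such a spinner might look like, assuming it does nothing more
than keep a fixed number of threads busy in user space. The names spin() and
run_spinners() and the millisecond parameter are hypothetical and are not taken
from the original test app.

```c
/* Hypothetical stand-in for threadspin.c; not the original test app. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

struct spinner {
	pthread_t	 tid;
	atomic_bool	*stop;		/* stop flag shared by all threads */
	uint64_t	 iterations;	/* written only by the owning thread */
};

/* Thread body: burn CPU until the shared stop flag is set. */
static void *
spin(void *arg)
{
	struct spinner *s = arg;
	uint64_t n = 0;

	while (!atomic_load_explicit(s->stop, memory_order_relaxed))
		n++;
	s->iterations = n;
	return (NULL);
}

/*
 * Start up to nthreads spinning threads, let them run for roughly millis
 * milliseconds, then stop and join them.  Returns the total number of loop
 * iterations across all threads that were started (0 if none could be).
 */
static uint64_t
run_spinners(int nthreads, long millis)
{
	atomic_bool stop = false;
	struct spinner *s;
	struct timespec ts;
	uint64_t total = 0;
	int i, started = 0;

	assert(nthreads > 0 && millis >= 0);
	s = calloc((size_t)nthreads, sizeof(*s));
	if (s == NULL)
		return (0);

	for (i = 0; i < nthreads; i++) {
		s[i].stop = &stop;
		if (pthread_create(&s[i].tid, NULL, spin, &s[i]) != 0)
			break;
		started++;
	}

	/* The sampled workload is simply the threads spinning meanwhile. */
	ts.tv_sec = millis / 1000;
	ts.tv_nsec = (millis % 1000) * 1000000L;
	nanosleep(&ts, NULL);

	atomic_store(&stop, true);
	for (i = 0; i < started; i++) {
		pthread_join(s[i].tid, NULL);
		total += s[i].iterations;
	}
	free(s);
	return (total);
}
```

A main() that calls something like run_spinners(4, 60000) would give pmcstat a
steady multi-threaded load to sample. Running the resulting binary under
cpuset -l with a short CPU list is one way to approximate the reduced-CPU runs
described in the comment.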