From: Julian Elischer
Date: Mon, 14 Dec 2009 11:56:22 -0800
To: Ryan Stone
Cc: FreeBSD Current
Subject: Re: profiling kernel modules.
Message-ID: <4B269866.2070809@elischer.org>

Ryan Stone wrote:
> I find that the best way to profile the kernel is with pmc. You don't
> need to compile anything with a special option (other than including
> the hwpmc hooks in the kernel with the HWPMC_HOOKS option) so you can
> use it at any time on the same code you'll be shipping.
> pmc does statistical profiling; it uses whatever performance
> monitoring counters are provided by the hardware. It has a pretty low
> overhead, especially compared with other profiling techniques. It's
> really easy to use, too:

Thanks for all this.

BTW, I just tried the old kgmon/gprof profiling as a control. It
appears that it doesn't work on amd64: gprof can't read the file that
the kernel puts out. (Useful!)

>
> 1) If hwpmc is not compiled into your kernel, kldload hwpmc
> 2) Run pmcstat to begin taking samples (make sure that whatever you
> are profiling is busy doing work first!):
>
>     pmcstat -S unhalted-cycles -O /tmp/samples.out
>
> The -S option specifies what event you want to use to trigger
> sampling. The unhalted-cycles event is the best one to use if your
> hardware supports it; pmc will take a sample every 64K non-idle CPU
> cycles, which is basically equivalent to sampling based on time. If
> the unhalted-cycles event is not supported by your hardware then the
> instructions event will probably be the next best choice (although
> it's nowhere near as good, as it will not be able to tell you, for
> example, if a particular function is very expensive because it takes
> a lot of cache misses compared to the rest of your program). One
> caveat with the unhalted-cycles event is that time spent spinning on
> a spinlock or adaptively spinning on an MTX_DEF mutex will not be
> counted by this event, because most of the spinning time is spent
> executing an hlt instruction that idles the CPU for a short period of
> time.
>
> Modern Intel and AMD CPUs offer a dizzying array of events. They're
> mostly only useful if you suspect that a particular kind of event is
> hurting your performance and you would like to know what is causing
> those events. For example, if you suspect that data cache misses are
> causing you problems you can take samples on cache misses.
> Unfortunately, on some of the newer CPUs (namely the Core2 family,
> because that's what I'm doing most of my profiling on nowadays) I find
> it difficult to figure out just what event to use to profile based on
> cache misses. man pmc will give you an overview of pmc, and there are
> manpages for every CPU family supported (e.g. man pmc.core2).
>
> 3) After you've run pmcstat for "long enough" (a proper definition of
> long enough requires a statistician, which I most certainly am not,
> but I find that for a busy system 10 seconds is enough), Control-C it
> to stop it*. You can use pmcstat to post-process the samples into
> human-readable text:
>
>     pmcstat -R /tmp/samples.out -G /tmp/graph.txt
>
> The graph.txt file will show leaf functions on the left and their
> callers beneath them, indented to reflect the callchain. It's not too
> easy to describe and I don't have sample output available right now.
>
> Another interesting tool for post-processing the samples is
> pmcannotate. I've never actually used the tool before but it will
> annotate the program's source to show which lines are the most
> expensive. This of course needs unstripped modules to work. I think
> that it will also work if the GNU "debug link" is in the stripped
> module pointing to the location of the file with symbols.
>
> * Here's a tip I picked up from Joseph Koshy's blog: to collect
> samples for a fixed period of time (say 1 minute), have pmcstat run
> the sleep command:
>
>     pmcstat -S unhalted-cycles -O /tmp/samples.out sleep 60
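
For reference, the three steps quoted above (load hwpmc, sample for a
fixed period, post-process into a callgraph) could be strung together
roughly as follows. This is only a sketch: the run and profile_kernel
helpers and the DRY_RUN, EVENT, SAMPLES and GRAPH variables are
illustrative names, not part of pmcstat, and by default the script only
prints the commands it would run. The pmcstat and kldload invocations
themselves are the ones from the message.

```shell
#!/bin/sh
# Sketch of the pmcstat workflow from the quoted message.
# Commands are echoed, not executed, unless DRY_RUN=0 is set
# (DRY_RUN is a switch of this script, not a pmcstat feature).

EVENT="${EVENT:-unhalted-cycles}"       # sampling event used in the message
SAMPLES="${SAMPLES:-/tmp/samples.out}"  # raw sample log written by -O
GRAPH="${GRAPH:-/tmp/graph.txt}"        # callgraph text written by -G

# Execute the command, or just print it when in dry-run mode.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

profile_kernel() {
    seconds="${1:-60}"
    # Step 1: load hwpmc. Not needed if it is compiled into the kernel;
    # an error from kldload in that case is ignored by this script.
    run kldload hwpmc
    # Step 2: sample for a fixed period by having pmcstat run sleep.
    run pmcstat -S "$EVENT" -O "$SAMPLES" sleep "$seconds"
    # Step 3: post-process the samples into a human-readable callgraph.
    run pmcstat -R "$SAMPLES" -G "$GRAPH"
}

profile_kernel "${1:-60}"
```

Running it as root with DRY_RUN=0 and an argument of 10 would match the
"10 seconds on a busy system" suggestion; make sure the workload being
profiled is already busy before starting.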