From owner-cvs-src@FreeBSD.ORG Fri Oct 7 15:49:45 2005 Return-Path: X-Original-To: cvs-src@FreeBSD.org Delivered-To: cvs-src@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1E45016A41F; Fri, 7 Oct 2005 15:49:45 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailout2.pacific.net.au (mailout2.pacific.net.au [61.8.0.115]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7D57743D45; Fri, 7 Oct 2005 15:49:44 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87]) by mailout2.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id j97FnhKZ002272; Sat, 8 Oct 2005 01:49:43 +1000 Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailproxy2.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id j97Fnf8H006254; Sat, 8 Oct 2005 01:49:41 +1000 Date: Sat, 8 Oct 2005 01:49:41 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Bruce Evans In-Reply-To: <200510071059.j97AxgfL051248@repoman.freebsd.org> Message-ID: <20051008011750.E58285@delplex.bde.org> References: <200510071059.j97AxgfL051248@repoman.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org Subject: Re: cvs commit: src/usr.bin/gprof printgprof.c X-BeenThere: cvs-src@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: CVS commit messages for the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Oct 2005 15:49:45 -0000 On Fri, 7 Oct 2005, Bruce Evans wrote: > (2) If a function has a zero call count but has a nonzero self or child > time, then print its total self time in the self time per call > column as a percentage of the total (self + child) time. It is > not possible to print the times per call in this case because the > call count is zero. Previously, this was handled by leaving both > per-call columns blank. The self time is printed in another column > but there was no way to recover the total time. > ... > (2) improves mainly cases like kernel threads. Most kernel threads > appear to be never called because they are always started before > userland can run to turn on profiling. As for main(), the fact that > they are called is not very interesting and their callers are > uninteresting, but their relative self time is interesting since they > are long-running. gprof output for hi-res (should be normal) kernel profiling now looks like this: % granularity: each sample hit covers 16 byte(s) for 0.00% of 65.15 seconds % % % cumulative self self total % time seconds seconds calls ns/call ns/call name % 86.8 56.584 56.584 103196 548312 548412 sleepq_add [4] % 3.1 58.573 1.990 0 100.00% mcount [10] % 2.3 60.085 1.512 381 3967867 3967867 cpu_idle_default [15] % 1.8 61.244 1.159 0 100.00% mexitcount [16] % 0.8 61.741 0.497 0 100.00% cputime [23] % 0.5 62.063 0.321 0 100.00% user [29] % 0.3 62.247 0.184 9 20480664 31839385 vmtotal [34] % 0.3 62.421 0.174 60381 2876 2876 sse_pagecopy [42] % 0.2 62.526 0.105 108567 970 970 cpu_switch [48] % [Only lines with self% shown from here on -- mostly for kernel threads] % 0.0 65.037 0.002 0 12.21% bintr [120] % 0.0 65.098 0.001 0 100.00% __mcount [316] % 0.0 65.107 0.001 0 0.48% ithread_loop [39] % 0.0 65.111 0.001 0 60.26% scheduler [293] % 0.0 65.126 0.001 0 100.00% __mexitcount [361] % 0.0 65.147 0.000 0 0.11% random_kthread [77] % 0.0 65.150 0.000 0 0.24% sched_sync [156] % 0.0 65.151 0.000 0 0.00% idle_proc [13] % 0.0 65.152 0.000 0 0.15% buf_daemon [197] % 0.0 65.152 0.000 0 0.13% vnlru_proc [196] % 0.0 65.152 0.000 0 0.01% schedcpu_thread [60] % 0.0 65.153 0.000 0 0.12% vm_pageout [283] % 0.0 65.153 0.000 0 0.10% poll_idle [284] % 0.0 65.153 0.000 0 0.00% btrap [1] I forget what this was profiling. It was for a kernel long ago and far away from -current. All threads that make a function call show up as having a self% (even if the percentage is 0, since the profiling clock is an i386 TSC so the unscaled counts are nonzero since a function call takes more than 0 cycles). Leaf functions for profiling support always have a self usage of 100% because they are not profiled in the normal way so they appear to have no callers (and being leaves, have no children). "user" (userland) has a self usage of 100% because its callers are not seen and its children are handled specially under Xsyscall, btrap and bintr for historical and efficiency reasons. Bruce