Date: Sat, 8 Oct 2005 01:49:41 +1000 (EST) From: Bruce Evans <bde@zeta.org.au> To: Bruce Evans <bde@FreeBSD.org> Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org Subject: Re: cvs commit: src/usr.bin/gprof printgprof.c Message-ID: <20051008011750.E58285@delplex.bde.org> In-Reply-To: <200510071059.j97AxgfL051248@repoman.freebsd.org> References: <200510071059.j97AxgfL051248@repoman.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 7 Oct 2005, Bruce Evans wrote: > (2) If a function has a zero call count but has a nonzero self or child > time, then print its total self time in the self time per call > column as a percentage of the total (self + child) time. It is > not possible to print the times per call in this case because the > call count is zero. Previously, this was handled by leaving both > per-call columns blank. The self time is printed in another column > but there was no way to recover the total time. > ... > (2) improves mainly cases like kernel threads. Most kernel threads > appear to be never called because they are always started before > userland can run to turn on profiling. As for main(), the fact that > they are called is not very interesting and their callers are > uninteresting, but their relative self time is interesting since they > are long-running. gprof output for hi-res (should be normal) kernel profiling now looks like this: % granularity: each sample hit covers 16 byte(s) for 0.00% of 65.15 seconds % % % cumulative self self total % time seconds seconds calls ns/call ns/call name % 86.8 56.584 56.584 103196 548312 548412 sleepq_add [4] % 3.1 58.573 1.990 0 100.00% mcount [10] % 2.3 60.085 1.512 381 3967867 3967867 cpu_idle_default [15] % 1.8 61.244 1.159 0 100.00% mexitcount [16] % 0.8 61.741 0.497 0 100.00% cputime [23] % 0.5 62.063 0.321 0 100.00% user [29] % 0.3 62.247 0.184 9 20480664 31839385 vmtotal [34] % 0.3 62.421 0.174 60381 2876 2876 sse_pagecopy [42] % 0.2 62.526 0.105 108567 970 970 cpu_switch [48] % [Only lines with self% shown from here on -- mostly for kernel threads] % 0.0 65.037 0.002 0 12.21% bintr [120] % 0.0 65.098 0.001 0 100.00% __mcount [316] % 0.0 65.107 0.001 0 0.48% ithread_loop [39] % 0.0 65.111 0.001 0 60.26% scheduler [293] % 0.0 65.126 0.001 0 100.00% __mexitcount [361] % 0.0 65.147 0.000 0 0.11% random_kthread [77] % 0.0 65.150 0.000 0 0.24% sched_sync [156] % 0.0 65.151 0.000 0 0.00% idle_proc [13] % 0.0 65.152 0.000 0 0.15% buf_daemon [197] % 0.0 65.152 0.000 0 0.13% vnlru_proc [196] % 0.0 65.152 0.000 0 0.01% schedcpu_thread [60] % 0.0 65.153 0.000 0 0.12% vm_pageout [283] % 0.0 65.153 0.000 0 0.10% poll_idle [284] % 0.0 65.153 0.000 0 0.00% btrap [1] I forget what this was profiling. It was for a kernel long ago and far away from -current. All threads that make a function call show up as having a self% (even if the percentage is 0, since the profiling clock is an i386 TSC so the unscaled counts are nonzero since a function call takes more than 0 cycles). Leaf functions for profiling support always have a self usage of 100% because they are not profiled in the normal way so they appear to have no callers (and being leaves, have no children). "user" (userland) has a self usage of 100% because its callers are not seen and its children are handled specially under Xsyscall, btrap and bintr for historical and efficiency reasons. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20051008011750.E58285>