Date: Mon, 15 Dec 2025 09:05:34 -0800
From: Mark Millard <marklmi@yahoo.com>
To: Steve Kargl <kargls@comcast.net>
Cc: freebsd-hackers <freebsd-hackers@freebsd.org>
Subject: Re: profiling a user executable?
Message-ID: <B8A4B3EA-FB41-474C-B4BD-8722FC7C4AED@yahoo.com>
In-Reply-To: <fe1f5702-fcd0-4fea-bf34-be070713abae@comcast.net>
References: <12053856-4DE5-4B98-9309-028869BB5395.ref@yahoo.com> <12053856-4DE5-4B98-9309-028869BB5395@yahoo.com> <fe1f5702-fcd0-4fea-bf34-be070713abae@comcast.net>

On Dec 14, 2025, at 23:18, Steve Kargl <kargls@comcast.net> wrote:

> On 12/14/25 21:55, Mark Millard wrote:
>> Steve Kargl <kargls_at_comcast.net> wrote on
>> Date: Sat, 13 Dec 2025 20:12:12 UTC :
>>> On 12/12/25 14:14, Ahmad Khalifa wrote:
>>>> On Thu Dec 11, 2025 at 8:13 PM +0200, Steve Kargl wrote:
>>>>> In the days of yore, one could add the '-pg' option to
>>>>> the compilers options to generate profiling information,
>>>>> which could be consumed by gprof(1).
>>>>>
>>>>> FreeBSD stopped shipping libc_p.a, libm_p.m, etc
>>>>> (disabled in fe52b7f60ef4 and deleted in 3750ccefb8).
>>>>> This breaks all lang/gcc* ports if one uses '-pg'. It is
>>>>> not too difficult to fix lang/gcc* to avoid the missing
>>>>> *_p.a files, but this seems to lead to invalid *.gmon files.
>>>>> At least, for a Fortran application that I would like to
>>>>> profile (compiled with gfortran), procedures in my libfoo_p.a,
>>>>> appear in the profile, which I know with 100% certainty are
>>>>> not referenced.
>>>>>
>>>>> So, how does one in modern FreeBSD, as mere normal user,
>>>>> profile an executable? A google search suggests pmcstat(8)
>>>>> may be of use, but all attempts to use it lead to a usage
>>>>> message printed to the terminal. I'm simply trying to
>>>>> determine where my code is spending all of its time.
>>>>
>>>> Just throwing in another option, you can use dtrace's profile-n probes.
>>>>
>>>
>>> dtrace appears to be a useless for a mere user.
>>>
>>> % dtrace -n 'profile-99 /execname == "../../build/bin/tier -q"/ \
>>
>> As I remember, execname holds only the base name that had been given
>> to exec for the current thread/process. Also, it is not a way to run
>> a program. It is a way to select processes/threads that are running
>> a known-base-name of interest. It is DTrace variable in specific
>> probes, not all probes.
>>
>> As I remember, dtrace uses -c COMMAND notation to run the command
>> and exit once that command completes.
>>
>> Trying to deal with paths is much more involved and can involve things
>> like copyinstr(arg0) notation, arg0 being for the first argument to the
>> probe as the example.
>
> Unfortunately, dtrace requires root privilege, and so is
> a non-starter.
>
> I adapted your suggestions with pmcstat to my problem,
> and it seems promising.
>
> % pmcstat -O pmc.0 -P ex_ret_instr ../../build/bin/tier -q

Turns out that it looks like "ex_ret_instr" is about the number of
retired instructions related to function returns (popping a return
address off the stack). And, ls_not_halted_cyc is tracking the number
of cycles where the Load/Store unit is not stalled waiting for memory.
Not as analogous to aarch64 as I thought.

> % pmcstat -R pmc.0 -g
> % gprof ../../build/bin/tier ex_ret_instr/tier.gmon | head -10 | tail -8
> Each sample counts as 0.0078125 seconds.
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  36.68     13.15    13.15    13192     1.00     1.75  __spherem_MOD_sphere
>  23.06     21.41     8.27                             __pnam_MOD_pna_dble
>  16.85     27.45     6.04     2045     2.95     3.00  __sjnm_MOD_sjn_dble
>  10.08     31.07     3.61      693     5.21     5.34  __synm_MOD_syn_dble
>   8.02     33.94     2.88                             __sjnm_MOD_sjn_sngl

Looks like the empty "calls", self "ms/call", and total "ms/call"
columns might be indicating the lack of calls, despite the left
hand side time information. May be __sjnm_MOD_sjn_sngl or the
like is the closest prior symbol available for a "static"
(non-public) routine that is not published for linking?

> I know with 100% certainty that __sjnm_MOD_sjn_sngl is not
> referenced in the code as I wrote it. I'll note the above
> is similar to what 'gfortran -pg' produces.
>
> % pmcstat -R pmc.0 -G zxc.graph
> CONVERSION STATISTICS:
>  #exec/elf                     1
>  #samples/total                67133
>  #samples/unknown-function     1775
>  #callchain/dubious-frames     17
> % grep sjn_dble zxc.graph | wc -l
>      258
> % grep sjn_sngl zxc.graph | wc -l
>        0
>
> The callgraph shows that __sjnm_MOD_sjn_sngl is not used.
>
> My working conclusion is that gprof is simply broken. I'm
> still investigating what pmcstat can given me. Given the
> attempt to convert to a gprof file, hopefully I can get
> something like
>
> % pmcstat -R pmc.0 [some option(s)]
> cycles  cycles/cal  function
>  10000          90  __spherem_MOD_sphere
>  12345         191  __pnam_MOD_pna_dble
>   5433         400  __sjnm_MOD_sjn_dble
>  15000        1500  __synm_MOD_syn_dble
>
> This would tell me which routine(s) to look into for
> optimizations.
>

It probably gets back to if there is an event type that is
appropriate. ls_not_halted_cyc would not treat waiting-for-memory
time uniformly with load/store unit active time. But it would give
information related to if waiting for memory was an issue or not.

===
Mark Millard
marklmi at yahoo.com
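
The two grep checks against zxc.graph quoted above can be wrapped in a small helper so that several symbols are checked in one pass. This is only a sketch: the function name count_symbol_lines is made up for illustration, and it assumes a POSIX sh and a grep that supports -c and -F. Like "grep SYMBOL FILE | wc -l", it counts matching lines of the text file and does not interpret the callgraph format.

```shell
#!/bin/sh
# count_symbol_lines GRAPH SYMBOL...
# (hypothetical helper, not part of pmcstat or gprof)
# For each SYMBOL, print "SYMBOL COUNT", where COUNT is the number of
# lines in GRAPH that contain SYMBOL as a fixed string. This is the
# same number that `grep SYMBOL GRAPH | wc -l` reports.
count_symbol_lines() {
    graph=$1
    shift
    for sym in "$@"; do
        # grep -c still prints 0 when nothing matches, but exits with
        # status 1; `|| true` keeps that from aborting under `set -e`.
        n=$(grep -c -F -- "$sym" "$graph" || true)
        printf '%s %s\n' "$sym" "$n"
    done
}
```

For the case above it could be invoked as "count_symbol_lines zxc.graph sjn_dble sjn_sngl", which prints one line per symbol. A count of 0 for a symbol means that no line of the callgraph output mentions it.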
