Date:      Mon, 15 Dec 2025 09:05:34 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        Steve Kargl <kargls@comcast.net>
Cc:        freebsd-hackers <freebsd-hackers@freebsd.org>
Subject:   Re: profiling a user executable?
Message-ID:  <B8A4B3EA-FB41-474C-B4BD-8722FC7C4AED@yahoo.com>
In-Reply-To: <fe1f5702-fcd0-4fea-bf34-be070713abae@comcast.net>
References:  <12053856-4DE5-4B98-9309-028869BB5395.ref@yahoo.com> <12053856-4DE5-4B98-9309-028869BB5395@yahoo.com> <fe1f5702-fcd0-4fea-bf34-be070713abae@comcast.net>

On Dec 14, 2025, at 23:18, Steve Kargl <kargls@comcast.net> wrote:

> On 12/14/25 21:55, Mark Millard wrote:
>> Steve Kargl <kargls_at_comcast.net> wrote on
>> Date: Sat, 13 Dec 2025 20:12:12 UTC :
>>> On 12/12/25 14:14, Ahmad Khalifa wrote:
>>>> On Thu Dec 11, 2025 at 8:13 PM +0200, Steve Kargl wrote:
>>>>> In the days of yore, one could add the '-pg' option to
>>>>> the compiler's options to generate profiling information,
>>>>> which could be consumed by gprof(1).
>>>>> 
>>>>> FreeBSD stopped shipping libc_p.a, libm_p.a, etc.
>>>>> (disabled in fe52b7f60ef4 and deleted in 3750ccefb8).
>>>>> This breaks all lang/gcc* ports if one uses '-pg'. It is
>>>>> not too difficult to fix lang/gcc* to avoid the missing
>>>>> *_p.a files, but this seems to lead to invalid *.gmon files.
>>>>> At least, for a Fortran application that I would like to
>>>>> profile (compiled with gfortran), procedures in my libfoo_p.a,
>>>>> appear in the profile, which I know with 100% certainty are
>>>>> not referenced.
>>>>> 
>>>>> So, how does one in modern FreeBSD, as a mere normal user,
>>>>> profile an executable? A google search suggests pmcstat(8)
>>>>> may be of use, but all attempts to use it lead to a usage
>>>>> message printed to the terminal. I'm simply trying to
>>>>> determine where my code is spending all of its time.
>>>> 
>>>> Just throwing in another option, you can use dtrace's profile-n probes.
>>>> 
>>> 
>>> dtrace appears to be useless for a mere user.
>>> 
>>> % dtrace -n 'profile-99 /execname == "../../build/bin/tier -q"/ \
>> As I remember, execname holds only the base name that had been given
>> to exec for the current thread/process. Also, it is not a way to run
>> a program. It is a way to select processes/threads that are running
>> a known-base-name of interest. It is a DTrace variable available in
>> specific probes, not all probes.
>> As I remember, dtrace uses -c COMMAND notation to run the command
>> and exit once that command completes.
>> Trying to deal with paths is much more involved and can involve things
>> like copyinstr(arg0) notation, where arg0 is, for example, the first
>> argument to the probe.
> 
> Unfortunately, dtrace requires root privilege, and so is
> a non-starter.
> 
> I adapted your suggestions with pmcstat to my problem,
> and it seems promising.
> 
> % pmcstat -O pmc.0 -P ex_ret_instr ../../build/bin/tier -q

Turns out that it looks like "ex_ret_instr" is about the number of
retired instructions related to function returns (popping a return
address off the stack).

And, ls_not_halted_cyc is tracking the number of cycles where the
Load/Store unit is not stalled waiting for memory.

Not as analogous to aarch64 as I thought.

> % pmcstat -R pmc.0 -g
> % gprof ../../build/bin/tier ex_ret_instr/tier.gmon | head -10 | tail -8
> Each sample counts as 0.0078125 seconds.
>  %   cumulative   self           self     total
> time  seconds   seconds  calls  ms/call ms/call  name
> 36.68   13.15   13.15   13192  1.00     1.75  __spherem_MOD_sphere
> 23.06   21.41   8.27                          __pnam_MOD_pna_dble
> 16.85   27.45   6.04    2045   2.95     3.00  __sjnm_MOD_sjn_dble
> 10.08   31.07   3.61     693   5.21     5.34  __synm_MOD_syn_dble
>  8.02   33.94   2.88                          __sjnm_MOD_sjn_sngl

Looks like the empty "calls", self "ms/call", and
total "ms/call" columns might be indicating that no
calls were recorded, despite the time information
in the left-hand columns.

Maybe __sjnm_MOD_sjn_sngl or the like is the
closest preceding symbol available for a "static"
(non-public) routine whose own symbol is not
published for linking?
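
To make the rows with empty call columns easy to spot, here is a
minimal sketch that splits a gprof flat profile by whether a call
count is present. It assumes the layout shown in the quoted output:
7 whitespace-separated fields when the calls columns are filled in,
4 when they are blank. The function name flag_uncounted is made up
for illustration.

```shell
# Hypothetical helper: read a gprof flat profile on stdin and report,
# per routine, whether the "calls" column was filled in.
flag_uncounted() {
    awk '
        # Data rows start with a numeric percentage; headers do not.
        $1 ~ /^[0-9.]+$/ && NF == 4 { printf "%s\t%s\tno call count\n", $1, $4 }
        $1 ~ /^[0-9.]+$/ && NF == 7 { printf "%s\t%s\tcalls=%s\n", $1, $7, $4 }
    '
}
```

It could be used as
"gprof ../../build/bin/tier ex_ret_instr/tier.gmon | flag_uncounted".
Rows reported with no call count are the ones where the time may
have been attributed to a nearby symbol rather than to a routine
that was actually called.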

> I know with 100% certainty that __sjnm_MOD_sjn_sngl is not
> referenced in the code as I wrote it.  I'll note the above
> is similar to what 'gfortran -pg' produces.
> 
> % pmcstat -R pmc.0 -G zxc.graph
> CONVERSION STATISTICS:
> #exec/elf                                1
> #samples/total                           67133
> #samples/unknown-function                1775
> #callchain/dubious-frames                17
> % grep sjn_dble zxc.graph | wc -l
>     258
> % grep sjn_sngl zxc.graph | wc -l
>       0
> 
> The callgraph shows that __sjnm_MOD_sjn_sngl is not used.
> My working conclusion is that gprof is simply broken.  I'm
> still investigating what pmcstat can give me.  Given the
> attempt to convert to a gprof file, hopefully I can get
> something like
> 
> % pmcstat -R pmc.0 [some option(s)]
> cycles  cycles/call  function
> 10000       90     __spherem_MOD_sphere
> 12345       191     __pnam_MOD_pna_dble
> 5433       400     __sjnm_MOD_sjn_dble
> 15000      1500     __synm_MOD_syn_dble
> 
> This would tell me which routine(s) to look into for
> optimizations.
> 

It probably comes back to whether there is an
appropriate event type.

ls_not_halted_cyc would not treat waiting-for-memory
time uniformly with load/store unit active time.
But it would give information about whether waiting
for memory was an issue or not.
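
One way to look for a better-suited event is to list what the CPU
offers and rerun the sampling with a candidate. The sketch below
assumes hwpmc(4) is loaded and that "pmcstat -L" lists the event
names on this machine. The function names and the pmc.1 log file
name are made up for illustration, and the "cyc" filter is only a
guess at how cycle-style events are named.

```shell
# Sketch only: assumes hwpmc(4) is loaded and that `pmcstat -L`
# prints the event names this CPU supports.
list_cycle_events() {
    # Case-insensitive match on "cyc" to narrow the list.
    pmcstat -L | grep -i 'cyc'
}

# Hypothetical rerun with a chosen event. pmc.1 is an assumed log
# file name; the program path is the one used earlier in the thread.
sample_with_event() {
    event="$1"
    pmcstat -O pmc.1 -P "$event" ../../build/bin/tier -q
}
```

For example, "sample_with_event ls_not_halted_cyc" followed by
"pmcstat -R pmc.1 -G zxc.graph" would repeat the earlier callgraph
step with that event instead of ex_ret_instr.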

===
Mark Millard
marklmi at yahoo.com
