Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 30 Nov 2020 06:32:34 -0600
From:      Scott Bennett <bennett@sdf.org>
To:        jbeich@FreeBSD.org, ihor@antonovs.family
Cc:        ml@netfence.it, freebsd-questions@freebsd.org
Subject:   Re: Profiling C application
Message-ID:  <202011301232.0AUCWZ58015525@sdf.org>
In-Reply-To: <zh2z-4i82-wny@FreeBSD.org>
References:  <39693639-6531-493f-fb24-53519f66cd98@antonovs.family> <zh2z-4i82-wny@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Jan Beich <jbeich@FreeBSD.org> wrote:

> Ihor Antonov <ihor@antonovs.family> writes:
>
> > Hello folks,
> >
> > What can I use on FreeBSD to profile a slow application?
> >
> > I have a source code of the application and I can rebuild it if necessary.
> >
> > I want to see where (in what functions) application spends most of its
> > time - amount of function calls and duration of each call (per call)
> >
> > What tools can you recommend to do this?
>
> hwpmc (pmcstat, pmcannotate) and dtrace, see examples at the end.
> There're probably other ways, so better ask on hackers@ list.
>
> # Load kernel module and check available counters. Aliases like
> # "instructions" or "unhalted-cycles" may not be available.
> $ kldload hwpmc
> $ pmccontrol -L | grep \\<inst
>         inst_retired.any
>         inst_retired.any_p
>         inst_retired.prec_dist
>         inst_retired.total_cycles_ps
>
> # Interactive use
> $ pmcstat -t firefox -P inst_retired.any -T
> PMC: [inst_retired.any] Samples: 287408 (100.0%) , 96792 unresolved
> Key:   => pause => press space again to continue
> %SAMP IMAGE      FUNCTION                       CALLERS
>   3.0 libxul.so  _ZN2sh13ResourcesHLSL14unifor  _ZN2sh10OutputHLSL9visitLoopE:1.9 ...
>   1.4 libxul.so  _ZN7mozilla19TrackBuffersMana  _ZN7mozilla10OggDemuxer13Seek
>   1.4 libavcodec encode_thread                  mp_set_zero_yuv
>   1.4 libxul.so  _ZN2sh10OutputHLSL9visitLoopE  _ZN2sh10OutputHLSL18visitCons
>   1.0 libxul.so  _ZN7mozilla3dom12FetchRequest  _ZN7mozilla26RemoteLazyInputS
>   1.0 libavcodec epzs_motion_search2            epzs_motion_search2
>   0.9 kernel     cpu_search_highest             cpu_search_highest
>   0.7 libxul.so  _ZN2sh10OutputHLSL22visitFunc  _ZN2sh13TParseContext16checkC
>   0.7 libxul.so  _ZN7mozilla3dom16HTMLMediaEle  ...
>   0.6 libc.so.7  regexec
> [...]
>
> # Capture on disk
> $ pmcstat -t firefox -P inst_retired.any -O firefox.pmc
>
> # Flamegraph
> $ git clone https://github.com/brendangregg/FlameGraph
> $ pmcstat -z16 -R firefox.pmc -G firefox.graph
> $ FlameGraph/stackcollapse-pmc.pl firefox.graph > firefox.stack
> $ FlameGraph/flamegraph.pl firefox.stack > firefox.svg
> $ firefox firefox.svg
>
> # gprof-like (no need to build with -pg, regular non-debug symbols will do)
> $ pmcstat -R firefox.pmc -g
> $ gprof $(which firefox) inst_retired.any/firefox.gmon
> Flat profile:
>
> Each sample counts as 0.00787402 seconds.
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  28.42     14.67    14.67                             (anonymous namespace)::LoadELF(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, (anonymous namespace)::MmapWrapper*, void**)
>  17.21     23.55     8.88                             mozilla::PrintfTarget::vprint(char const*, __va_list_tag*)
>  11.03     29.24     5.69      643     8.85     8.85  mozilla::JSONWriter::Scalar(mozilla::Span<char const, 18446744073709551615ul> const&, mozilla::Span<char const, 18446744073709551615ul> const&)
>   9.55     34.17     4.93                             lul::ReadSymbolDataInternal(unsigned char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, lul::SecMap*, void*, unsigned long, lul::UniqueStringUniverse*, void (*)(char const*))
> [...]
>
> $ gprof /usr/local/lib/libavcodec.so inst_retired.any/libavcodec.so.*.gmon
> Flat profile:
>
> Each sample counts as 0.00787402 seconds.
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  29.63   2516.15  2516.15   341629     7.37     7.69  encode_thread
>  13.33   3647.98  1131.83   498281     2.27     3.57  epzs_motion_search2
>   8.02   4329.25   681.28                             decode_chunks
>   6.87   4912.72   583.46                             mpc8_decode_frame
> [...]
>
> $ gprof /boot/modules/i915kms.ko inst_retired.any/i915kms.ko.gmon
> Flat profile:
>
> Each sample counts as 0.00787402 seconds.
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  13.56      1.73     1.73                             eb_lookup_vmas
>  11.78      3.24     1.50       57    26.38    26.38  i915_active_ref
>   9.68      4.47     1.24       26    47.55    47.85  i915_request_await_object
>   6.29      5.28     0.80                             i915_gem_do_execbuffer
>   4.62      5.87     0.59                             __i915_vma_active
> [...]

     Nice examples!  Thank you for posting them.  I will definitely save
these for future reference.


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:   bennett at sdf.org   *xor*   bennett at freeshell.org  *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?202011301232.0AUCWZ58015525>