Date: Wed, 19 Jan 2011 11:44:48 +0100
From: Roman Divacky <rdivacky@freebsd.org>
To: Hans Ottevanger <hansot@iae.nl>
Cc: freebsd-toolchain@freebsd.org, Steve Kargl <sgk@troutmask.apl.washington.edu>
Subject: Re: How to build an executable with profiling?
Message-ID: <20110119104448.GA15730@freebsd.org>
In-Reply-To: <4D36BA6E.5030202@iae.nl>
References: <20110117184411.GA54556@troutmask.apl.washington.edu> <20110118143205.GA34216@freebsd.org> <20110118144313.GO2518@deviant.kiev.zoral.com.ua> <20110118171657.GA68321@freebsd.org> <20110118173517.GA60201@troutmask.apl.washington.edu> <20110118211200.GA3586@freebsd.org> <4D36BA6E.5030202@iae.nl>
On Wed, Jan 19, 2011 at 11:18:22AM +0100, Hans Ottevanger wrote:
> On 01/18/11 22:12, Roman Divacky wrote:
> >On Tue, Jan 18, 2011 at 09:35:17AM -0800, Steve Kargl wrote:
> >>On Tue, Jan 18, 2011 at 06:16:57PM +0100, Roman Divacky wrote:
> >>>On Tue, Jan 18, 2011 at 04:43:13PM +0200, Kostik Belousov wrote:
> >>>>On Tue, Jan 18, 2011 at 03:32:05PM +0100, Roman Divacky wrote:
> >>>>>On Mon, Jan 17, 2011 at 10:44:11AM -0800, Steve Kargl wrote:
> >>>>>>How does one build an executable for profiling with clang?
> >>>>>
> >>>>>LLVM (and thus clang) does not support GPROF profiling.
> >>>>>
> >>>>>>clang -o testf -O2 -march=native -pipe -static -pg
> >>>>>>-I/usr/local/include -I../mp testf.c -L/usr/local/lib -L../mp -lsgk
> >>>>>>-lmpfr -lgmp -L/usr/home/kargl/work/lib -lm_clang_p
> >>>>>>clang: warning: the clang compiler does not support '-pg'
> >>>>>>
>
> If you are really desperate to find the hotspots in your program when
> compiled with clang, you could call clang with -v to find the call to
> /bin/ld. Then append _p to the appropriate libs if still needed and
> replace crt1.o by gcrt1.o while calling ld directly. E.g.
>
> "/usr/bin/ld" -Bstatic -o testcoll /usr/lib/gcrt1.o /usr/lib/crti.o
> /usr/lib/crtbegin.o testcoll.o angle.o apsis.o error.o minmax.o qags.o
> qext.o qk21.o sort.o timint.o zero.o vmol.o -lm_p -lgcc -lgcc_eh -lc_p
> -lgcc -lgcc_eh -t /usr/lib/crtend.o /usr/lib/crtn.o
>
> You will get a profile without the number of calls for the objects
> compiled with clang, but with the time spent. In my case:
>
> granularity: each sample hit covers 4 byte(s) for 0.00% of 6.41 seconds
>
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  30.3      1.94     1.94        0  100.00%           f_timint [2]
>  20.2      3.24     1.29        0  100.00%           _mcount [3]
>  19.4      4.48     1.24 21900000     0.00     0.00  exp [4]
>  13.2      5.32     0.85        0   40.51%           vmol [1]
>   7.3      5.79     0.47        0  100.00%           f_angle [5]
>   2.8      5.98     0.18  1000000     0.00     0.00  pow [7]
>   2.7      6.15     0.17        0   48.70%           qk21 [6]
>   2.4      6.30     0.15        0  100.00%           .mcount (51)
>   0.5      6.33     0.03        0  100.00%           zero [8]
>   0.4      6.35     0.02        0  100.00%           qext [9]
>   0.4      6.38     0.02        0  100.00%           qags [10]
> ...

hm.. this is interesting. I wonder if it makes sense to teach the driver
about this (it's a trivial change). opinions?

> >>>>>>I suppose it will be pointless to ask, but shouldn't clang
> >>>>>>support one of the most basic gcc compiler options if clang
> >>>>>>is to replace gcc as the base system compiler?
> >>>>>
> >>>>>is GPROF really needed at this point? we have HWPMC, isnt
> >>>>>it sufficient?
> >>>>Hwpmc requires additional work for each new CPU model. Also,
> >>>>hwpmc is not supported even on all Intel or AMD CPUs, esp. older
> >>>>models, and e.g. VIA cores.
> >>>>
> >>>>Not to mention !x86 architectures.
> >>>
> >>>yes. I agree. HWPMC is not 100% solution.
> >>>
> >>>for those interested in profiling in LLVM in detail:
> >>>
> >>>    http://llvm.org/pubs/2010-04-NeustifterProfiling.html
> >>>
> >>>summary: LLVM supports inserting profiling probes (but the selection
> >>>         of places where to put them is very naive) but there's no
> >>>         "GPROF writer".
> >>>
> >>>I mailed the author of the thesis yesterday and it looks like his work
> >>>may get committed to upstream LLVM.
> >>>
> >>
> >>Thanks for the url and checking on the status of profiling with llvm.
> >
> >I checked the LLVM code instead and here's what I found:
> >
> >LLVM actually supports profiling, in its own format (llvmprof.out). This
> >can only be used for its PGO optimization (BasicBlockPlacement) and is
> >very naive.
> >
> >Theoretically it should be possible to write "llvmprof.out -> a.out.gmon"
> >converter - no idea how feasible it is. I guess it would not be very easy.
> >
> >I believe it can be sufficiently easy to write a "gprof-like dumper" for
> >the llvmprof.out files (if there's not one already) that would print
> >stuff like "foo called X times, bar called Y times". I dont know about
> >the actual measuring of time. I think it's not in the llvmprof.out.
> >
>
> I have not yet completely read the reference provided, but my impression
> is that it describes considerably more sophistication than needed to get
> gprof running with clang (though the thesis looks very interesting!).
> All gprof needs is statistical profiling as provided by the kernel
> through profil(2) and addition by the compiler of a call to .mcount (and
> possibly allocation of a small amount of storage) on entry of each
> function. gcc (and pcc before it) has done this for more than 20 years,
> although I must admit that the code generated for the amd64 using -pg is
> a bit opaque to me (i386 is straightforward, though). The rest of the
> machinery needed is already there (in lib/libc/gmon and e.g.
> lib/csu/amd64/crt1.c).

would you be interested in working on adding the necessary stuff to LLVM?
if it's really just about placing .mcount calls in profiling points I believe
it should be doable..

roman
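
Editor's sketch (not part of the original message): the manual relink steps quoted above (take the ld command that `clang -v` prints, replace crt1.o with gcrt1.o, and append _p to the libraries that have profiled variants) could be automated roughly as follows. The function name `profile_link_line` is hypothetical, and the sketch assumes that the link line is passed as one string with the surrounding double quotes already stripped, and that only libc and libm need the _p suffix. Treat it as an illustration under those assumptions, not as a tested tool.

```shell
#!/bin/sh
# Hypothetical helper: rewrite an ld command line for gprof-style profiling.
# Assumptions (not from the original thread): tokens are separated by
# whitespace, carry no quotes, and only -lc and -lm have _p variants.
# The body runs in a subshell so that `set -f` and the variables stay local.
profile_link_line() (
    set -f                      # do not glob-expand tokens of the link line
    out=""
    for tok in $1; do
        case "$tok" in
            */crt1.o) tok="${tok%crt1.o}gcrt1.o" ;;  # profiling startup object
            -lc)      tok="-lc_p" ;;                 # profiled libc
            -lm)      tok="-lm_p" ;;                 # profiled libm
        esac
        out="${out:+$out }$tok"
    done
    printf '%s\n' "$out"
)

# Example: print the rewritten command for inspection before running it by hand.
profile_link_line "/usr/bin/ld -Bstatic -o testf /usr/lib/crt1.o /usr/lib/crti.o testf.o -lm -lgcc -lc -lgcc /usr/lib/crtn.o"
```

The helper only prints the rewritten command, so the result can be checked against the `clang -v` output before ld is invoked directly.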
