Date:      Tue, 9 Mar 2010 23:55:35 +0100
From:      Fabien Thomas <fabien.thomas@netasq.com>
To:        David Christensen <davidch@broadcom.com>
Cc:        pyunyh@gmail.com, Ian FREISLICH <ianf@clue.co.za>, Ryan Stone <rysto32@gmail.com>, current@freebsd.org
Subject:   Re: dev.bce.X.com_no_buffers increasing and packet loss
Message-ID:  <EDAEC324-1741-4503-B94A-FE7551442E3A@netasq.com>
In-Reply-To: <bc2d971003091430g236806edy4cf3bb873665fc5@mail.gmail.com>
References:  <20100305210435.GF14818@michelle.cdnetworks.com> <20100305175639.GB14818@michelle.cdnetworks.com> <E1NnVaT-0003Ft-3p@clue.co.za> <E1Nnc4d-0003mB-6e@clue.co.za> <E1Nne0Q-0003uZ-OR@clue.co.za> <E1Noulp-0007Rc-Ro@clue.co.za> <20100309212139.GO1311@michelle.cdnetworks.com> <5D267A3F22FD854F8F48B3D2B52381933AF90EED69@IRVEXCHCCR01.corp.ad.broadcom.com> <20100309214012.GQ1311@michelle.cdnetworks.com> <5D267A3F22FD854F8F48B3D2B52381933AF90EEDA7@IRVEXCHCCR01.corp.ad.broadcom.com> <bc2d971003091430g236806edy4cf3bb873665fc5@mail.gmail.com>

If you are on head/stable_7/stable_8 you can also do a quick test with
top mode: pmcstat -S unhalted-cycles -T
(http://wiki.freebsd.org/PmcTools/PmcTop).
For more in-depth post-processing with source code (C + asm) you can
output to KCachegrind (http://wiki.freebsd.org/PmcTools/PmcKcachegrind).

Fabien


>> What's the traffic look like?  Jumbo, standard, short frames?  Any
>> good ideas on profiling the code?  I haven't figured out how to use
>> the CPU TSC but there is a free running timer on the device that
>> might be usable to calculate where the driver's time is spent.
>>
>> Dave
>
> In my experience hwpmc is the best and easiest way to profile anything
> on FreeBSD.  Here's something I sent to a different thread a couple of
> months ago explaining how to use it:
>
> 1) If device hwpmc is not compiled into your kernel, kldload hwpmc (you
> will need the HWPMC_HOOKS option in either case).
> 2) Run pmcstat to begin taking samples (make sure that whatever you are
> profiling is busy doing work first!):
>
> pmcstat -S unhalted-cycles -O /tmp/samples.out
>
> The -S option specifies what event you want to use to trigger
> sampling.  The unhalted-cycles is the best event to use if your
> hardware supports it; pmc will take a sample every 64K non-idle CPU
> cycles, which is basically equivalent to sampling based on time.  If
> the unhalted-cycles event is not supported by your hardware then the
> instructions event will probably be the next best choice (although it's
> nowhere near as good, as it will not be able to tell you, for example,
> if a particular function is very expensive because it takes a lot of
> cache misses compared to the rest of your program).  One caveat with
> the unhalted-cycles event is that time spent spinning on a spinlock or
> adaptively spinning on an MTX_DEF mutex will not be counted by this
> event, because most of the spinning time is spent executing an hlt
> instruction that idles the CPU for a short period of time.
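To put the 64K-cycle sampling interval in perspective, here is a rough sketch of the arithmetic. It assumes "64K" means 65536 cycles and uses a 2 GHz clock purely as an illustrative figure; neither the function name nor the clock speed comes from the thread.

```shell
#!/bin/sh
# Approximate samples per second for one fully busy CPU, assuming one
# sample is taken every 65536 (64K) unhalted cycles.
# samples_per_second is a hypothetical helper; the 2 GHz clock used
# below is an assumed example value, not a figure from this thread.

samples_per_second() {
    # $1 = CPU clock in Hz; shell arithmetic is integer division
    echo $(( $1 / 65536 ))
}

samples_per_second 2000000000
```

Under those assumptions, a 10 second run on a busy 2 GHz core would collect on the order of 300,000 samples, while a mostly idle core would collect far fewer because halted cycles do not count.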
>
> Modern Intel and AMD CPUs offer a dizzying array of events.  They're
> mostly only useful if you suspect that a particular kind of event is
> hurting your performance and you would like to know what is causing
> those events.  For example, if you suspect that data cache misses are
> causing you problems you can take samples on cache misses.
> Unfortunately on some of the newer CPUs (namely the Core2 family,
> because that's what I'm doing most of my profiling on nowadays) I find
> it difficult to figure out just what event to use to profile based on
> cache misses.  man pmc will give you an overview of pmc, and there are
> manpages for every CPU family supported (e.g. man pmc.core2).
>
> 3) After you've run pmcstat for "long enough" (a proper definition of
> long enough requires a statistician, which I most certainly am not,
> but I find that for a busy system 10 seconds is enough), Control-C it
> to stop it*.  You can use pmcstat to post-process the samples into
> human-readable text:
>
> pmcstat -R /tmp/samples.out -G /tmp/graph.txt
>
> The graph.txt file will show leaf functions on the left and their
> callers beneath them, indented to reflect the callchain.  It's not too
> easy to describe and I don't have sample output available right now.
>
>
> Another interesting tool for post-processing the samples is
> pmcannotate.  I've never actually used the tool before but it will
> annotate the program's source to show which lines are the most
> expensive.  This of course needs unstripped modules to work.  I think
> that it will also work if the GNU "debug link" is in the stripped
> module pointing to the location of the file with symbols.
>
>
> * Here's a tip I picked up from Joseph Koshy's blog: to collect
> samples for a fixed period of time (say 1 minute), have pmcstat run the
> sleep command:
>
> pmcstat -S unhalted-cycles -O /tmp/samples.out sleep 60
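The sampling and post-processing steps quoted above can be chained into one helper. This is only a sketch: profile_steps and the RUN variable are hypothetical names, only the pmcstat flags already shown in this thread (-S, -O, -R, -G) are used, and it assumes hwpmc is already loaded. By default it just prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print the two profiling commands from this thread,
# or execute them when RUN=1 is set in the environment.
# Assumes hwpmc is compiled in or already loaded (kldload hwpmc), and
# that the paths given contain no whitespace.

profile_steps() {
    samples=$1   # file that receives the raw samples
    graph=$2     # file that receives the callgraph text
    seconds=$3   # how long to sample, via the "sleep" trick

    for cmd in \
        "pmcstat -S unhalted-cycles -O $samples sleep $seconds" \
        "pmcstat -R $samples -G $graph"
    do
        if [ "${RUN:-0}" = 1 ]; then
            $cmd || return 1    # stop if a step fails
        else
            echo "$cmd"         # dry run: show what would be executed
        fi
    done
}

profile_steps /tmp/samples.out /tmp/graph.txt 60
```

Running the script with RUN=1 on a FreeBSD machine would execute both steps in order; without it, the two command lines are only echoed.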