Date: Wed, 3 Dec 2008 11:09:58 +0000 (UTC)
From: Vadim Goncharov <vadim_nuclight@mail.ru>
To: freebsd-performance@freebsd.org
Subject: Re: hwpmc granularity and 6.4 network performance
Message-ID: <slrngjcq85.2j6c.vadim_nuclight@server.filona.x88.info>
References: <slrngil402.2di4.vadim_nuclight@server.filona.x88.info> <d763ac660811251209l7aa50960y8feff1845f90944f@mail.gmail.com>
Hi Adrian Chadd!

On Tue, 25 Nov 2008 15:09:19 -0500; Adrian Chadd wrote about 'Re: hwpmc granularity and 6.4 network performance':

> * Since you've changed two things - hwpmc _AND_ the kernel version -
> you can't easily conclude which one (if any!) has any influence on
> Giant showing up in your top output. I suggest recompiling without
> hwpmc and seeing if the behaviour changes.

This is not easy to do whenever I want :) I will check it in a few weeks,
maybe.

> * The gprof utility expects something resembling "time" for the
> sampling data, but pmcstat doesn't record time, it records "events".
> The counts you see in gprof are "events", so change "seconds" to
> "events" in your reading of the gprof output.

Of course, I know this, but it doesn't change the percentages.

> * I don't know if the backported pmc to 6.4 handles stack call graphs
> or not. Easy way to check - pmcstat -R sample.out | more ; see if you
> just see "sample" lines or "sample" and "callgraph" lines.

No.

> * I bet that ipfw_chk is a big enough hint. How big is your ipfw ruleset? :)

It's not so big in terms of rule count and not so big in terms of exact
hint, but it is of course big as a CPU hog :)

router# ipfw show | wc -l
      70

Surely that's not so many, is it? So I want to see which parts are more
CPU-intensive, to use as a hint when rewriting the ruleset. I've heard
about a pmcannotate tool on -arch@, and I think it is the tool that does
exactly what I want, but it requires a patch for pmcstat that didn't
apply on my 6.4; too much was different :(

>> OK, I can conclude from this that I should optimize my ipfw ruleset, but
>> that's all. I know from sources that ipfw_chk() is a big function with a
>> bunch of 'case's in a large 'switch'. I want to know which parts of that
>> switch are executed more often. It says in listing that granularity is
>> 4 bytes, I assume that it has a sample for each of 4-byte chunks of binary
>> code, so that it must have such information. My kernel is compiled with:
>>
>> makeoptions DEBUG=-g
>>
>> so kgdb does know where are instructions for each line of source code.
>> How can I obtain this info from profiling? It also would be useful to know
>> which places do calls to that bcmp() and rn_match().

-- 
WBR, Vadim Goncharov. ICQ#166852181
mailto:vadim_nuclight@mail.ru
[Moderator of RU.ANTI-ECOLOGY][FreeBSD][http://antigreen.org][LJ:/nuclight]