Date: Tue, 3 Jul 2012 23:28:16 +0200 From: Luigi Rizzo <rizzo@iet.unipi.it> To: "Alexander V. Chernikov" <melifaro@FreeBSD.org> Cc: hackers@freebsd.org, performance@freebsd.org, net@freebsd.org Subject: Re: FreeBSD 10G forwarding performance @Intel Message-ID: <20120703212816.GA92445@onelab2.iet.unipi.it> In-Reply-To: <4FF356BC.2060306@FreeBSD.org> References: <4FF319A2.6070905@FreeBSD.org> <20120703165506.GA90114@onelab2.iet.unipi.it> <4FF32DE2.2010606@FreeBSD.org> <20120703202757.GA90741@onelab2.iet.unipi.it> <4FF356BC.2060306@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Jul 04, 2012 at 12:31:56AM +0400, Alexander V. Chernikov wrote:
> On 04.07.2012 00:27, Luigi Rizzo wrote:
> >On Tue, Jul 03, 2012 at 09:37:38PM +0400, Alexander V. Chernikov wrote:
> >...
> >>Thanks, another good point. I forgot to merge this option from andre's
> >>patch.
> >>
> >>Another 30-40-50kpps to win.
> >
> >not much gain though.
> >What about the other IPSTAT_INC counters ?
> Well, we should then remove all such counters (total, forwarded) and
> per-interface statistics (at least for forwarded packets).
I am not saying to remove them for good, but at least have a
try at what we can hope to save by implementing them
on a per-cpu basis.
There is a chance that one will not
see big gains util the majority of such shared counters
are fixed (there are probably 3-4 at least on the non-error
path for forwarded packets), plus the per-interface ones
that are not even wrapped in macros (see if_ethersubr.c)
> >I think the IPSTAT_INC macros were introduced (by rwatson ?)
> >following a discussion on how to make the counters per-cpu
> >and avoid the contention on cache lines.
> >But they are still implemented as a single instance,
> >and neither volatile nor atomic, so it is not even clear
> >that they can give reliable results, let alone the fact
> >that you are likely to get some cache misses.
> >
> >the relevant macro is in ip_var.h.
> Hm. This seems to be just per-vnet structure instance.
yes but essentially they are still shared by all threads within a vnet
(besides you probably ran your tests in the main instance)
> We've got some more real DPCPU stuff (sys/pcpu.h && kern/subr_pcpu.c)
> which can be used for global ipstat structure, however since it is
> allocated from single area without possibility to free we can't use it
> for per-interface counters.
yes, those should be moved to a private, dynamically allocated
region of the ifnet (the number of CPUs is known at driver init
time, i hope). But again for a quick test disabling the
if_{i|o}{bytesC|packets} should do the job, if you can count
the received rate by some other means.
> I'll try to run tests without any possibly contested counters and report
> the results on Thursday.
great, that would be really useful info.
cheers
luigi
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120703212816.GA92445>
