Date: Tue, 3 Jul 2012 23:28:16 +0200
From: Luigi Rizzo <rizzo@iet.unipi.it>
To: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
Cc: hackers@freebsd.org, performance@freebsd.org, net@freebsd.org
Subject: Re: FreeBSD 10G forwarding performance @Intel
Message-ID: <20120703212816.GA92445@onelab2.iet.unipi.it>
In-Reply-To: <4FF356BC.2060306@FreeBSD.org>
References: <4FF319A2.6070905@FreeBSD.org> <20120703165506.GA90114@onelab2.iet.unipi.it> <4FF32DE2.2010606@FreeBSD.org> <20120703202757.GA90741@onelab2.iet.unipi.it> <4FF356BC.2060306@FreeBSD.org>

On Wed, Jul 04, 2012 at 12:31:56AM +0400, Alexander V. Chernikov wrote:
> On 04.07.2012 00:27, Luigi Rizzo wrote:
> >On Tue, Jul 03, 2012 at 09:37:38PM +0400, Alexander V. Chernikov wrote:
> >...
> >>Thanks, another good point. I forgot to merge this option from andre's
> >>patch.
> >>
> >>Another 30-40-50kpps to win.
> >
> >not much gain though.
> >What about the other IPSTAT_INC counters ?
> Well, we should then remove all such counters (total, forwarded) and
> per-interface statistics (at least for forwarded packets).

I am not saying to remove them for good, but at least have a try at
what we can hope to save by implementing them on a per-cpu basis.

There is a chance that one will not see big gains until the majority
of such shared counters are fixed (there are probably 3-4 at least
on the non-error path for forwarded packets), plus the per-interface
ones that are not even wrapped in macros (see if_ethersubr.c)

> >I think the IPSTAT_INC macros were introduced (by rwatson ?)
> >following a discussion on how to make the counters per-cpu
> >and avoid the contention on cache lines.
> >But they are still implemented as a single instance,
> >and neither volatile nor atomic, so it is not even clear
> >that they can give reliable results, let alone the fact
> >that you are likely to get some cache misses.
> >
> >the relevant macro is in ip_var.h.
> Hm. This seems to be just per-vnet structure instance.

yes but essentially they are still shared by all threads within a vnet
(besides you probably ran your tests in the main instance)

> We've got some more real DPCPU stuff (sys/pcpu.h && kern/subr_pcpu.c)
> which can be used for global ipstat structure, however since it is
> allocated from single area without possibility to free we can't use it
> for per-interface counters.

yes, those should be moved to a private, dynamically allocated
region of the ifnet (the number of CPUs is known at driver init
time, i hope).
But again for a quick test, disabling the if_{i|o}{bytes|packets}
counters should do the job, if you can count the received rate by
some other means.

> I'll try to run tests without any possibly contested counters and report
> the results on Thursday.

great, that would be really useful info.

cheers
luigi
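
To make the per-cpu idea concrete, here is a minimal user-space sketch, not the kernel code and not the DPCPU interface. All names in it (pcpu_slot, pcpu_counter, pcpu_counter_inc, pcpu_counter_fetch, MAX_CPUS, CACHE_LINE) are hypothetical, and the 64-byte cache line and the fixed CPU count are assumptions. Each CPU bumps only its own padded slot, so the hot path never writes a cache line that another CPU writes; the reader pays the cost by summing all slots.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64  /* assumed cache line size in bytes */
#define MAX_CPUS   8   /* assumed fixed CPU count, for the sketch only */

/*
 * One slot per CPU. The padding makes each slot exactly CACHE_LINE
 * bytes, so the 'val' fields of two different slots are always
 * CACHE_LINE bytes apart and can never sit in the same cache line.
 */
struct pcpu_slot {
	uint64_t val;
	char     pad[CACHE_LINE - sizeof(uint64_t)];
};

/* Hypothetical per-cpu counter: an array of padded slots. */
struct pcpu_counter {
	struct pcpu_slot slot[MAX_CPUS];
};

/*
 * Hot path: increment the slot owned by 'cpu'. No atomics are used,
 * on the assumption that only that CPU ever writes its own slot.
 */
static inline void
pcpu_counter_inc(struct pcpu_counter *c, unsigned int cpu)
{
	c->slot[cpu % MAX_CPUS].val++;
}

/*
 * Slow path: sum all slots. The result is a snapshot and may miss
 * increments that race with the read, which is usually acceptable
 * for statistics.
 */
static inline uint64_t
pcpu_counter_fetch(const struct pcpu_counter *c)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < MAX_CPUS; i++)
		sum += c->slot[i].val;
	return (sum);
}
```

In this sketch the caller passes the CPU index explicitly; in a kernel it would come from whatever identifies the current CPU. For per-interface counters the slot array would be allocated together with the interface, sized by the CPU count known at that time, rather than fixed at compile time as above.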