Date: Mon, 23 Apr 2012 15:54:58 +0200
From: Julian Stecklina <js@alien8.de>
To: freebsd-net@freebsd.org
Subject: Re: Some performance measurements on the FreeBSD network stack
Message-ID: <87k4161rkd.fsf@alien8.de>
References: <20120419133018.GA91364@onelab2.iet.unipi.it> <4F907011.9080602@freebsd.org> <20120419204622.GA94904@onelab2.iet.unipi.it> <CAHM0Q_M4wcEiWGkjWxE1OjLeziQN0vM%2B4_EYS_WComZ6=j5xhA@mail.gmail.com> <20120419212224.GA95459@onelab2.iet.unipi.it> <CAHM0Q_Md4M1YRA=RJD7-xVxehvwWFjU07PdA5vWFBR6PXE14Zw@mail.gmail.com> <20120420144410.GA3629@onelab2.iet.unipi.it> <CAHM0Q_P3XvOZrfJW7dUa23H%2BYUMe608hoKY41DZ7BGGc=cKniQ@mail.gmail.com> <20120421155638.E982@besplex.bde.org>

Thus spake Bruce Evans <brde@optusnet.com.au>:

> On Fri, 20 Apr 2012, K. Macy wrote:
>
>> On Fri, Apr 20, 2012 at 4:44 PM, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
>
>>> The small penalty when flowtable is disabled but compiled in is
>>> probably because the net.flowtable.enable flag is checked
>>> a bit deep in the code.
>>>
>>> The advantage with non-connect()ed sockets is huge. I don't
>>> quite understand why disabling the flowtable still helps there.
>>
>> Do you mean having it compiled in but disabled still helps
>> performance? Yes, that is extremely strange.
>
> This reminds me that when I worked on this, I saw very large throughput
> differences (in the 20-50% range) as a result of minor changes in
> unrelated code. I could get these changes intentionally by adding or
> removing padding in unrelated unused text space, so the differences were
> apparently related to text alignment. I thought I had some significant
> micro-optimizations, but it turned out that they were acting mainly by
> changing the layout in related used text space where it is harder to
> control.

For short code paths, code layout can significantly influence
performance. We have been puzzled (in a project unrelated to FreeBSD) by
a 10% performance drop in some microbenchmark that was ultimately caused
by having all our code hotspots linked at 8K aligned addresses, which
caused them to evict each other from the L1 instruction cache, because
its associativity was too small.

A simple way to check for this would be to have the option to build a
kernel with random linking order. I don't know how difficult it is to
implement that in the current FreeBSD toolchain.

Julian
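
To illustrate the aliasing effect described above, here is a minimal sketch that maps hot-spot start addresses to cache sets and reports any set holding more hot lines than the cache has ways. The cache geometry (32 KiB, 4-way, 64-byte lines), the base address, and the function names are assumptions chosen for illustration; they are not taken from the project mentioned in the message. Only the first cache line of each hot spot is modelled.

```python
from collections import Counter

# Assumed L1 instruction cache geometry (hypothetical, for illustration only).
LINE_SIZE = 64           # bytes per cache line
NUM_WAYS = 4             # associativity
CACHE_SIZE = 32 * 1024   # total size in bytes
NUM_SETS = CACHE_SIZE // (LINE_SIZE * NUM_WAYS)  # 128 sets, 8 KiB per way


def cache_set(addr):
    """Return the set index that the cache line containing addr maps to."""
    return (addr // LINE_SIZE) % NUM_SETS


def oversubscribed_sets(hot_addrs):
    """Return {set_index: count} for sets with more hot lines than ways."""
    counts = Counter(cache_set(a) for a in hot_addrs)
    return {s: n for s, n in counts.items() if n > NUM_WAYS}


BASE = 0x100000  # arbitrary 8K-aligned base address (assumed)

# Six hot spots, each starting on an 8K boundary: all land in the same set.
aligned = [BASE + i * 8192 for i in range(6)]

# The same six hot spots, each shifted by one extra cache line.
staggered = [BASE + i * (8192 + LINE_SIZE) for i in range(6)]

print(oversubscribed_sets(aligned))    # one set carries all six lines
print(oversubscribed_sets(staggered))  # no set is oversubscribed
```

With this assumed geometry one way covers 8 KiB, so addresses that differ by a multiple of 8 KiB compete for the same four slots. Shifting each hot spot by a single cache line spreads them across different sets, which is the kind of change a randomized link order would produce by chance.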