Date:      Mon, 23 Apr 2012 15:54:58 +0200
From:      Julian Stecklina <js@alien8.de>
To:        freebsd-net@freebsd.org
Subject:   Re: Some performance measurements on the FreeBSD network stack
Message-ID:  <87k4161rkd.fsf@alien8.de>
References:  <20120419133018.GA91364@onelab2.iet.unipi.it> <4F907011.9080602@freebsd.org> <20120419204622.GA94904@onelab2.iet.unipi.it> <CAHM0Q_M4wcEiWGkjWxE1OjLeziQN0vM+4_EYS_WComZ6=j5xhA@mail.gmail.com> <20120419212224.GA95459@onelab2.iet.unipi.it> <CAHM0Q_Md4M1YRA=RJD7-xVxehvwWFjU07PdA5vWFBR6PXE14Zw@mail.gmail.com> <20120420144410.GA3629@onelab2.iet.unipi.it> <CAHM0Q_P3XvOZrfJW7dUa23H+YUMe608hoKY41DZ7BGGc=cKniQ@mail.gmail.com> <20120421155638.E982@besplex.bde.org>

Thus spake Bruce Evans <brde@optusnet.com.au>:

> On Fri, 20 Apr 2012, K. Macy wrote:
>
>> On Fri, Apr 20, 2012 at 4:44 PM, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
>
>>> The small penalty when flowtable is disabled but compiled in is
>>> probably because the net.flowtable.enable flag is checked
>>> a bit deep in the code.
>>>
>>> The advantage with non-connect()ed sockets is huge. I don't
>>> quite understand why disabling the flowtable still helps there.
>>
>> Do you mean having it compiled in but disabled still helps
>> performance? Yes, that is extremely strange.
>
> This reminds me that when I worked on this, I saw very large throughput
> differences (in the 20-50% range) as a result of minor changes in
> unrelated code.  I could get these changes intentionally by adding or
> removing padding in unrelated unused text space, so the differences were
> apparently related to text alignment.  I thought I had some significant
> micro-optimizations, but it turned out that they were acting mainly by
> changing the layout in related used text space where it is harder to
> control.

For short code paths, code layout can significantly influence
performance. We were once puzzled (in a project unrelated to FreeBSD) by
a 10% performance drop in a microbenchmark that was ultimately caused
by all of our code hotspots being linked at 8 KiB-aligned addresses:
they kept evicting each other from the L1 instruction cache, because
the cache's associativity was too low to hold them all.
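To see why alignment alone can cause this, here is a minimal sketch of the set-index arithmetic, assuming a hypothetical 32 KiB, 4-way, 64-byte-line L1 I-cache (these numbers are illustrative; the original mail does not state the actual cache geometry):

    # Assumed L1 I-cache geometry (illustrative, not from the mail):
    # 32 KiB total, 4-way set-associative, 64-byte lines -> 128 sets.
    CACHE_SIZE = 32 * 1024
    WAYS = 4
    LINE = 64
    SETS = CACHE_SIZE // (WAYS * LINE)   # 128 sets
    SET_STRIDE = SETS * LINE             # 8192 bytes: addresses this far
                                         # apart map to the same set

    def set_index(addr):
        return (addr // LINE) % SETS

    # Five hotspots linked at 8 KiB-aligned addresses all land in set 0,
    # but a 4-way set holds only four lines, so they evict each other.
    hotspots = [i * SET_STRIDE for i in range(5)]
    print([set_index(a) for a in hotspots])   # -> [0, 0, 0, 0, 0]

With more 8 KiB-aligned hotspots than ways, every execution of one hotspot can displace another, even though the cache as a whole is nearly empty.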

A simple way to check for this would be an option to build the
kernel with a randomized link order. I don't know how difficult that
would be to implement in the current FreeBSD toolchain.
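The idea could be sketched as shuffling the object files before they reach the linker, so the text layout differs between builds. The file names and the ld invocation below are purely illustrative, not the actual FreeBSD kernel build:

    import random

    # Illustrative object list; the real kernel build has many more.
    objects = ["uipc_socket.o", "ip_input.o", "ip_output.o", "tcp_input.o"]

    # Seed per build so a surprising result can be reproduced later.
    rng = random.Random(12345)
    shuffled = objects[:]
    rng.shuffle(shuffled)

    # Hypothetical relocatable-link step with the randomized order.
    cmd = "ld -r -o kernel_net.o " + " ".join(shuffled)
    print(cmd)

Repeating the benchmark across several such builds would separate genuine micro-optimizations from accidental layout effects.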

Julian
