Date:      Mon, 12 Nov 2012 22:45:38 -0800
From:      Alfred Perlstein <bright@mu.org>
To:        Peter Wemm <peter@wemm.org>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Adrian Chadd <adrian@freebsd.org>
Subject:   Re: auto tuning tcp
Message-ID:  <50A1EC92.9000507@mu.org>
In-Reply-To: <CAGE5yCoj1dL9w-EMMi8iYMTOq9uUUHmFg4rMY7aPneUBHBv67Q@mail.gmail.com>
References:  <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <EB2C22B5-C18D-4AC2-8694-C5C0D96C07B3@mu.org> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org> <50A1E47C.1030208@mu.org> <CAGE5yCoj1dL9w-EMMi8iYMTOq9uUUHmFg4rMY7aPneUBHBv67Q@mail.gmail.com>

On 11/12/12 10:23 PM, Peter Wemm wrote:
> On Mon, Nov 12, 2012 at 10:11 PM, Alfred Perlstein <bright@mu.org> wrote:
>> On 11/12/12 10:04 PM, Alfred Perlstein wrote:
>>> On 11/12/12 10:48 AM, Alfred Perlstein wrote:
>>>> On 11/12/12 10:01 AM, Andre Oppermann wrote:
>>>>>
>>>>> I've already added the tunable "kern.maxmbufmem" which is in pages.
>>>>> That's probably not very convenient to work with.  I can change it
>>>>> to a percentage of phymem/kva.  Would that make you happy?
>>>>>
>>>> It really makes sense to have the hash table be some relation to sockets
>>>> rather than buffers.
>>>>
>>>> If you are hashing "foo-objects" you want the hash to be some relation to
>>>> the max amount of "foo-objects" you'll see, not backwards derived from the
>>>> number of "bar-objects" that "foo-objects" contain, right?
>>>>
>>>> Because we are hashing the sockets, right?   not clusters.
>>>>
>>>> Maybe I'm wrong?  I'm open to ideas.
>>>
>>> Hey Andre, the following patch is what I was thinking
>>> (uncompiled/untested), it basically rounds up the maxsockets to a power of 2
>>> and replaces the default 512 tcb hashsize.
>>>
>>> It might make sense to make the auto-tuning default to a minimum of 512.
>>>
>>> There are a number of other hashes with static sizes that could make use
>>> of this logic provided it's not upside-down.
>>>
>>> Any thoughts on this?
>>>
>>> Tune the tcp pcb hash based on maxsockets.
>>> Be more forgiving of poorly chosen tunables by finding a closer power
>>> of two rather than clamping down to 512.
>>> Index: tcp_subr.c
>>> ===================================================================
>>
>> Sorry, GUI mangled the patch... attaching a plain text version.
>>
>>
> Wait, you want to replace a hash with a flat array?  Why even bother
> to call it a hash at that point?
>
>

If you are concerned about the space/time tradeoff, I'm happy to make
it 1/2, 1/4, or 1/8 the size of maxsockets.  (Smaller?)

Would that work better?

The reason I chose to make it equal to maxsockets was a space/time
tradeoff.  Ideally a hash should have zero collisions, and if a user has
enough memory for 250,000 sockets, then surely they have enough memory
for roughly 256K (262,144) bucket pointers.
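
To make the sizing concrete, here is a minimal userland sketch of the
rounding and the memory cost.  This is not the patch itself:
roundup_pow2() and hash_table_bytes() are hypothetical names I made up
for illustration, and I'm assuming one pointer per hash bucket.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: smallest power of two >= n.
 * Returns 1 for n == 0 and saturates at 2^31.
 */
static uint32_t
roundup_pow2(uint32_t n)
{
	uint32_t p = 1;

	while (p < n && p < (UINT32_C(1) << 31))
		p <<= 1;
	return (p);
}

/*
 * Hypothetical helper: bytes needed for the bucket array,
 * assuming each bucket head is a single pointer.
 */
static size_t
hash_table_bytes(uint32_t buckets)
{
	return ((size_t)buckets * sizeof(void *));
}
```

With maxsockets at 250,000 this rounds up to 262,144 buckets, which
with 8-byte pointers is 2 MB for the bucket array.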

If you strongly disagree then I am fine with a more conservative
setting.  Just note that once we max out the number of sockets, the
average number of additional linked-list traversals per lookup will be
about half the factor by which we shrink the table.  Meaning if the
table is 1/4 the size of maxsockets, then when we hit that many TCP
connections I think we'll see on the order of 2 linked-list traversals
on average to find a node.  At 1/8, that number becomes 4.
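
The arithmetic behind those numbers can be sketched as below.  This
assumes the hash spreads connections evenly over the buckets, which a
real workload may not do; load_factor() and extra_traversals() are
hypothetical names, not anything in the tree.

```c
/*
 * Load factor: average number of entries per bucket when
 * 'entries' items are spread evenly over 'buckets' chains.
 */
static double
load_factor(unsigned long entries, unsigned long buckets)
{
	return ((double)entries / (double)buckets);
}

/*
 * Rough estimate of the extra chain links followed per successful
 * lookup, beyond the first node in the bucket: about half the
 * load factor, under the even-distribution assumption above.
 */
static double
extra_traversals(double alpha)
{
	return (alpha / 2.0);
}
```

For 262,144 connections, a 65,536-bucket table (1/4 size) has a load
factor of 4 and about 2 extra traversals; a 32,768-bucket table (1/8
size) has a load factor of 8 and about 4.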

I recall that back in 2001, on a PII 400 running a custom webserver I
wrote, upping this to 2^14 or maybe even 2^16 (I forget which) was a
huge benefit: suddenly my CPU usage went down a huge amount and I
didn't have to worry about a load balancer or other tricks.


-Alfred







