Date:      Mon, 12 Nov 2012 22:11:08 -0800
From:      Alfred Perlstein <bright@mu.org>
To:        Andre Oppermann <oppermann@networx.ch>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Adrian Chadd <adrian@freebsd.org>, Peter Wemm <peter@wemm.org>
Subject:   Re: auto tuning tcp
Message-ID:  <50A1E47C.1030208@mu.org>
In-Reply-To: <50A1E2E7.3090705@mu.org>
References:  <50A0A0EF.3020109@mu.org> <50A0A502.1030306@networx.ch> <50A0B8DA.9090409@mu.org> <50A0C0F4.8010706@networx.ch> <EB2C22B5-C18D-4AC2-8694-C5C0D96C07B3@mu.org> <50A13961.1030909@networx.ch> <50A14460.9020504@mu.org> <50A1E2E7.3090705@mu.org>

This is a multi-part message in MIME format.
--------------090100000306090603090504
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 11/12/12 10:04 PM, Alfred Perlstein wrote:
> On 11/12/12 10:48 AM, Alfred Perlstein wrote:
>> On 11/12/12 10:01 AM, Andre Oppermann wrote:
>>>
>>> I've already added the tunable "kern.maxmbufmem" which is in pages.
>>> That's probably not very convenient to work with.  I can change it
>>> to a percentage of physmem/kva.  Would that make you happy?
>>>
>>
>> It really makes sense to size the hash table in relation to
>> sockets rather than buffers.
>>
>> If you are hashing "foo-objects" you want the hash size to be
>> related to the maximum number of "foo-objects" you'll see, not
>> derived backwards from the number of "bar-objects" that the
>> "foo-objects" contain, right?
>>
>> Because we are hashing the sockets, right?  Not clusters.
>>
>> Maybe I'm wrong?  I'm open to ideas.
>
> Hey Andre, the following patch is what I was thinking of
> (uncompiled/untested). It basically rounds maxsockets up to a
> power of 2 and uses that in place of the default tcb hashsize of 512.
>
> It might make sense to make the auto-tuning default to a minimum of 512.
>
> There are a number of other hashes with static sizes that could make 
> use of this logic provided it's not upside-down.
>
> Any thoughts on this?
>
> Tune the tcp pcb hash based on maxsockets.
> Be more forgiving of poorly chosen tunables by finding a closer power
> of two rather than clamping down to 512.
> Index: tcp_subr.c
> ===================================================================

Sorry, the GUI mangled the patch... attaching a plain-text version.



--------------090100000306090603090504
Content-Type: text/plain; charset=UTF-8; x-mac-type="0"; x-mac-creator="0";
 name="tcp_auto_tune_hash.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="tcp_auto_tune_hash.diff"

Index: tcp_subr.c
===================================================================
--- tcp_subr.c	(revision 242936)
+++ tcp_subr.c	(working copy)
@@ -235,7 +235,7 @@
  * variable net.inet.tcp.tcbhashsize
  */
 #ifndef TCBHASHSIZE
-#define TCBHASHSIZE	512
+#define TCBHASHSIZE	0
 #endif
 
 /*
@@ -282,6 +282,27 @@
 	return (0);
 }
 
+/*
+ * Take a value and get the next power of 2 that doesn't overflow.
+ * Used to size the tcp_inpcb hash buckets.
+ */
+static int
+maketcp_hashsize(int size)
+{
+	int hashsize;
+
+	/*
+	 * auto tune.
+	 * get the next power of 2 higher than the size passed in.
+	 */
+	hashsize = 1 << fls(size);
+	/* catch overflow, and just go one power of 2 smaller */
+	if (hashsize < size) {
+		hashsize = 1 << (fls(size) - 1);
+	}
+	return (hashsize);
+}
+
 void
 tcp_init(void)
 {
@@ -296,9 +317,20 @@
 
 	hashsize = TCBHASHSIZE;
 	TUNABLE_INT_FETCH("net.inet.tcp.tcbhashsize", &hashsize);
+	if (hashsize == 0) {
+		/* auto tune based on maxsockets */
+		hashsize = maketcp_hashsize(maxsockets);
+	}
+	/*
+	 * Be forgiving of admins that don't know to make the tunable
+	 * a power of two.
+	 */
 	if (!powerof2(hashsize)) {
-		printf("WARNING: TCB hash size not a power of 2\n");
-		hashsize = 512; /* safe default */
+		int oldhashsize = hashsize;
+
+		hashsize = maketcp_hashsize(hashsize);
+		printf("%s: WARNING: TCB hash size not a power of 2, "
+		    "fixed %d -> %d\n", __func__, oldhashsize, hashsize);
 	}
 	in_pcbinfo_init(&V_tcbinfo, "tcp", &V_tcb, hashsize, hashsize,
 	    "tcp_inpcb", tcp_inpcb_init, NULL, UMA_ZONE_NOFREE,

--------------090100000306090603090504--
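
The rounding done by maketcp_hashsize() in the patch above can be tried out in userland with a minimal sketch like the one below. This is not the kernel code. The names fls_sketch(), hashsize_sketch() and bucket_sketch() are hypothetical, fls_sketch() is a portable stand-in for the kernel's fls(), and the overflow case is checked before shifting rather than after, since shifting into the sign bit of an int is undefined in standard C. The last helper assumes the hash table is indexed by masking with (hashsize - 1), which is the usual reason a power-of-2 size is required.

```c
#include <assert.h>
#include <limits.h>

/*
 * Userland stand-in for fls(): 1-based index of the most significant
 * set bit, or 0 when mask is 0.  (Hypothetical helper for this sketch.)
 */
static int
fls_sketch(int mask)
{
	unsigned int m = (unsigned int)mask;
	int bit = 0;

	while (m != 0) {
		bit++;
		m >>= 1;
	}
	return (bit);
}

/*
 * Round size up to the next power of 2 above it.  As in the patch, an
 * input that is already a power of 2 is doubled.  When the next power
 * would not fit in an int, fall back to one power of 2 smaller.
 */
static int
hashsize_sketch(int size)
{
	int bits;

	assert(size > 0);
	bits = fls_sketch(size);
	/* 1 << bits would reach the sign bit, so step back one power. */
	if (bits >= (int)(sizeof(int) * CHAR_BIT) - 1)
		return (1 << (bits - 1));
	return (1 << bits);
}

/*
 * Assumed bucket selection: with a power-of-2 table size the modulo
 * reduces to a single AND against (hashsize - 1).
 */
static unsigned int
bucket_sketch(unsigned int hash, int hashsize)
{
	return (hash & ((unsigned int)hashsize - 1));
}
```

For example, hashsize_sketch(1000) yields 1024, and a tunable mistakenly set to 1000 would therefore be corrected to 1024 instead of being clamped to 512.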


