From owner-freebsd-hackers Sat Mar 1 05:24:42 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id FAA04763 for hackers-outgoing; Sat, 1 Mar 1997 05:24:42 -0800 (PST)
Received: from pdx1.world.net (pdx1.world.net [192.243.32.18]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id FAA04758 for ; Sat, 1 Mar 1997 05:24:40 -0800 (PST)
Received: from suburbia.net (suburbia.net [203.4.184.1]) by pdx1.world.net (8.7.5/8.7.3) with SMTP id FAA18704 for ; Sat, 1 Mar 1997 05:26:21 -0800 (PST)
Received: (qmail 9950 invoked by uid 110); 1 Mar 1997 10:44:39 -0000
MBOX-Line: From owner-netdev@roxanne.nuclecu.unam.mx Sat Mar 01 09:52:35 1997 remote from suburbia.net
Delivered-To: proff@suburbia.net
Received: (qmail 8779 invoked from network); 1 Mar 1997 09:52:29 -0000
Received: from roxanne.nuclecu.unam.mx (132.248.29.2) by suburbia.net with SMTP; 1 Mar 1997 09:52:29 -0000
Received: (from root@localhost) by roxanne.nuclecu.unam.mx (8.6.12/8.6.11) id DAA14241 for netdev-outgoing; Sat, 1 Mar 1997 03:35:34 -0600
Received: from caipfs.rutgers.edu (caipfs.rutgers.edu [128.6.37.100]) by roxanne.nuclecu.unam.mx (8.6.12/8.6.11) with ESMTP id DAA14236 for ; Sat, 1 Mar 1997 03:35:28 -0600
Received: from jenolan.caipgeneral (jenolan.rutgers.edu [128.6.111.5]) by caipfs.rutgers.edu (8.8.5/8.8.5) with SMTP id EAA12376 for ; Sat, 1 Mar 1997 04:33:00 -0500 (EST)
Received: by jenolan.caipgeneral (SMI-8.6/SMI-SVR4) id EAA09015; Sat, 1 Mar 1997 04:32:49 -0500
Date: Sat, 1 Mar 1997 04:32:49 -0500
Message-Id: <199703010932.EAA09015@jenolan.caipgeneral>
From: "David S. Miller"
To: netdev@roxanne.nuclecu.unam.mx
Subject: ok, final sockhash changes, new diff
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Ok, I've made the final major change I wanted to make, and hopefully
fixed the last bug or two (a corner case where a port already in use
would be given to a connect() request):

	vger.rutgers.edu:/pub/linux/Net/patches/fordyson3.diff.gz

I believe the TCP local port selection is now rather good.  Here is
approximately how it works.

The third hash table I mentioned previously is called the "bound"
hash.  It contains every socket which is holding onto a local port
and for which !(sk->dead && (sk->state == TCP_CLOSE)) holds.  It
exists purely to make good port selection and bind verification
faster; it serves no other purpose whatsoever.

Choosing a good TCP local port goes something like this.  A "start"
static local to the routine remembers where to begin looking so the
next search is faster.  It works most of the time; it is only updated
when we hit the best possible match in the bound hash (ie. the chain
is empty).  If an empty chain is found, the port we have reached is
used.  This is the fast case under low or zero load.

When every chain has at least one entry, a different strategy kicks
in.  We keep track of how deep the smallest chains tend to be.  I call
this static local "binding_contour" just to be silly ;-)  It tries to
short circuit the search: essentially, "if you find a chain whose
length is less than or equal to the binding contour, and the binding
contour is non-zero, it is almost certainly the best you will find,
so just return it now."

Another nice side effect of the binding contour technique is that it
should spread the bound hash table out evenly, and as sockets go away
it automatically adjusts to the changing load, perhaps dropping back
to the fast path mode if chains become entirely empty.
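To make that a bit more concrete, here is a rough sketch in plain C of
the shape of the selection loop.  To be clear, this is illustrative
only: the table size, the port range, the helper names, and the exact
rule for updating the contour are made up for the example, they are
not lifted from the diff.

/* Illustrative sketch only -- not the code in fordyson3.diff.
 * Sizes, names, and the contour update rule are invented here
 * just to show the shape of the algorithm described above.
 */
#define BOUND_HASH_SIZE  32      /* number of chains in the "bound" hash */
#define PORT_RANGE_LOW   1024
#define PORT_RANGE_HIGH  4999

/* Stand-in for the real bound hash: just remember chain depths. */
static int chain_len[BOUND_HASH_SIZE];

static int bound_chain_len(unsigned short port)
{
	return chain_len[port % BOUND_HASH_SIZE];
}

static unsigned short start = PORT_RANGE_LOW; /* where the last search succeeded */
static int binding_contour = 0;               /* how deep the smallest chains tend to be */

unsigned short pick_local_port(void)
{
	unsigned short port = start;
	unsigned short best = 0;
	int best_len = -1;
	int i;

	for (i = 0; i <= (PORT_RANGE_HIGH - PORT_RANGE_LOW); i++) {
		int len = bound_chain_len(port);

		if (len == 0) {
			/* Fast path: an empty chain is the best possible
			 * match, and the only case that moves "start".
			 */
			start = port;
			binding_contour = 0;
			return port;
		}

		/* Short circuit: a chain no deeper than the current
		 * contour is almost certainly as good as we will find.
		 */
		if (binding_contour && len <= binding_contour)
			return port;

		if (best_len < 0 || len < best_len) {
			best_len = len;
			best = port;
		}

		if (++port > PORT_RANGE_HIGH)
			port = PORT_RANGE_LOW;
	}

	/* No empty chain anywhere: remember how shallow the best
	 * chain was so the next search can short circuit early.
	 */
	binding_contour = best_len;
	return best;
}

Note that in this sketch the contour is only ever touched from inside
the selection routine itself, which is exactly where the issue I
describe next comes from.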
There is one issue I still have with it: its state is only updated
when a port selection actually occurs.  There is potentially a really
bad case where hundreds of sockets disappear in between two selection
attempts, potentially causing an extremely poor selection (its idea of
the load is 500 sockets old ;-)  I considered adding a timer to adjust
things so this doesn't happen, but I found that to be a ridiculous
idea.

Nevertheless, I've beaten up my machine and I'm rather confident in
the changes.  In particular I completely exhausted the entire TCP port
space 30 or 40 times with connections on my test box.  All I got was
one stuck socket, which timed out and disappeared just fine. ;-)

Can you say "tcpcrashme" ;-)

---------------------------------------------////
Yow! 11.26 MB/s remote host TCP bandwidth & ////
199 usec remote TCP latency over 100Mb/s   ////
ethernet.  Beat that!                     ////
-----------------------------------------////__________  o
David S. Miller, davem@caip.rutgers.edu /_____________/ / // /_/ ><