Date: Thu, 7 Sep 2006 00:18:56 -0500 (CDT)
From: Mike Silbersack <silby@silby.com>
To: Gleb Smirnoff <glebius@FreeBSD.org>
Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/netinet in_pcb.c tcp_subr.c tcp_timer.c tcp_var.h
Message-ID: <20060907000939.J12826@odysseus.silby.com>
In-Reply-To: <20060906150129.GT40020@FreeBSD.org>
References: <200609061356.k86DuZ0w016069@repoman.freebsd.org> <20060906091204.B6691@odysseus.silby.com> <20060906143204.GQ40020@FreeBSD.org> <20060906093553.L6691@odysseus.silby.com> <20060906150129.GT40020@FreeBSD.org>

On Wed, 6 Sep 2006, Gleb Smirnoff wrote:

> I think we should free the oldmost tcptw entry in a case if we can't
> find the local endpoint. We can tell definitely that we can't find one
> only in in_pcbbind_setup() in the "do {} while (in_pcblookup_local)" cycle,
> where EADDRNOTAVAIL is returned. We can't definitely tell this in
> in_pcblookup_local() since we don't know whether tried port is the
> last one.
>
> The oldmost tcptw entry can be taken simply from the ordered list, like
> tcp_timer_2msl_tw() does this.

That's something along the lines of what I was thinking. However, I think
it'll be slightly more complex than taking just the oldest entry from the
list. We could have time_wait states that are for sockets such as
remoteip:ephemeralport <-> localip:80 and also
localip:ephemeralport <-> remoteip:80. We'd have to find one of the ones
of the second type to recycle.

I think I know why my implementation went so wrong - I was testing the
case where I had http_load (or was it apachebench?) connecting to apache
on another machine. The case I was trying to solve was where the http
benchmark tool created all the time_wait sockets on the client, thereby
preventing new connections from being made.

In that case, the heuristic would (probably) recycle the first socket it
came upon, and be done. In your case, it would recycle the first socket
it came upon, but it would be one of the
remoteip:ephemeralport <-> localip:80 sockets, which wouldn't help it at
all. Does that sound like what was happening?

(I haven't reviewed the code, and I'm speaking from memory, so I
apologize if I have the details slightly off.)

> However, I don't like the idea of "finding" the free port at all. This
> makes connect()/bind() performance depending on number of busy endpoints.
> Shouldn't we make an algorythm, where free endpoints are stored, and
> we don't need to _find_ one, we could just _take_ one?

That's an interesting question. I guess right now the assumption is that
you have 65535 ports, and very few of them are used, so it's cheaper to
guess and see if one isn't used. You, on the other hand, seem to have a
large number in use, so things are quite different.

I guess you could make a port freelist. That would also solve the problem
of randomized ephemeral ports causing a port to be reused too quickly.
I'd be happy to review any such patch you could come up with in this
area.

> M> With this code removed, are you not seeing the web frontends delaying new
> M> connections when they can't find a free port to use?
>
> No. We monitor the amount of entries in tcptw zone. It is the same
> as before. So the periodic cycle purges tcptw states at the same
> rate as in_pcblookup_local() was, except that it does this consuming
> less CPU time.

Ok, so you weren't actually running out of ephemeral ports like I was in
the http benchmark tool scenario then.

Mike "Silby" Silbersack
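
For illustration only, the "take one instead of find one" port freelist
idea discussed above might look roughly like the userspace sketch below.
It is not FreeBSD kernel code and not a proposed patch: the names
(port_freelist, port_freelist_take, port_freelist_release), the
49152-65535 range, and the FIFO ordering are all assumptions made for the
example, and locking is left out entirely. The FIFO order is one possible
way to keep a just-released port from being handed out again right away.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ephemeral range for this sketch; not taken from the thread. */
#define PORT_FIRST 49152u
#define PORT_LAST  65535u
#define PORT_COUNT (PORT_LAST - PORT_FIRST + 1u)

/* Ring buffer holding every currently free port, oldest-freed first. */
struct port_freelist {
    uint16_t ring[PORT_COUNT];
    size_t   head;   /* index of the next port to hand out */
    size_t   count;  /* number of free ports currently in the ring */
};

/* Start with every port in the range free, in ascending order. */
static void
port_freelist_init(struct port_freelist *fl)
{
    for (size_t i = 0; i < PORT_COUNT; i++)
        fl->ring[i] = (uint16_t)(PORT_FIRST + i);
    fl->head = 0;
    fl->count = PORT_COUNT;
}

/*
 * Take the port that has been free the longest. O(1), independent of how
 * many ports are busy. Returns 0 on success, -1 if no port is free.
 */
static int
port_freelist_take(struct port_freelist *fl, uint16_t *port)
{
    if (fl->count == 0)
        return -1;
    *port = fl->ring[fl->head];
    fl->head = (fl->head + 1) % PORT_COUNT;
    fl->count--;
    return 0;
}

/*
 * Put a port back at the tail, so it is reused only after every port
 * that was already free. The caller must not release a port twice;
 * this sketch does not detect that. Returns -1 if the ring is full.
 */
static int
port_freelist_release(struct port_freelist *fl, uint16_t port)
{
    if (fl->count == PORT_COUNT)
        return -1;
    fl->ring[(fl->head + fl->count) % PORT_COUNT] = port;
    fl->count++;
    return 0;
}
```

A caller would run port_freelist_init() once, call port_freelist_take()
where a free port is currently searched for, and call
port_freelist_release() once the port is no longer in use. A real version
would need one list per local address (or per local/remote pair), which
this sketch ignores.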