Date:      Fri, 29 Sep 2006 01:00:30 +0200
From:      Rolf Grossmann <rg@progtech.net>
To:        Daniel Hartmeier <daniel@benzedrine.cx>
Cc:        freebsd-pf@freebsd.org
Subject:   Re: BAD state/State failure with large number of requests
Message-ID:  <451C540E.2010005@PROGTECH.net>
In-Reply-To: <20060928215208.GC25341@insomnia.benzedrine.cx>
References:  <200609282130.k8SLUmU8089296@progtech.net> <20060928215208.GC25341@insomnia.benzedrine.cx>

Hi,

thank you very much for your fast response.

Daniel Hartmeier wrote:

> The client is not honouring the 2MSL quiet period, the time it should
> wait before re-using the same source port to connect to the same
> destination address/port, as required by the TCP RFCs.
> 
> The reason for that is quite likely that it has run out of random high
> source ports. The range used should be about 49152-65535 (sysctl
> net.inet.ip.portrange.*), and 10,000 connections is getting close. The
> client stack can either make the application fail in connect(2), or
> re-use source ports and violate the RFCs in this case.

You're absolutely correct; that seems to be my problem. Increasing the 
range allows me to get a lot more requests through.
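
As a rough illustration of why widening the range helps, here is a small
sketch that counts the source ports available in a range. It assumes the
range is inclusive and ends at 65535 (the highest valid port); the widened
range starting at 10000 is purely hypothetical and is not the value
actually used in this test.

```python
def port_count(first: int, last: int) -> int:
    """Return the number of source ports in the inclusive range [first, last]."""
    if not (0 < first <= last <= 65535):
        raise ValueError("ports must satisfy 0 < first <= last <= 65535")
    return last - first + 1


# Default high range discussed in the quoted text (assumed inclusive).
default_ports = port_count(49152, 65535)

# Hypothetical widened range, chosen only for illustration.
widened_ports = port_count(10000, 65535)

print(default_ports, widened_ports)
```

With about 16,000 ports in the default range, a test that opens 10,000
connections to one destination address/port in quick succession has little
headroom before a source port has to be reused.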

> Not sure if this is a realistic test, i.e. whether you see the very same
> problem in production (with 'BAD state' messages for SYN packets), it
> would only occur if one client is establishing connections to the same
> server port at high concurrency and/or rate. If not, I'd say the test is
> simply flawed, and you need multiple clients to simulate realistically.

I've been suspecting that the test is flawed, but I couldn't put my 
finger on it. However, I also need a way to actually test my 
application with a lot of requests and I wouldn't want to buy another 
server farm for that ;)

> pf keeps state entries around for a while after a connection has been
> closed (to catch packets related to the old connection that might arrive
> late), the timeout is tcp.closed, 90s by default. You can make pf purge
> such state entries sooner by lowering this timeout.

That timeout seems awfully long to me. Is there some standard that 
mandates such a long timeout? At least for testing I will definitely 
lower that, too.
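
To get a feel for what lowering the timeout buys, the following sketch
estimates the highest steady connection rate a single client could sustain
to one destination address/port without reusing a source port while pf
still holds the old state. It is a back-of-the-envelope model only: it
assumes every closed connection lingers for the full tcp.closed timeout
and that ports are used evenly. The function name and the 10 second value
are hypothetical.

```python
def max_reuse_free_rate(ports: int, linger_seconds: float) -> float:
    """Upper bound on new connections per second before a port must be reused.

    ports          -- number of usable source ports
    linger_seconds -- how long a closed connection's state is kept
    """
    if ports <= 0 or linger_seconds <= 0:
        raise ValueError("ports and linger_seconds must be positive")
    return ports / linger_seconds


ports = 65535 - 49152 + 1  # default high port range, assumed inclusive

rate_default = max_reuse_free_rate(ports, 90)  # tcp.closed default from the quoted text
rate_lowered = max_reuse_free_rate(ports, 10)  # hypothetical lowered timeout

print(f"{rate_default:.1f} conn/s at 90s, {rate_lowered:.1f} conn/s at 10s")
```

Under these assumptions the bound scales inversely with the timeout, so
either a wider port range or a shorter tcp.closed raises the rate at which
the test can run before 'BAD state' messages appear.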

Thanks again, Rolf.


