From: Daniel Hartmeier
To: Rolf Grossmann
Cc: freebsd-pf@freebsd.org
Date: Thu, 28 Sep 2006 23:52:09 +0200
Subject: Re: BAD state/State failure with large number of requests

On Thu, Sep 28, 2006 at 11:30:48PM +0200, Rolf Grossmann wrote:

> Sep 28 23:56:56 balancer kernel: pf: BAD state: TCP 10.1.1.2:8080 10.25.0.41:8080 10.25.0.100:52209 [lo=2341692840 high=2341759447 win=33304 modulator=0 wscale=1] [lo=2919421554 high=2919488162 win=33304
>   modulator=0 wscale=1] 9:9 S seq=2345137961 ack=2919421554 len=0 ackskew=0 pkts=6:5 dir=in,fwd
> Sep 28 23:56:56 balancer kernel: pf: State failure on: 1 | 5

This means there is an existing state entry from an old (and already
closed) connection, and the client is re-using its source port 52209 for
a new connection attempt (it's a SYN packet that triggered the log
message).

The client is not honouring the 2MSL quiet period, the time it should
wait before re-using the same source port to connect to the same
destination address/port, as required by the TCP RFCs.

The reason for that is quite likely that it has run out of random high
source ports. The range used should be about 49152-65535 (sysctl
net.inet.ip.portrange.*), which is about 16,000 ports, and 10,000
connections is getting close. In this case the client stack can either
make the application fail in connect(2), or re-use source ports and
violate the RFCs.

I'm not sure this is a realistic test, i.e. whether you see the very
same problem in production (with 'BAD state' messages for SYN packets).
It would only occur if one client is establishing connections to the
same server port at high concurrency and/or rate. If not, I'd say the
test is simply flawed, and you need multiple clients to simulate
realistically.

pf keeps state entries around for a while after a connection has been
closed, to catch packets related to the old connection that might arrive
late. The timeout that controls this is tcp.closed, 90s by default. You
can make pf purge such state entries sooner by lowering this timeout.

This most likely has nothing to do with rdr and load-balancing. The
difference between enabling and disabling your rdr rule is basically
that of filtering statefully vs. statelessly. Your 'pass all' rule does
not create state, while the rdr will automatically create state.

Daniel
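
As a rough back-of-the-envelope check of the port arithmetic above, the sketch below estimates how many new connections per second a single client could open to one destination address/port before it has to re-use a source port that pf may still hold a closed state entry for. The range bounds are assumed values (check sysctl net.inet.ip.portrange.* on the client), the function names are made up for illustration, and the model assumes ports are cycled evenly. With random port selection, collisions will likely show up earlier than this estimate suggests.

```python
# Back-of-the-envelope estimate, not a measurement.

def usable_ports(first: int, last: int) -> int:
    """Number of source ports in the inclusive range first..last."""
    return last - first + 1


def max_rate_without_reuse(ports: int, hold_seconds: float) -> float:
    """Highest sustained connections/second such that no source port is
    re-used within hold_seconds of its previous use, assuming the ports
    are cycled evenly (an idealisation)."""
    return ports / hold_seconds


PORT_FIRST = 49152  # assumed lower bound of the client's high port range
PORT_LAST = 65535   # assumed upper bound (highest valid TCP port)
TCP_CLOSED = 90     # pf tcp.closed timeout in seconds (default per the mail)

ports = usable_ports(PORT_FIRST, PORT_LAST)
rate = max_rate_without_reuse(ports, TCP_CLOSED)
print(f"{ports} ports, about {rate:.0f} new connections/s before re-use")
```

Lowering tcp.closed raises that ceiling proportionally under the same assumptions, which is why purging closed state entries sooner helps with a single-client test.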