Date:      Fri, 29 May 2009 11:32:46 -0400
From:      Martin Turgeon <freebsd@optiksecurite.com>
To:        freebsd-pf@freebsd.org
Subject:   Re: State Mismatch and tcp.closed
Message-ID:  <4A20001E.5000407@optiksecurite.com>
In-Reply-To: <4A1EB5A0.7030206@optiksecurite.com>
References:  <4A1EB5A0.7030206@optiksecurite.com>

Martin Turgeon wrote:
> Hi list!
> 
> I had a problem with state mismatches on my DB server that I solved by
> lowering the tcp.closed timeout. I set it to 2 seconds instead of 90.
> 
> I now have what looks like the same problem on the front-end web server.
> However, when I applied the same fix there, I got connection problems
> with the back-end DB, although the state mismatches disappeared.
> 
> On the front-end web server, the state mismatch occurs on the external 
> interface, only on port 80.
> 
> I enabled misc debugging and got this in /var/log/messages on the 
> front-end web server:
> 
> May 28 05:02:19 francis kernel: pf: BAD state: TCP 127.0.0.25:80 
> 206.125.166.65:80 98.207.239.10:54737 [lo=820536733 high=820603340 
> win=65535 modulator=0 wscale=0] [lo=2871317100 high=2871375106 win=8326 
> modulator=0 wscale=3] 7:4 R seq=820536733 (820536732) ack=2871317100 
> len=0 ackskew=0 pkts=43:69 dir=in,fwd
> May 28 05:02:19 francis kernel: pf: State failure on:         |
> May 28 05:02:19 francis kernel: pf: BAD state: TCP 127.0.0.25:80 
> 206.125.166.65:80 98.207.239.10:54733 [lo=374985971 high=375052578 
> win=65535 modulator=0 wscale=0] [lo=2999164748 high=2999229169 win=8326 
> modulator=0 wscale=3] 7:4 R seq=374985971 (374985970) ack=2999164748 
> len=0 ackskew=0 pkts=40:54 dir=in,fwd
> May 28 05:02:19 francis kernel: pf: State failure on:         |
> May 28 05:03:06 francis kernel: pf: BAD state: TCP 127.0.0.20:80 
> 206.125.166.80:80 123.116.84.41:59776 [lo=3407758259 high=3407823796 
> win=4096 modulator=0 wscale=2] [lo=374200006 high=374216390 win=8192 
> modulator=0 wscale=3] 4:2 A seq=3407758259 (3407758260) ack=2320196160 
> len=0 ackskew=-1945996154 pkts=1:1 dir=in,fwd
> May 28 05:03:06 francis kernel: pf: State failure on:     3   |
> May 28 05:03:06 francis kernel: pf: BAD state: TCP 127.0.0.20:80 
> 206.125.166.80:80 123.116.84.41:59776 [lo=3407758259 high=3407823796 
> win=4096 modulator=0 wscale=2] [lo=374200006 high=374216390 win=8192 
> modulator=0 wscale=3] 4:2 RA seq=3407758259 (3407758260) ack=2320196160 
> len=0 ackskew=-1945996154 pkts=1:1 dir=in,fwd
> 
> This server has been up for 12 days and has already logged almost
> 600000 state mismatches!
> 
> I tried lowering tcp.finwait, setting optimization to aggressive, and
> disabling port randomization via sysctl. None of these made any
> difference.
> 
> I ran tcpdump and saw only a few RSTs, so I don't understand why
> lowering tcp.closed would solve my problem. If it's a problem with
> source port reuse, tcp.finwait should be the timeout that helps, not
> tcp.closed, right?
> 
> How can a lower tcp.closed on the front-end cause MySQL connection
> problems with the back-end? I ran tcpdump during one of the DB
> connection problems and nothing looks wrong: no RST at all! The
> front-end web server tries to connect to the DB, waits 3 seconds and,
> if it fails to establish a connection, tries to connect to a
> read-only backup DB on another server, which never fails to connect.
> 
> The only thing I'm sure of is that the tcp.closed setting causes the DB
> connection problems. As soon as I remove it, the state mismatches come
> back on the external interface but the DB connection problems go
> away.
> 
> What am I missing?
> 
> Martin
> 
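
For anyone who wants to go through the "BAD state" lines quoted above in
bulk, here is a minimal sketch of a parser. The function names, field
names and regular expression are my own and only follow the layout of the
lines in this message; they are not taken from the pf source. The check
that ackskew equals the second bracket's lo minus the packet's ack holds
for the entries quoted here, but treating it as pf's actual definition is
an assumption.

```python
import re

# Layout assumed from the debug lines quoted in this message only.
BAD_STATE = re.compile(
    r"pf: BAD state: TCP (?P<addr1>\S+) (?P<addr2>\S+) (?P<addr3>\S+) "
    r"\[lo=(?P<first_lo>\d+) high=(?P<first_high>\d+) win=\d+ "
    r"modulator=\d+(?: wscale=\d+)?\] "
    r"\[lo=(?P<second_lo>\d+) high=(?P<second_high>\d+) win=\d+ "
    r"modulator=\d+(?: wscale=\d+)?\] "
    r"(?P<states>\d+:\d+) (?P<flags>[A-Z]+) "
    r"seq=(?P<seq>\d+) \(\d+\) ack=(?P<ack>\d+) "
    r"len=(?P<len>\d+) ackskew=(?P<ackskew>-?\d+) pkts=(?P<pkts>\d+:\d+)"
)

NUMERIC = ("first_lo", "first_high", "second_lo", "second_high",
           "seq", "ack", "len", "ackskew")


def unwrap(quoted):
    """Join one log entry that the mail client wrapped over '> ' lines."""
    parts = [line.lstrip("> ").strip() for line in quoted.splitlines()]
    return " ".join(p for p in parts if p)


def parse_bad_state(line):
    """Return a dict of fields for a 'BAD state' line, or None otherwise."""
    m = BAD_STATE.search(line)
    if m is None:
        return None
    rec = m.groupdict()
    for key in NUMERIC:
        rec[key] = int(rec[key])
    # Assumption: ackskew is the second bracket's lo minus the packet's ack.
    rec["ackskew_check"] = rec["second_lo"] - rec["ack"]
    return rec


# One of the entries quoted above, with the mail wrapping left in place.
SAMPLE = """\
> May 28 05:03:06 francis kernel: pf: BAD state: TCP 127.0.0.20:80
> 206.125.166.80:80 123.116.84.41:59776 [lo=3407758259 high=3407823796
> win=4096 modulator=0 wscale=2] [lo=374200006 high=374216390 win=8192
> modulator=0 wscale=3] 4:2 A seq=3407758259 (3407758260) ack=2320196160
> len=0 ackskew=-1945996154 pkts=1:1 dir=in,fwd
"""

if __name__ == "__main__":
    rec = parse_bad_state(unwrap(SAMPLE))
    print(rec["states"], rec["flags"], rec["ackskew"], rec["ackskew_check"])
```

Grouping the parsed records by states, flags and pkts should make it
easier to see whether the mismatches are mostly fresh connections hitting
an old state (pkts=1:1) or resets arriving on long-lived ones.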

I forgot to mention in my original post which versions I'm running:

uname -a on the front-end web server:
FreeBSD webserver 7.2-RELEASE FreeBSD 7.2-RELEASE #0: Fri May  1 
07:18:07 UTC 2009 
root@driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64

uname -a on the back-end MySQL server:
FreeBSD mysql 7.0-RELEASE-p5 FreeBSD 7.0-RELEASE-p5 #1: Tue Oct  7 
09:57:31 EDT 2008 root@martin.ringadmin.com:/usr/obj/usr/src/sys/OPTIK 
amd64

I read about the port reuse problem when I first experienced it with the
DB server, and I saw that it was not supposed to happen with the new
release. I was happy to build a new 7.2-RELEASE server so that I wouldn't
face the same problem.

But, in fact, I'm facing what looks like the same problem...

I'm all ears for any pointers or suggestions!

Thanks for your valuable help.

Martin
