From: Dennis Yusupoff
To: freebsd-net@freebsd.org
Date: Thu, 06 Feb 2014 11:14:53 +0400
Subject: PF states degrade?
Message-ID: <52F3366D.3030202@smartspb.net>

Good day.

We have started testing FreeBSD 10.0 in production (pf nat, ipfw pipes, ng_netflow) with settings (sysctl, pf.conf, ipfw.conf, etc.) taken from a similar, rock-solid 9.0-STABLE server. The server worked fine for about 5 days and then suddenly stopped forwarding traffic from clients. What was quite unexpected is how it happened.
Traffic from customers disappears from the LAN interface (as seen in tcpdump) about 10 seconds after a _connection_ has been started (once the NAT translation state has been created?). With "set debug loud" in pf.conf, strange records appear in the pf log at that moment. For example, for 10.53.80.224 NATed to 109.71.177.147, with an HTTP connection to 213.180.204.183:
---
Feb 5 20:41:21 nata2 kernel: pf: State failure on: 1 | 5
Feb 5 20:41:21 nata2 kernel: pf: BAD state: TCP out wire: 213.180.204.183:80
Feb 5 20:41:21 nata2 kernel: 109.71.177.147:50114 stack: 213.180.204.183:80 10.53.80.224:50114 [lo=1997798965 high=1997799354 win=2772 modulator=0]
Feb 5 20:41:21 nata2 kernel: [lo=864623348 high=864624718 win=389 modulator=0] 4:4 A seq=864739382 (864739382) ack=1997798965 len=1398 ackskew=0 pkts=3:2 dir=in,rev
---
The full log is here: http://pastebin.com/CQ78JyJe

Disabling and re-enabling PF made no difference (except, of course, that NAT stopped working). After all these attempts we ran "pfctl -d" and set up ipfw nat for that customer. Everything worked fine! So we believe this is a problem, unknown to us, related to PF and its state handling.
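To make the numbers in the "BAD state" record easier to read, here is a minimal Python sketch that pulls the lo/high/win values and the packet's seq/ack/len out of the quoted log text and computes how far the packet's sequence number lies from the logged "high" value. This is not part of pf; the names parse and seq_diff are hypothetical helpers, and the sketch assumes that the second bracket describes the window pf tracks for the peer that sent the logged packet, which is only a guess based on the magnitudes of the numbers.

```python
import re

# Window and packet fields copied from the log record quoted above.
LOG = (
    "[lo=1997798965 high=1997799354 win=2772 modulator=0] "
    "[lo=864623348 high=864624718 win=389 modulator=0] "
    "4:4 A seq=864739382 (864739382) ack=1997798965 len=1398 ackskew=0"
)


def parse(log):
    """Return (list of per-peer windows, packet fields) from the log text."""
    peers = [
        {"lo": int(lo), "high": int(high), "win": int(win)}
        for lo, high, win in re.findall(r"\[lo=(\d+) high=(\d+) win=(\d+)", log)
    ]
    # \b...= keeps "ackskew=0" from being mistaken for "ack=".
    pkt = {k: int(v) for k, v in re.findall(r"\b(seq|ack|len)=(\d+)", log)}
    return peers, pkt


def seq_diff(a, b):
    """Signed distance a - b in 32-bit TCP sequence space (handles wraparound)."""
    return ((a - b + 2**31) % 2**32) - 2**31


peers, pkt = parse(LOG)
sender = peers[1]  # assumption: second bracket belongs to the packet's sender

gap = seq_diff(pkt["seq"], sender["high"])
end_gap = seq_diff(pkt["seq"] + pkt["len"], sender["high"])

print("seq is", gap, "bytes past the logged high")
print("seq+len is", end_gap, "bytes past the logged high")
print("ack equals the other peer's lo:", pkt["ack"] == peers[0]["lo"])
```

Under that assumption, the packet starts 114664 bytes beyond the logged "high" value, while its ack matches the other side's "lo" exactly. The sketch only restates what the log already shows and does not explain why pf's view of the window fell behind.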
PF rules and settings:
---
ext_if="lagg0"
int_if_1="vlan22"
int_if_2="vlan21"
dst_nat1="109.71.177.128/25"
dst_nat2="109.71.177.0/25"

table persist file "/etc/pf.src-nat"
table const { 80.249.176.0/20, 93.92.192.0/21, 109.71.176.0/21, 217.119.16.0/20 }
table persist { 10.52.249.24 }
table persist { 84.204.97.154, 213.180.204.32, 195.95.218.31, 195.95.218.30 }

set limit { states 1000000, frags 80000, src-nodes 100000, table-entries 500000}
set state-policy if-bound
set optimization aggressive
set debug urgent
set ruleset-optimization profile
set timeout { frag 10, tcp.established 3600, src.track 30 }
set block-policy drop
set require-order no
set skip on {lo0, em0, pfsync0}

table persist

pass in quick on $int_if_1 proto tcp from to any port smtp flags S/SAFR keep state
pass in quick on $int_if_2 proto tcp from to any port smtp flags S/SAFR keep state
pass in on $int_if_1 proto tcp from any to any port smtp flags S/SAFR keep state \
    (max-src-conn 15, max-src-conn-rate 15/30, overload flush global)
block return-icmp (host-prohib) log quick proto tcp from to any port smtp
pass in on $int_if_2 proto tcp from any to any port smtp flags S/SAFR keep state \
    (max-src-conn 15, max-src-conn-rate 15/30, overload flush global)
block return-icmp (host-prohib) log quick proto tcp from to any port smtp

pass in quick on $int_if_1 all no state allow-opts tag NAT1 label "$nr:NAT1"
pass in quick on $int_if_2 all no state allow-opts tag NAT2 label "$nr:NAT2"

binat-anchor "binat"
load anchor "binat" from "/etc/pf.anchor.binat"
nat-anchor "ftp-proxy/*"
rdr-anchor "ftp-proxy/*"
rdr pass on $int_if_1 proto tcp from to any port 21 -> 127.0.0.1 port 8021
rdr pass on $int_if_2 proto tcp from to any port 21 -> 127.0.0.1 port 8021
rdr pass on $ext_if proto udp from 109.71.176.3 to 109.71.176.2 port 4784 -> 10.78.76.2 port 4784
nat on $ext_if from to any tagged NAT1 -> $dst_nat1 static-port source-hash #sticky-address
nat on $ext_if from to any tagged NAT2 -> $dst_nat2 static-port source-hash #sticky-address
nat on $ext_if from any to -> $dst_nat1 static-port source-hash #sticky-address
binat on $ext_if from 10.78.78.2 to any -> 93.92.199.252
nat on $ext_if from 10.78.76.0/24 to any -> 109.71.176.2 static-port source-hash
nat on $ext_if from 10.78.77.0/24 to any -> 93.92.199.254
nat on $ext_if from 10.78.78.0/24 to any -> $dst_nat1 static-port source-hash

anchor "ftp-proxy/*"
pass out quick proto tcp from any to any port 21 no state
pass quick on $ext_if proto gre all no state
---

*P. S. Traffic started being forwarded with pf again only after the server was rebooted.*

--
Best regards,
Dennis Yusupoff,
network engineer of
Smart-Telecom ISP
Russia, Saint-Petersburg