From owner-freebsd-net@FreeBSD.ORG Thu Feb 6 16:59:12 2014
Message-ID: <52F3BAB6.7090304@shrew.net>
Date: Thu, 06 Feb 2014 10:39:18 -0600
From: Matthew Grooms <mgrooms@shrew.net>
To: freebsd-net@freebsd.org
Subject: Re: PF states degrade?
In-Reply-To: <52F3366D.3030202@smartspb.net>
List-Id: Networking and TCP/IP with FreeBSD

On 2/6/2014 1:14 AM, Dennis Yusupoff wrote:
> Good day.
>
> We have started testing FreeBSD 10.0 in production (pf nat, ipfw
> pipes, ng_netflow) with the same settings (sysctl, pf.conf, ipfw.conf,
> etc.) as a similar rock-solid 9.0-STABLE box.
> The server worked fine for about 5 days and then suddenly stopped
> forwarding traffic from clients. What was quite unexpected is how it
> happened. Traffic from customers disappears (as seen in tcpdump) on the
> LAN interface about 10 seconds after a _connection_ has been started
> (after the NAT translation state has been created?). With "debug loud"
> set in pf.conf, a strange record appears in the pf log at that moment,
> like this:
>
> 10.53.80.224 is NAT'ed to 109.71.177.147, http connection to 213.180.204.183:
> ---
> Feb 5 20:41:21 nata2 kernel: pf: State failure on: 1 | 5
> Feb 5 20:41:21 nata2 kernel: pf: BAD state: TCP out wire: 213.180.204.183:80
> Feb 5 20:41:21 nata2 kernel: 109.71.177.147:50114 stack: 213.180.204.183:80 10.53.80.224:50114 [lo=1997798965 high=1997799354 win=2772 modulator=0]
> Feb 5 20:41:21 nata2 kernel: [lo=864623348 high=864624718 win=389 modulator=0] 4:4 A seq=864739382 (864739382) ack=1997798965 len=1398 ackskew=0 pkts=3:2 dir=in,rev
> ---
> Full log here: http://pastebin.com/CQ78JyJe
>
> Disabling/enabling PF makes no difference (except, of course, that nat
> stops working).
>
> After all other attempts we ran "pfctl -d" and set up ipfw nat for that
> customer. Everything worked fine! So we believe there is a problem,
> unknown to us, related to PF and its state handling.
>
> PF rules and settings:
>
> ---
> ext_if="lagg0"
> int_if_1="vlan22"
> int_if_2="vlan21"
>
> dst_nat1="109.71.177.128/25"
> dst_nat2="109.71.177.0/25"
>
> table persist file "/etc/pf.src-nat"
> table const { 80.249.176.0/20, 93.92.192.0/21, 109.71.176.0/21, 217.119.16.0/20 }
> table persist { 10.52.249.24 }
>
> table persist { 84.204.97.154, 213.180.204.32, 195.95.218.31, 195.95.218.30 }
>
> set limit { states 1000000, frags 80000, src-nodes 100000, table-entries 500000 }
> set state-policy if-bound
> set optimization aggressive
> set debug urgent
> set ruleset-optimization profile
> set timeout { frag 10, tcp.established 3600, src.track 30 }
> set block-policy drop
> set require-order no
>
> set skip on { lo0, em0, pfsync0 }
>
> table persist
> pass in quick on $int_if_1 proto tcp from to any port smtp flags S/SAFR keep state
> pass in quick on $int_if_2 proto tcp from to any port smtp flags S/SAFR keep state
> pass in on $int_if_1 proto tcp from any to any port smtp flags S/SAFR keep state \
>     (max-src-conn 15, max-src-conn-rate 15/30, overload flush global)
> block return-icmp (host-prohib) log quick proto tcp from to any port smtp
>
> pass in on $int_if_2 proto tcp from any to any port smtp flags S/SAFR keep state \
>     (max-src-conn 15, max-src-conn-rate 15/30, overload flush global)
> block return-icmp (host-prohib) log quick proto tcp from to any port smtp
>
> pass in quick on $int_if_1 all no state allow-opts tag NAT1 label "$nr:NAT1"
> pass in quick on $int_if_2 all no state allow-opts tag NAT2 label "$nr:NAT2"
>
> binat-anchor "binat"
> load anchor "binat" from "/etc/pf.anchor.binat"
> nat-anchor "ftp-proxy/*"
> rdr-anchor "ftp-proxy/*"
> rdr pass on $int_if_1 proto tcp from to any port 21 -> 127.0.0.1 port 8021
> rdr pass on $int_if_2 proto tcp from to any port 21 -> 127.0.0.1 port 8021
> rdr pass on $ext_if proto udp from 109.71.176.3 to 109.71.176.2 port 4784 -> 10.78.76.2 port 4784
>
> nat on $ext_if from to any tagged NAT1 -> $dst_nat1 static-port source-hash #sticky-address
> nat on $ext_if from to any tagged NAT2 -> $dst_nat2 static-port source-hash #sticky-address
> nat on $ext_if from any to -> $dst_nat1 static-port source-hash #sticky-address
>
> binat on $ext_if from 10.78.78.2 to any -> 93.92.199.252
>
> nat on $ext_if from 10.78.76.0/24 to any -> 109.71.176.2 static-port source-hash
> nat on $ext_if from 10.78.77.0/24 to any -> 93.92.199.254
> nat on $ext_if from 10.78.78.0/24 to any -> $dst_nat1 static-port source-hash
>
> anchor "ftp-proxy/*"
> pass out quick proto tcp from any to any port 21 no state
>
> pass quick on $ext_if proto gre all no state
> ---
>
> *P. S. Traffic start forwarding with pf only after server has been

Dennis,

Did you run out of pf state table entries? You can use pfctl to list
the current limit and usage ...

INFO:
Status: Enabled for 14 days 19:48:29          Debug: Urgent

State Table                          Total             Rate
  current entries                        4
  searches                         2030427            1.6/s
  inserts                            64990            0.1/s
  removals                           64986            0.1/s

LIMITS:
states        hard limit    10000
src-nodes     hard limit    10000
frags         hard limit     5000
table-entries hard limit   200000

..

If that is the case, you can increase your state table size by
inserting some configuration parameters at the top of your pf.conf
file. For example ...

set limit states 50000
set limit src-nodes 50000
set limit frags 25000

-Matthew
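
[Editor's note] The check Matthew describes can be scripted. The sketch below is not from the thread: it parses the two relevant lines of `pfctl -s info` / `pfctl -s memory` output with awk and reports how full the state table is. Since pfctl needs root and a live pf, the sample variables here are stubbed with the numbers shown in the reply above; on a real system you would substitute `pfctl -si | grep 'current entries'` and `pfctl -sm | grep '^states'`.

```shell
#!/bin/sh
# Sketch: compare current pf state entries against the configured hard limit.
# Stub values taken from the pfctl output quoted in the mail above; replace
# with live pfctl output on a real system (requires root).
sample_info='current entries                        4'     # from: pfctl -si
sample_limits='states        hard limit    10000'          # from: pfctl -sm

current=$(printf '%s\n' "$sample_info"   | awk '{print $3}')
limit=$(printf '%s\n'   "$sample_limits" | awk '{print $4}')

# Integer percentage of the state table in use
pct=$(( current * 100 / limit ))
echo "states: ${current}/${limit} (${pct}% used)"
```

With the numbers from the thread this prints `states: 4/10000 (0% used)`, which is why running out of states looks unlikely here given Dennis's `set limit states 1000000`, but the same check is the quickest way to rule it out.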