From: Matthew Grooms
Subject: Re: pf state disappearing [ adaptive timeout bug ]
To: freebsd-net@freebsd.org
Date: Thu, 21 Jan 2016 13:44:49 -0600

On 1/21/2016 11:04 AM, Nick Rogers wrote:
> On Wed, Jan 20, 2016 at 2:01 PM, Matthew Grooms wrote:
>
>> All,
>>
>> I have a curious problem with a lightly loaded pair of pf firewalls
>> running on FreeBSD 10.2-RELEASE. I'm noticing TCP entries are
>> disappearing from the state table for no good reason that I can see.
>> The entry limit is set to 100000 and I never see the system go over
>> about 70000 entries, so we shouldn't be hitting the configured
>> limit ...
>>
> In my experience, if you hit the state limit, new connections/states
> are dropped and existing states are unaffected.

Aha! You shook something out of the dusty depths of my slow brain :) I
believe that what you say is true as long as adaptive timeouts are
disabled, which by default they are not ...

     Timeout values can be reduced adaptively as the number of state
     table entries grows.

     adaptive.start
             When the number of state entries exceeds this value,
             adaptive scaling begins. All timeout values are scaled
             linearly with factor (adaptive.end - number of states) /
             (adaptive.end - adaptive.start).

     adaptive.end
             When reaching this number of state entries, all timeout
             values become zero, effectively purging all state entries
             immediately. This value is used to define the scale
             factor; it should not actually be reached (set a lower
             state limit, see below).

     Adaptive timeouts are enabled by default, with an adaptive.start
     value equal to 60% of the state limit, and an adaptive.end value
     equal to 120% of the state limit. They can be disabled by setting
     both adaptive.start and adaptive.end to 0.
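In pf.conf terms, I believe the implicit defaults for my 100000 state
limit work out to the equivalent of the following (this is just the man
page math spelled out, not lines copied from my actual config):

     set limit states 100000
     # implicit defaults: 60% and 120% of the state limit
     set timeout { adaptive.start 60000, adaptive.end 120000 }
     # or, to turn adaptive scaling off entirely:
     # set timeout { adaptive.start 0, adaptive.end 0 }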
>> # pfctl -sm
>> states        hard limit   100000
>> src-nodes     hard limit   100000
>> frags         hard limit    50000
>> table-entries hard limit   200000
>>
>> # pfctl -si
>> Status: Enabled for 78 days 14:24:18           Debug: Urgent
>>
>> State Table                          Total             Rate
>>   current entries                    67829
>>   searches                    113412118733        16700.2/s
>>   inserts                        386313496           56.9/s
>>   removals                       386245667           56.9/s
>> Counters
>>   match                          441731678           65.0/s
>>   bad-offset                             0            0.0/s
>>   fragment                            1090            0.0/s
>>   short                                220            0.0/s
>>   normalize                            761            0.0/s
>>   memory                                 0            0.0/s
>>   bad-timestamp                          0            0.0/s
>>   congestion                             0            0.0/s
>>   ip-option                        4366487            0.6/s
>>   proto-cksum                            0            0.0/s
>>   state-mismatch                     50334            0.0/s
>>   state-insert                          10            0.0/s
>>   state-limit                            0            0.0/s
>>   src-limit                              0            0.0/s
>>   synproxy                               0            0.0/s
>>
>> This problem is easy to reproduce by establishing an SSH connection
>> to the firewall itself, letting it sit for a while, and then
>> examining the state table. After a connection is made, I can see the
>> entry with an established:established state ...
>>
>> # pfctl -ss | grep X.X.X.X | grep 63446
>> all tcp Y.Y.Y.Y:22 <- X.X.X.X:63446       ESTABLISHED:ESTABLISHED
>>
>> If I let the SSH session sit for a while and then try to type into
>> the terminal on the client end, the connection stalls and produces a
>> network error message. When I look at the pf state table again, the
>> state entry for the connection is no longer visible. However, the
>> ssh process is still running and I still see the TCP connection
>> established in the output of netstat ...
>>
>> # netstat -na | grep 63446
>> tcp4  0  0  Y.Y.Y.Y.22  X.X.X.X.63446  ESTABLISHED
>>
>> When I observe the packet flow with tcpdump when a connection
>> stalls, packets being sent from the client are visible on the
>> physical interface but are shown as blocked on the pflog0 interface.
>>
> Does this happen with non-SSH connections? It sounds like your SSH
> client/server interaction is not performing a keep-alive frequently
> enough to keep the PF state established. If no packets are sent over
> the connection (state) for some time, then PF will time out (remove)
> the state. At this point your SSH client still believes it has a
> successful connection, so it tries to send packets when you resume
> typing, but they are blocked by your PF rules, which likely specify
> "flags S/SA keep state", either explicitly or implicitly (it is the
> filter rule default), which means block packets that don't match an
> existing state and are not part of the initial SYN handshake of the
> TCP connection.

It happened with UDP SIP and long-running HTTP sessions that sit idle
as well. The SSH connection was just the easiest to test. Besides that,
the default TCP timeout value for established connections is quite high
at 86400s. An established TCP connection should be able to sit for a
full day with no traffic before the related state table entry gets
evicted.

> Look at your settings in pf.conf for "timeout tcp.established", which
> affects how long before an idle ESTABLISHED state will time out. Also
> look into ClientAliveInterval in the sshd configuration, which I
> believe is 0 (disabled) by default, which means it will let the
> client time out without sending a keep-alive. If you don't want PF to
> force a timeout of an idle SSH connection, then ideally
> ClientAliveInterval is less than or equal to (i.e., more frequent
> than) PF's tcp.established timeout value.

Thanks for the suggestion! I completely forgot about the adaptive
timeout options until I double-checked the settings based on your
reply :) My values are set to default for TCP and extended a bit for
UDP.
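(As an aside, for anyone who would rather take the keep-alive approach
that Nick describes, I believe something along these lines in
sshd_config would have the server probe idle clients often enough to
keep the pf state entry fresh. The values are only an illustration, not
something I've tested here:

     # probe the client after 5 minutes of inactivity ...
     ClientAliveInterval 300
     # ... and disconnect it after 3 unanswered probes
     ClientAliveCountMax 3

In my case, though, the timeouts themselves turned out to be the real
problem, as the numbers below show.)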
The adaptive.start value was calculated at 60k for the 100k state
limit. That in particular looked way too relevant to be a coincidence.
After increasing the value to 90k, my total state count started
increasing and leveled out around 75k. It has always hovered around 65k
up until now, so about 10k state entries were being discarded on a
regular basis ...

# pfctl -si
Status: Enabled for 0 days 02:25:41           Debug: Urgent

State Table                          Total             Rate
  current entries                    77759
  searches                       483831701        55352.0/s
  inserts                           825821           94.5/s
  removals                          748060           85.6/s
Counters
  match                           27118754         3102.5/s
  bad-offset                             0            0.0/s
  fragment                               0            0.0/s
  short                                  0            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                           6655            0.8/s
  proto-cksum                            0            0.0/s
  state-mismatch                         0            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s

# pfctl -st
tcp.first                   120s
tcp.opening                  30s
tcp.established           86400s
tcp.closing                 900s
tcp.finwait                  45s
tcp.closed                   90s
tcp.tsdiff                   30s
udp.first                   600s
udp.single                  600s
udp.multiple                900s
icmp.first                   20s
icmp.error                   10s
other.first                  60s
other.single                 30s
other.multiple               60s
frag                         30s
interval                     10s
adaptive.start            90000 states
adaptive.end             120000 states
src.track                     0s

I think there may be a problem with the code that calculates adaptive
timeout values that is making it way too aggressive. If by default it's
supposed to decrease linearly between 60% and 120% of the state table
max, I shouldn't be losing TCP connections that are only idle for a few
minutes when the state table is < 70% full. Unfortunately, that appears
to be the case. At most, with my state table peaking around 70000
entries, the scale factor should have been (120000 - 70000) /
(120000 - 60000), or roughly 0.83, which decreases the 86400s timeout
by about 17% to 72000s for established TCP connections.

I've tested this for a few hours now and all my idle SSH sessions have
been rock solid. If anyone else is scratching their head over a problem
like this, I would suggest disabling the adaptive timeout feature or
increasing it to a much higher value. Maybe one of the pf maintainers
can chime in and shed some light on why this is happening. If not, I'm
going to file a bug report, as this certainly feels like one.

Thanks again,

-Matthew
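P.S. For anyone who wants to apply the same workaround, the change
amounts to one line in pf.conf (these numbers match my 100000 state
limit; scale them to your own):

     # start adaptive scaling at 90% of the state limit instead of
     # the default 60%
     set timeout { adaptive.start 90000, adaptive.end 120000 }

followed by a "pfctl -f /etc/pf.conf" to reload the ruleset.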