Date: Sat, 20 Jun 2015 18:24:32 +0200
From: Milan Obuch <freebsd-pf@dino.sk>
To: Ian FREISLICH <ian.freislich@capeaugusta.com>
Cc: freebsd-pf@freebsd.org
Subject: Re: Large scale NAT with PF - some weird problem
Message-ID: <20150620182432.62797ec5@zeta.dino.sk>
In-Reply-To: <14e119e8fa8.2755.abfb21602af57f30a7457738c46ad3ae@capeaugusta.com>
References: <20150619091857.304b707b@zeta.dino.sk> <14e119e8fa8.2755.abfb21602af57f30a7457738c46ad3ae@capeaugusta.com>
[-- Attachment #1 --]
On Sat, 20 Jun 2015 17:38:05 +0200
Ian FREISLICH <ian.freislich@capeaugusta.com> wrote:

> Hi,
>
> How many NAT states in your table?
>

How can I find out? Are there other statistics collected that I can get
out of pfctl? Full output of pfctl -vs info is attached (hopefully it
will come through unmangled); is there anything interesting?

> I had a router translating a /20 and a /22 to a /24 and doing
> transparent interception of those and a /16 to a proxy pool and I
> never saw this. My state table was about 380000 to 850000 with a
> search rate about quadruple yours.
>

Did you do any pf tuning? What about limits in your case?

> If you can, give 10-STABLE a try. I ran the above router pair as
> 10-CURRENT for a long time. There are some significant performance
> improvements.
>

It is possible, but not easy (read: it needs some planning before the
switch) so as not to interrupt the operation.

> Ian
>

Thanks,
Milan

> On 19 June 2015 09:24:22 Milan Obuch <freebsd-pf@dino.sk> wrote:
>
> > Hi,
> >
> > I am managing FreeBSD 9 based router for a network using PF for
> > NAT. I think I can call it large scale - there is approximately
> > 3000 customers' devices (home routers and similar) with private IPs
> > in segment 172.16.0.0/12 translated to /23 public address block.
> > Basically, in pf.conf, there is
> >
> > nat on $if_ext from $net_int to any -> $pool_ext round-robin
> > sticky-address
> >
> > and handful of
> >
> > binat on $if_ext from 172.16.x.y to any -> a.b.c.d
> >
> > statements. It works, basically, but for some time now there are
> > some intermittent outages. When it occurs, customer's device loses
> > access to internet. I can verify it with simple ping to any address
> > outside of the network.
> >
> > The weird thing is, I can see icmp request packets coming out of
> > external interface, but no icmp echo reply packets coming back.
> > While I can't verify on uplink router that these replies are
> > actually coming in on interface, I am pretty sure they do, but they
> > are not visible in tcpdump's output. (When I am pinging some device
> > outside of the network, which is under my control, I can see there
> > both icmp echo requests and icmp echo replies. Also, if I ping
> > address to which this ping is translated from outside, I see it on
> > external interface coming in.)
> >
> > I think I have a problem with some table being too small, but no
> > idea where it is. It is not state table, I have
> >
> > set limit states 500000
> >
> > in my pf.conf, and pfctl -vs info tells
> >
> > State Table                          Total             Rate
> >   current entries                    36668
> >   searches                      1996138369        29280.5/s
> >   inserts                         15757727          231.1/s
> >   removals                        15770004          231.3/s
> >
> > so I think I have plenty of room here. It was set in past when
> > issue a bit similar occurred and using bigger state table solved it.
> >
> > Also, pfctl -vs state | grep <ip.address.with.problem> shows states
> > for not working ping as
> >
> > all icmp a.b.c.d:538 <- 172.16.x.y:538       0:0
> > all icmp e.f.g.h:40011 (172.16.x.y:538) -> a.b.c.d:40011       0:0
> >
> > where a.b.c.d is address being used as ping target (outside of
> > network), 172.16.x.y is address of device with trouble access to
> > internet, and e.f.g.h is translated address for this device,
> > allocated dynamically.
> >
> > After doing /etc/rc.d/pf restart it works again, so I think, again,
> > issue is with some table being too small. Restart empties it and
> > things begin to work.
> >
> > Does this sound familiar to anybody? I was trying to find some
> > tuning guide for pf and large scale nat, but no success yet. I
> > would be grateful for any help.
> >
> > Regards,
> > Milan
> >
>

[-- Attachment #2 --]
Status: Enabled for 1 days 08:40:03           Debug: Urgent

Hostid:   0xdc96c05a
Checksum: 0xeb207be7b007c11da3f9be3c541d90d0

State Table                          Total             Rate
  current entries                    49518
  searches                      4203116524        35739.9/s
  inserts                         31009556          263.7/s
  removals                        31000311          263.6/s
Source Tracking Table
  current entries                      548
  searches                         8672696           73.7/s
  inserts                            11348            0.1/s
  removals                           10800            0.1/s
Counters
  match                           98746186          839.7/s
  bad-offset                             0            0.0/s
  fragment                             208            0.0/s
  short                                341            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                              0            0.0/s
  proto-cksum                            0            0.0/s
  state-mismatch                     44311            0.4/s
  state-insert                          12            0.0/s
  state-limit                            0            0.0/s
  src-limit                             66            0.0/s
  synproxy                               0            0.0/s
Limit Counters
  max states per rule                    0            0.0/s
  max-src-states                         0            0.0/s
  max-src-nodes                          0            0.0/s
  max-src-conn                         377            0.0/s
  max-src-conn-rate                    276            0.0/s
  overload table insertion             649            0.0/s
  overload flush states                649            0.0/s
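
For anyone who wants to read the attached numbers programmatically
rather than by eye, here is a minimal sketch in Python. It is not part
of pfctl or of the original message. It assumes the usual layout of
pfctl -vs info output, where counters are indented under an unindented
section heading, and it works on a trimmed copy of the attachment above.
The names parse_pf_info, nonzero_counters and state_usage are made up
for this illustration, and the 500000 limit is the "set limit states"
value quoted earlier in the thread, assumed to be still in effect.

```python
import re

# Trimmed copy of the attachment above (assumed indentation as printed by pfctl).
SAMPLE = """\
Status: Enabled for 1 days 08:40:03           Debug: Urgent

State Table                          Total             Rate
  current entries                    49518
  searches                      4203116524        35739.9/s
  inserts                         31009556          263.7/s
  removals                        31000311          263.6/s
Source Tracking Table
  current entries                      548
  searches                         8672696           73.7/s
Limit Counters
  max states per rule                    0            0.0/s
  max-src-states                         0            0.0/s
  max-src-nodes                          0            0.0/s
  max-src-conn                         377            0.0/s
  max-src-conn-rate                    276            0.0/s
  overload table insertion             649            0.0/s
  overload flush states                649            0.0/s
"""

# An indented counter line: name, integer total, optional "N.N/s" rate.
COUNTER_RE = re.compile(r"\s+(.+?)\s+(\d+)(?:\s+[\d.]+/s)?\s*$")


def parse_pf_info(text):
    """Return {section: {counter: total}} from pfctl -vs info style text."""
    sections = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line: either a "Key: value" line or a section heading.
            name = re.sub(r"\s+Total\s+Rate\s*$", "", line).strip()
            if ":" in name:
                current = None  # Status:, Hostid:, Checksum: are skipped
            else:
                current = sections.setdefault(name, {})
            continue
        match = COUNTER_RE.match(line)
        if match and current is not None:
            current[match.group(1)] = int(match.group(2))
    return sections


def nonzero_counters(sections, section):
    """Return the counters of one section whose total is not zero."""
    return {k: v for k, v in sections.get(section, {}).items() if v != 0}


def state_usage(sections, limit):
    """Fraction of the configured state limit currently in use."""
    return sections["State Table"]["current entries"] / limit


if __name__ == "__main__":
    info = parse_pf_info(SAMPLE)
    print("state table usage: {:.1%}".format(state_usage(info, 500000)))
    for name, total in nonzero_counters(info, "Limit Counters").items():
        print("non-zero limit counter: {} = {}".format(name, total))
```

On the attached numbers this puts the state table at roughly 10% of
the 500000 limit, and it lists the four limit counters that are not
zero. Whether those counters have anything to do with the outages is
not established here; the sketch only makes them easier to spot.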
