Date: Fri, 19 Jun 2015 09:18:57 +0200
From: Milan Obuch <freebsd-pf@dino.sk>
To: freebsd-pf@freebsd.org
Subject: Large scale NAT with PF - some weird problem
Message-ID: <20150619091857.304b707b@zeta.dino.sk>

Hi,

I am managing a FreeBSD 9 based router for a network using PF for NAT. I think I can call it large scale: there are approximately 3000 customer devices (home routers and similar) with private IPs in the 172.16.0.0/12 segment, translated to a /23 public address block. Basically, in pf.conf, there is

    nat on $if_ext from $net_int to any -> $pool_ext round-robin sticky-address

and a handful of

    binat on $if_ext from 172.16.x.y to any -> a.b.c.d

statements.

It works, basically, but for some time now there have been intermittent outages. When one occurs, a customer's device loses access to the internet. I can verify it with a simple ping to any address outside of the network. The weird thing is, I can see ICMP echo request packets going out of the external interface, but no ICMP echo reply packets coming back. While I can't verify on the uplink router that these replies are actually coming in on the interface, I am pretty sure they are, but they are not visible in tcpdump's output. (When I ping some device outside of the network which is under my control, I can see both the ICMP echo requests and the echo replies there. Also, if I ping, from outside, the address to which this ping is translated, I see it coming in on the external interface.)

I think I have a problem with some table being too small, but I have no idea where it is. It is not the state table: I have set limit states 500000 in my pf.conf, and pfctl -vs info tells

    State Table                          Total             Rate
      current entries                    36668
      searches                      1996138369        29280.5/s
      inserts                         15757727          231.1/s
      removals                        15770004          231.3/s

so I think I have plenty of room here. The limit was set in the past, when a somewhat similar issue occurred and using a bigger state table solved it.

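To put those two numbers in proportion, here is a minimal sketch (plain Python, not part of pf) that computes what fraction of the configured state limit is in use. The function name table_utilization is made up for illustration; the two values are the ones quoted from pf.conf and pfctl -vs info.

```python
def table_utilization(current_entries: int, limit: int) -> float:
    """Return the fraction of the configured limit currently in use.

    Hypothetical helper for illustration only; pf does not provide it.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return current_entries / limit


# Values quoted in this message: "current entries 36668" from
# pfctl -vs info, and "set limit states 500000" from pf.conf.
usage = table_utilization(36668, 500000)
print(f"state table usage: {usage:.1%}")
```

At roughly 7% of the limit, the state table itself does not look like the one that is running out.
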
Also, pfctl -vs state | grep <ip.address.with.problem> shows the states for the non-working ping as

    all icmp a.b.c.d:538 <- 172.16.x.y:538       0:0
    all icmp e.f.g.h:40011 (172.16.x.y:538) -> a.b.c.d:40011       0:0

where a.b.c.d is the address being used as the ping target (outside of the network), 172.16.x.y is the address of the device that has trouble accessing the internet, and e.f.g.h is the translated address for this device, allocated dynamically.

After doing /etc/rc.d/pf restart it works again, so I think, again, the issue is with some table being too small. The restart empties it and things begin to work.

Does this sound familiar to anybody? I was trying to find a tuning guide for pf and large scale NAT, but no success yet. I would be grateful for any help.

Regards,
Milan