Date:      Thu, 13 Sep 2012 19:46:12 +0200
From:      Luigi Rizzo <rizzo@iet.unipi.it>
To:        Soren Dreijer <dreijer+bsd@echobit.net>
Cc:        freebsd-ipfw@freebsd.org, Ian Smith <smithi@nimnet.asn.au>
Subject:   Re: Significant network latency when using ipfw and in-kernel NAT
Message-ID:  <20120913174612.GB22571@onelab2.iet.unipi.it>
In-Reply-To: <CALoZf3iRzx5V=1th32LE8OCa0_GTBNGSZeGuH9qTp4Fk1j3ZRw@mail.gmail.com>
References:  <CALoZf3hfZDQQ4ZEXMrGUkYiGvb5QPoAcbpUikAq1adqVY4fLyg@mail.gmail.com> <20120913221758.E51539@sola.nimnet.asn.au> <CALoZf3iCf1_fHgAWUXa3fgudOe66sbk35P0CYhgsneBuhCORJg@mail.gmail.com> <20120913163013.GA22049@onelab2.iet.unipi.it> <CALoZf3iRzx5V=1th32LE8OCa0_GTBNGSZeGuH9qTp4Fk1j3ZRw@mail.gmail.com>

On Thu, Sep 13, 2012 at 12:01:56PM -0500, Soren Dreijer wrote:
> Luigi and Ian,
> 
> As Ian mentioned, we had some off-list discussion by accident and he
> suggested the TSO approach too (although I don't know how that would
> affect e.g. ICMP traffic). It seems to have been a known issue for a
> while (http://lists.freebsd.org/pipermail/freebsd-net/2010-July/025743.html).
> Does anybody know if this is still the case in 9.0-RELEASE?
> 
> I've already done "ifconfig ix1 -tso" to disable TSO on the public
> nic, but there was no difference. I'm not sure what VLAN_HWTSO means,
> though. Is the nic doing TSO on its own? Do I need to turn that off as
> well? Also, do I need to turn off TSO on ix0, which is what the ip
> tunnel runs over?

i'd start by disabling all accelerations (and jumbograms)
and then work from the results to figure out where the problem is.
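
A minimal sketch of that "disable everything" step, as a dry run that only
prints the commands. Assumptions: the flag names are the negated capability
keywords from FreeBSD ifconfig(8), "-lro" is used on the guess that RSC
corresponds to what FreeBSD calls LRO, and ix0/ix1 are the interfaces from
the ifconfig output quoted below. The helper name offload_off_cmds is made up
for illustration, and a given driver may not support every flag.

```shell
# Print (do not run) the ifconfig commands that would switch off the
# offloads discussed in this thread on each named interface.
# Assumption: flag names as documented in FreeBSD ifconfig(8).
offload_off_cmds() {
    for ifn in "$@"; do
        # TSO, LRO (assumed equivalent of RSC), checksum offload,
        # and the VLAN variants of TSO and checksum offload.
        echo "ifconfig $ifn -tso -lro -rxcsum -txcsum -vlanhwtso -vlanhwcsum"
        # Keep the MTU at the standard 1500 bytes (no jumbo frames).
        echo "ifconfig $ifn mtu 1500"
    done
}

offload_off_cmds ix0 ix1
```

Once the printed commands look right they can be run by hand as root, one
interface at a time, re-testing the ping after each step so the result points
at a specific acceleration.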

When the nat code was written it assumed well-formed
1500-byte packets, and it uses the checksums when rebuilding the
headers. TSO/RSC can generate large segments, causing buffer overflows,
whereas the *XCSUM offloads can generate invalid packets that are
sometimes recovered by retransmissions.
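
To see which of these accelerations are still switched on after a change
such as "ifconfig ix1 -tso", one option is to filter the options= line that
ifconfig prints. This is only a sketch: the helper name offloads_enabled is
made up, the sample line is the ix1 line from the output quoted below, and
TSO6 and LRO are added to the list on the assumption that they may show up
on other setups.

```shell
# List the offload-related capabilities still present in an ifconfig
# "options=" line. Takes the line as its only argument.
offloads_enabled() {
    printf '%s\n' "$1" |
        # Keep only the comma-separated names between < and >.
        sed -n 's/.*options=[0-9a-f]*<\(.*\)>.*/\1/p' |
        tr ',' '\n' |
        # Names seen in this thread, plus TSO6 and LRO (assumed).
        grep -E '^(RXCSUM|TXCSUM|TSO4|TSO6|LRO|VLAN_HWCSUM|VLAN_HWTSO)$'
}

# The ix1 options line as posted, after "ifconfig ix1 -tso".
ix1_opts='options=400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO>'
offloads_enabled "$ix1_opts"
```

For the ix1 line this still lists RXCSUM, TXCSUM, VLAN_HWCSUM and VLAN_HWTSO,
so only plain TSO has been removed there so far.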

cheers
luigi

> Thanks,
> Soren
> 
> On Thu, Sep 13, 2012 at 11:30 AM, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
> >
> > [top posting for readability]
> > i have seen this kind of issue related to bad interaction
> > between the nat code and the various accelerations
> > (mostly TSO/RSC, but i would also try to disable the
> > checksums).
> > Try to remove tso,csum, possibly rsc if you have it, and see
> > if the problem continues. Please post the result so people
> > reading this thread in the future can tell whether my suggestion
> > was useful or not.
> >
> > cheers
> > luigi
> >
> >
> > On Thu, Sep 13, 2012 at 10:48:01AM -0500, Soren Dreijer wrote:
> >> Definitely. Since this is a server in production, I've obfuscated some
> >> of the IPs, etc.
> >>
> >> First off, here's the ifconfig. Our setup consists of a private (ix0)
> >> and a public nic (ix1) and an ip tunnel (gif0), which is what we use
> >> in ipfw to forward incoming packets to our internal boxes:
> >>
> >> ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >>         options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
> >>         ether XX:XX:XX:XX:XX:XX
> >>         inet <private VLAN IP> netmask 0xffffffc0 broadcast xx
> >>         inet6 xxxx::xxx:xxxx:xxxx:xxxx%ix0 prefixlen 64 scopeid 0x7
> >>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> >>         media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
> >>         status: active
> >> ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >>         options=400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO>
> >>         ether XX:XX:XX:XX:XX:XX
> >>         inet <public IP> netmask 0xfffffff8 broadcast xx
> >>         inet6 xxxx::xxx:xxxx:xxxx:xxxx%ix1 prefixlen 64 scopeid 0x8
> >>         inet <alias public IP> netmask 0xffffffff broadcast xx
> >>         inet <alias public IP> netmask 0xffffffff broadcast xx
> >>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> >>         media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
> >>         status: active
> >> ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536
> >>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> >> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
> >>         options=3<RXCSUM,TXCSUM>
> >>         inet6 ::1 prefixlen 128
> >>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa
> >>         inet 127.0.0.1 netmask 0xff000000
> >>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> >> gif0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
> >>         tunnel inet <private VLAN IP> --> <private VLAN IP>
> >>         inet 172.16.1.1 --> 172.16.1.2 netmask 0xffff0000
> >>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> >>         options=1<ACCEPT_REV_ETHIP_VER>
> >>
> >> The basic ruleset looks like this. One-pass is off so that packets are
> >> reinjected after going through NAT'ing and pipes:
> >>
> >> 00001  16653   4417407 allow ip from any to any via ix0
> >> 00003  14588   2860344 allow ip from any to any via gif1
> >> 00006      0         0 allow ip from any to any via lo0
> >> 00010      0         0 deny ip from 192.168.0.0/16 to any in via ix1
> >> 00011      0         0 deny ip from 172.16.0.0/12 to any in via ix1
> >> 00012      0         0 deny ip from 10.0.0.0/8 to any in via ix1
> >> 00013      0         0 deny ip from 127.0.0.0/8 to any in via ix1
> >> 00014      0         0 deny ip from 0.0.0.0/8 to any in via ix1
> >> 00015      0         0 deny ip from 169.254.0.0/16 to any in via ix1
> >> 00016      0         0 deny ip from 192.0.2.0/24 to any in via ix1
> >> 00017      0         0 deny ip from 204.152.64.0/23 to any in via ix1
> >> 00018      0         0 deny ip from 224.0.0.0/3 to any in via ix1
> >> 00019     15      1020 allow icmp from any to any via ix1   # For testing purposes, allow all ICMP in and out of the public adapter
> >> 00020   7537    647951 nat 1 ip from any to any in via ix1   # NAT all incoming traffic
> >> 00030      0         0 check-state # For some reason, this never gets matched even though rule #100 is matched
> >> 00100    161    124340 skipto 805 tcp from any to any out via ix1 setup keep-state   # For testing purposes, allow all TCP originating from the box out of the public adapter
> >> 00110      0         0 skipto 805 icmp from any to any out via ix1 keep-state
> >> 00200  36557   1996626 skipto 500 tcp from any to 172.16.1.2 dst-port 443 in via ix1   # Forward NAT'ed traffic for port 443 over the ip tunnel
> >> 00201  46593  63973143 skipto 805 tcp from 172.16.1.2 443 to any out via ix1
> >> 00400      8      6192 deny ip from any to any via ix1
> >> 00500      0         0 pipe 1 ip from any to any in via ix1   # Packet shaping
> >> 00501      0         0 allow ip from any to any in via ix1
> >> 00805   8963   3412995 nat 1 ip from any to any out via ix1
> >> 00806   8963   3412995 allow ip from any to any
> >> 10000      0         0 deny ip from any to any via ix1   # Last ditch catch
> >> 65535 864357 867120912 allow ip from any to any
> >>
> >> 'ipfw nat show config' yields:
> >>
> >> ipfw nat 1 config if ix1 log reset redirect_port tcp 172.16.1.2:443 <public IP>:443
> >>
> >> And finally, here are the horrifying ping times (furthermore, all
> >> outgoing TCP traffic originating from this box, such as wget or
> >> pkg_add, time out. I've managed to get an outgoing telnet working, but
> >> it's horribly slow and takes a while to establish):
> >>
> >> PING google.com (74.125.227.14): 56 data bytes
> >> 64 bytes from 74.125.227.14: icmp_seq=0 ttl=56 time=2746.953 ms
> >> 64 bytes from 74.125.227.14: icmp_seq=1 ttl=56 time=2097.460 ms
> >> 64 bytes from 74.125.227.14: icmp_seq=2 ttl=56 time=2186.068 ms
> >> 64 bytes from 74.125.227.14: icmp_seq=3 ttl=56 time=4292.776 ms
> >> 64 bytes from 74.125.227.14: icmp_seq=4 ttl=56 time=5056.965 ms
> >> 64 bytes from 74.125.227.14: icmp_seq=5 ttl=56 time=5323.720 ms
> >> 64 bytes from 74.125.227.14: icmp_seq=6 ttl=56 time=5007.974 ms
> >> 64 bytes from 74.125.227.14: icmp_seq=7 ttl=56 time=4756.587 ms
> >>
> >> It's worth mentioning that when I switch back to using natd and divert
> >> in the ruleset (which really only changes the nat portions and
> >> everything else stays the same), the ping time drops to ~300ms, which
> >> is a big difference for simply "using" natd even when the ICMP packets
> >> aren't supposed to be going through NAT'ing whatsoever. The ~300ms
> >> ping time is still way too high, though, since our other boxes have a
> >> ping time to Google of ~0.300ms...
> >>
> >> Any ideas?
> >>
> >> On Thu, Sep 13, 2012 at 7:41 AM, Ian Smith <smithi@nimnet.asn.au> wrote:
> >> > On Wed, 12 Sep 2012 23:09:27 -0500, Soren Dreijer wrote:
> >> >  > Hi there,
> >> >  >
> >> >  > We're running freebsd 9.0-RELEASE on a box whose primary purpose is to
> >> >  > act as a firewall and a gateway. Up until today, we've been using ipfw
> >> >  > in conjunction with natd and the divert action in ipfw to forward
> >> >  > packets between the freebsd box (i.e. the public Internet) and our
> >> >  > private servers.
> >> >  >
> >> >  > Unfortunately, natd appears to be quite the CPU hog and we therefore
> >> >  > decided to switch to the in-kernel NAT support in ipfw. The issue
> >> >  > we're running into is that the network latency appears to be
> >> >  > skyrocketing when ipfw contains nat rules. Basically all TCP traffic
> >> >  > originating from the box times out and pinging google.com on the box
> >> >  > gives an average of ~10 SECONDS -- and that's even if I explicitly
> >> >  > allow all ICMP traffic before the packets even get to the nat rules in
> >> >  > ipfw.
> >> >  >
> >> >  > The really odd part, however, is that I can ping the freebsd box just
> >> >  > fine externally. For instance, pinging the server from my home
> >> >  > connection gives an average of 45 ms. I'm also able to communicate
> >> >  > just fine with the internal servers through the freebsd box.
> >> >  >
> >> >  > Does anybody have any idea what's going on? I assume I must've
> >> >  > misconfigured something big here...
> >> >
> >> > Or maybe only something small .. but without seeing your basic ruleset
> >> > and network config - obscured as need be - we can only guess.  Maybe an
> >> > 'ifconfig', 'ipfw show' and 'ipfw nat show config' would illustrate?
> >> >
> >> > cheers, Ian