Date: Wed, 13 Feb 2013 23:54:24 -0800
From: Doug Hardie <bc979@lafn.org>
To: pyunyh@gmail.com
Cc: Jeremy Chadwick <jdc@koitsu.org>, freebsd-stable@freebsd.org, Eugene Grosbein <egrosbein@rdtc.ru>, yongari@freebsd.org
Subject: Re: Unusual TCP/IP Packet Size
Message-ID: <3BB4EC29-0FD5-4F5D-9189-51770E2B55D5@lafn.org>
In-Reply-To: <20130214064521.GA1464@michelle.cdnetworks.com>
References: <96AE8BD1-79C2-4743-854F-B8386C54E4A1@lafn.org> <511B6B21.5030606@rdtc.ru> <20130213130059.GA57337@icarus.home.lan> <20130214013723.GB2945@michelle.cdnetworks.com> <CAN6yY1v9oc7BEQXDkAwSCxi65ibuApP6geXA1hi0fzQZRXVjxQ@mail.gmail.com> <20130214064521.GA1464@michelle.cdnetworks.com>

On 13 February 2013, at 22:45, YongHyeon PYUN <pyunyh@gmail.com> wrote:

> On Wed, Feb 13, 2013 at 09:10:36PM -0800, Kevin Oberman wrote:
>> On Wed, Feb 13, 2013 at 5:37 PM, YongHyeon PYUN <pyunyh@gmail.com> wrote:
>>> On Wed, Feb 13, 2013 at 05:00:59AM -0800, Jeremy Chadwick wrote:
>>>> On Wed, Feb 13, 2013 at 05:29:53PM +0700, Eugene Grosbein wrote:
>>>>> 13.02.2013 17:25, Doug Hardie wrote:
>>>>>> Monitoring a tcpdump between two systems, a FreeBSD 9.1 system has the following interface:
>>>>>>
>>>>>> msk0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>>>>     options=c011b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,VLAN_HWTSO,LINKSTATE>
>>>>>>     ether 00:11:2f:2a:c7:03
>>>>>>     inet 10.0.1.199 netmask 0xffffff00 broadcast 10.0.1.255
>>>>>>     inet6 fe80::211:2fff:fe2a:c703%msk0 prefixlen 64 scopeid 0x1
>>>>>>     nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>>>>>     media: Ethernet autoselect (100baseTX <full-duplex,flowcontrol,rxpause,txpause>)
>>>>>>     status: active
>>>>>>
>>>>>>
>>>>>> It sent the following packet: (data content abbreviated)
>>>>>>
>>>>>> 02:14:42.081617 IP 10.0.1.199.443 > 10.0.1.2.61258: Flags [P.], seq 930:4876, ack 846, win 1040, options [nop,nop,TS val 401838072 ecr 920110183], length 3946
>>>>>>     0x0000:  4500 0f9e ea89 4000 4006 2a08 0a00 01c7  E.....@.@.*.....
>>>>>>     0x0010:  0a00 0102 01bb ef4a ece1 680b ae37 1bbc  .......J..h..7..
>>>>>>     0x0020:  8018 0410 3407 0000 0101 080a 17f3 8ff8  ....4...??????.
>>>>>>
>>>>>>
>>>>>> The indicated packet length is 3946 and the load of data shown is
>>>>>> that size. The MTU on both interfaces is 1500. The receiving system
>>>>>> received 3 packets. There is a router and switch between them. One of
>>>>>> them fragmented that packet. This is part of a SSL/TLS exchange and one
>>>>>> side or the other is hanging on this and just dropping the connection.
>>>>>> I suspect the packet size is the issue. ssldump complains about the
>>>>>> packet too and stops monitoring. Could this possibly be related to the
>>>>>> hardware checksums?
>>>>>
>>>>> You have TSO enabled on the interface, so large outgoing TCP packet is pretty normal.
>>>>> It will be split by the NIC. Disable TSO with ifconfig if it interferes with your ssldump.
>>>>
>>>> This is not the behaviour I see with em(4) on a 82573E with all defaults
>>>> used (which includes TSO4). Note that Doug is using msk(4).
>>>>
>>>> I can provide packet captures on both ends of a LAN segment using both
>>>> tcpdump (on the FreeBSD side) and Wireshark (on the Windows side) that
>>>> show a difference in behaviour compared to what Doug sees.
>>>
>>> This is strange. tcpdump sees a (big) TCP segment right before
>>> controller actually transmits it. So if TSO is active for the TCP
>>> segment, you should see a series of small TCP packets on receiver
>>> side(i.e. 3 TCP packets in Doug's case). If you don't see a big TCP
>>> segment with tcpdump on TX path, probably TSO was not used for the
>>> TCP segment.
>>> It's possible for controller to corrupt the TCP segment during
>>> segmentation but Doug's tcpdump looks completely normal to me since
>>> tcpdump sees the segment before TCP segmentation.
>>>
>>>>
>>>> What I see on the FreeBSD side with tcpdump is repeated "bad-len 0"
>>>> messages for payloads which are chunked or segmented as a result of TSO.
>>>> I do not see a 1:1 ratio of "bad-len" entries to chunked payloads; I
>>>> only see one "bad-len" entry for all chunks (up until the next ACK or
>>>> PSH+ACK of course).
>>>>
>>>
>>> I vaguely recall that some users reported similar TSO issues on
>>> various drivers. The root cause of the issue was not identified
>>> though. Personally I couldn't reproduce the issue at that time.
>>> It could be a driver or network stack bug.
>>
>> Beware TSO. It can significantly improve throughput on high speed
>> networks, but it really has issues.
>>
>> TSO segments the data and transmits all of them back-to-back with no
>> delay beyond IFG (the 802.3 mandated space between frames) TSO does
>> not understand congestion control. If there is congestion and TSO
>> sends several frames in a row, it is entirely possible that a queue is
>> full or getting close enough to full to start dropping packets and
>> these segmented frames are excellent candidates.
>
> I'm not saying the drawback of TSO. Sometimes segmented packets
> have malformed IP header length under certain circumstances such
> that these packets were dropped on receiver side.

How do I configure the msk0 interface in rc.conf to disable tso4? I can
easily do it with ifconfig, but don't see how to make sure its disabled
after a boot.
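
For reference, the three packets seen on the receiving side are consistent with simple TSO arithmetic on the capture quoted above. The following is only a rough sketch, assuming a 20-byte IPv4 header without options and a 32-byte TCP header (the 20-byte base header plus the 12 bytes of nop,nop,timestamp options shown by tcpdump). The function name and constants are illustrative; the real split happens in the msk(4) hardware, which this does not model.

```python
# Sketch: how a 3946-byte TSO payload would be cut to fit a 1500-byte MTU.
# Assumptions (not stated by the driver): IPv4 header without options,
# TCP header carrying only the nop,nop,timestamp options from the capture.

MTU = 1500
IP_HEADER = 20    # IPv4 header, no options (assumed)
TCP_HEADER = 32   # 20-byte base header + 12 bytes of options
PAYLOAD = 3946    # "length 3946" in the tcpdump line


def split_payload(payload, mtu=MTU, ip_header=IP_HEADER, tcp_header=TCP_HEADER):
    """Return the per-packet payload sizes after segmentation."""
    mss = mtu - ip_header - tcp_header  # payload bytes that fit in one frame
    sizes = []
    remaining = payload
    while remaining > 0:
        chunk = min(mss, remaining)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


segments = split_payload(PAYLOAD)

# IP total length of the unsegmented packet as tcpdump saw it before the NIC.
total_length = IP_HEADER + TCP_HEADER + PAYLOAD

print(segments)           # [1448, 1448, 1050]
print(hex(total_length))  # 0xf9e
```

The computed total length, 0xf9e, matches the `0f9e` in the second 16-bit word of the hex dump, and the payload works out to three segments, the same count the receiver reported.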