Date: Mon, 22 Jun 2015 14:00:55 +0200
From: Andrej Sossi <asossi@dotcom.ts.it>
To: <freebsd-net@freebsd.org>
Subject: Strange problem with TCP checksum
Message-ID: <5587F8F7.2000700@dotcom.ts.it>
Hello,

I have a weird network problem which I believe may be caused by the FreeBSD igb driver or perhaps even the network adapter. Let me try to explain the scenario in brief.

I have a FreeBSD 10.0-RELEASE-p10 server with a public IP address, on which N virtual machines are set up as jails; the machines hold private IP addresses on the loopback1 adapter. The VMs access the internet through NAT on the public IP via ipfw:

    nat 1 config ip X.Y.Z.W if igb0 unreg_only same_ports
    add 60000 nat 1 ip from 192.168.250.0/24 to any out xmit igb0 keep-state
    add 60001 nat 1 ip from any to X.Y.Z.W in recv igb0

In addition, port forwarding is configured on the real machine towards the VMs in order to support public services (Apache httpd, database, etc.).

The network adapter is:

    igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>
            ether 00:45:80:dd:32:30
            inet X.Y.Z.W netmask 0xffffff00 broadcast X.Y.Z.W
            inet6 XX::YY:ZZ:WWW:VVV%igb0 prefixlen 64 scopeid 0x1
            inet6 XX:YY:ZZ:WWW::1 prefixlen 64
            nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active

The loopback1 adapter, to which the VMs' IPs are assigned, also has an MTU of 1500.

So far so good, in the sense that almost everything works as expected. Occasionally, however, requests originated by the VMs towards internet servers end in a timeout (http, sftp, etc.). The very same requests, if executed by the real machine, complete correctly with a response.

After countless experiments I have managed to reproduce the problem deterministically. Running tcpdump on the request's recipient, I have noticed that all TCP packets with a payload between 101 and 106 (inclusive) bytes in size arrive with a wrong TCP checksum and as such are rejected.
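For readers who want to check captured segments offline, the receiver-side test that rejects such packets can be sketched as below. This is a minimal Python sketch of the standard Internet checksum (RFC 1071) computed over the IPv4 pseudo-header plus the TCP segment. The function names, the 192.0.2.1 and 198.51.100.7 addresses, and the header field values are illustrative assumptions and do not come from the capture described here.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071: one's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, tcp_length: int) -> bytes:
    """IPv4 pseudo-header that is prepended for the TCP checksum."""
    return (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, tcp_length)
    )


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Check a TCP segment (header + payload) the way a receiver would."""
    # With a valid checksum field in place, the complemented sum is zero.
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return internet_checksum(data) == 0


def build_segment(src_ip: str, dst_ip: str, payload: bytes) -> bytes:
    """Build a 20-byte TCP header plus payload carrying a valid checksum."""
    # Hypothetical ports, sequence numbers, flags and window (illustration only).
    header = struct.pack(
        "!HHIIBBHHH", 40000, 80, 1, 1, 5 << 4, 0x18, 65535, 0, 0
    )
    segment = header + payload
    csum = internet_checksum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    # The checksum field occupies bytes 16-17 of the TCP header.
    return header[:16] + struct.pack("!H", csum) + header[18:] + payload
```

As a usage example, calling `tcp_checksum_ok("192.0.2.1", "198.51.100.7", build_segment("192.0.2.1", "198.51.100.7", bytes(n)))` for payload sizes `n` around the affected range exercises the same check in software. With real traffic, the addresses and the raw TCP segment would instead be taken from the capture made on the recipient.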
Subsequent retransmissions of the same packet continue to bear a wrong checksum, and this continues until the connection timeout is reached. The IP checksum, instead, is always correct. Packets smaller than 101 bytes are transmitted and received with the correct checksum, as are packets with a payload in excess of 116 bytes in size.

If TXCSUM is disabled, the problem disappears.

I see the same problem on a second server with the same configuration and hardware but with FreeBSD 10.0-RELEASE-p1.

I believe the above behavior is caused by a bug in the driver, since on a third machine with an identical configuration (jails behind NAT) but with an em driver, the checksum problem does not appear.

-- 
Best regards
Sossi Andrej
-------------------------
DOTCOM Information technology
Via Machiavelli, 28
34132 - Trieste (TS) Italy
tel: +39 040 9828090
fax: +39 040 0641954
E-mail: asossi@dotcom.ts.it
----------------------------
Pursuant to D.lgs n. 196 of 30.06.03 (Privacy Code), this message may contain confidential and/or privileged information. If you are not the addressee or authorized to receive this for the addressee, you must not use, copy, disclose or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by reply e-mail and delete this message. Thank you for your cooperation.