Date: Mon, 9 May 2016 11:49:54 -0700
Subject: Re: TCP problems
From: Dieter BSD
To: freebsd-hackers@freebsd.org

Larry suggests:

> Have you tried bumping the MTU on the interfaces to JUMBO frames?
> 9000 or whatever max is?

Easy enough to try, but 2 of the 4 max out at 1500. I suppose I could
rewire the networks to get the 2 that allow 9000 on the same wire.
But packet size seems unlikely to have anything to do with bind failing.
And MTU=1500 is really supposed to work.

I set ue0 to *smaller* mtu (500, 250, 100) and still got data corruption,
along with

rcp: lost connection
Data connection: Operation timed out. (ftp)

More on ue0's MTU below.

Mark suggests:

> Sounds like you may have fried the nic on the G box

Which is why I tried both networks. Seems unlikely that both re0 *and* ue0
would fail with the same symptoms. Seems unlikely that bad nics would
have anything to do with bind failing?

Thank you both for your suggestions.

-------------------

re0: New problem.
One network got some strange ping times for awhile:

64 bytes from machine_G_re0: icmp_seq=2 ttl=64 time=0.355 ms
64 bytes from machine_G_re0: icmp_seq=3 ttl=64 time=2001.209 ms
64 bytes from machine_G_re0: icmp_seq=4 ttl=64 time=2001.219 ms
64 bytes from machine_G_re0: icmp_seq=5 ttl=64 time=1000.728 ms
64 bytes from machine_G_re0: icmp_seq=6 ttl=64 time=0.229 ms
64 bytes from machine_G_re0: icmp_seq=7 ttl=64 time=2001.091 ms
64 bytes from machine_G_re0: icmp_seq=8 ttl=64 time=2001.129 ms
64 bytes from machine_G_re0: icmp_seq=9 ttl=64 time=1000.643 ms
64 bytes from machine_G_re0: icmp_seq=10 ttl=64 time=0.149 ms
64 bytes from machine_G_re0: icmp_seq=11 ttl=64 time=2001.207 ms
64 bytes from machine_G_re0: icmp_seq=12 ttl=64 time=2001.211 ms
64 bytes from machine_G_re0: icmp_seq=13 ttl=64 time=1000.726 ms

64 bytes from machine_T_bge0: icmp_seq=0 ttl=64 time=423.415 ms
64 bytes from machine_T_bge0: icmp_seq=1 ttl=64 time=14491.793 ms
64 bytes from machine_T_bge0: icmp_seq=2 ttl=64 time=13490.387 ms
64 bytes from machine_T_bge0: icmp_seq=3 ttl=64 time=12489.373 ms
64 bytes from machine_T_bge0: icmp_seq=4 ttl=64 time=11488.635 ms
64 bytes from machine_T_bge0: icmp_seq=5 ttl=64 time=10487.481 ms
64 bytes from machine_T_bge0: icmp_seq=6 ttl=64 time=9486.493 ms
64 bytes from machine_T_bge0: icmp_seq=7 ttl=64 time=8485.567 ms

Powered machine G down overnight and re0 mostly recovered.

Still have the bind problem. Does bind have anything to do with
the Ethernet hardware or device drivers? I'm guessing no.

No clue as to why re0 was causing data corruption, or why the data
corruption went away (that problem went away before the power down
so it isn't that). Also no clue about what caused the long ping times,
which went away after the power down.

-------------------

ue0:

Noticed that netstat was reporting input errors for ue0. And ue0 input
was where the data corruption was happening. Sent data from machine A
with 10Mbps Ethernet.
Netstat did not report any input errors on ue0 and there was
no data corruption. So ue0 can handle gigabit data rate, but gets input
errors if packets arrive too frequently.

# ifconfig ue0 media 100baseTX-FDX

fixed the input error problem and the data corruption problem, at the
expense of making it even slower.

Max data rate seen (before lowering to 100Mbps) was about 35 MB/s
which is said to be the effective rate of USB2. usbconfig says:

ugen0.3: at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=ON (124mA)

so I guess it really is running at USB3 speed.

The chip performs a lot better for tweaktown:
http://www.tweaktown.com/reviews/7243/vantec-cb-u300gna-usb-3-gigabit-network-adapter-review/index.html
(Vantec CB-U300GNA with the same Asix AX88179 chip)
"full duplex gigabit with 952 Mbps consistently across the chart"

Asix AX88179 chip:
http://www.asix.com.tw/products.php?op=pItemdetail&PItemID=131;71;112
"Supports Jumbo frame up to 4KB"

But ifconfig rejects any value > 1500:

# ifconfig ue0 mtu 4000
ifconfig: ioctl (set mtu): Invalid argument
# ifconfig ue0 mtu 1501
ifconfig: ioctl (set mtu): Invalid argument

A quick look at the driver code didn't find an MTU limit. (But did
in other Ethernet drivers.) Looks to me like axge(4) doesn't support
a large MTU.

IIRC, one should set ifconfig -rxcsum -txcsum to get maximum data integrity
(at the expense of using more cpu). If the cpu were doing the checksums
it should catch the data corruption I'm getting (letting TCP retransmit),
since the corruption appears to be happening inside the Asix AX88179 chip.
But:

# ifconfig ue0 -rxcsum
results in no Ethernet traffic

# ifconfig ue0 -txcsum
seems to work ok. (including no data corruption)

Why am I not getting any Ethernet traffic with -rxcsum? I can see that
some controllers might not have the hardware to support rxcsum, but it
seems to me that -rxcsum and -txcsum should always work?

# ifconfig re0 -rxcsum -txcsum
seems to work ok.
(including no data corruption)

Is Asix AX88179 still the only USB to gigabit Ethernet chip?
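
Side note on the machine_T_bge0 ping times in the re0 section: the round-trip times for icmp_seq 1..7 drop by about one second per probe. If ping was sending at its default one-second interval (an assumption, the interval is not stated in this mail), that is what one would expect if the replies were held somewhere and then delivered together, rather than each packet being delayed on its own. The rough sketch below just does that arithmetic; `arrival_times` and `spread` are made-up helper names, and the result says nothing about where the packets were held.

```python
# Reply arrival time = send time + RTT.
# ASSUMPTION: ping sent one probe per second (its default interval);
# the mail does not say which interval was used.
INTERVAL_MS = 1000.0

# (icmp_seq, rtt_ms) copied from the machine_T_bge0 ping output in this mail.
machine_t = [
    (0, 423.415), (1, 14491.793), (2, 13490.387), (3, 12489.373),
    (4, 11488.635), (5, 10487.481), (6, 9486.493), (7, 8485.567),
]


def arrival_times(samples, interval_ms=INTERVAL_MS):
    """Return (icmp_seq, arrival_ms) pairs, measured from the first send."""
    return [(seq, seq * interval_ms + rtt) for seq, rtt in samples]


def spread(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)


arrivals = arrival_times(machine_t)
# Probes 1..7 are the ones with the multi-second round-trip times.
burst = [t for seq, t in arrivals if seq >= 1]

for seq, t in arrivals:
    print(f"icmp_seq={seq} reply arrived at ~{t:.1f} ms")
print(f"spread of arrivals for icmp_seq 1..7: {spread(burst):.1f} ms")
```

Under that assumption the seven replies all land within a few milliseconds of each other, roughly 15.5 seconds after the first probe was sent.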