Date: Mon, 10 Oct 2011 16:02:07 -0700
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Larry Rosenman <ler@lerctr.org>
Cc: freebsd-stable@freebsd.org, "Vogel, Jack" <jack.vogel@intel.com>, John Baldwin <jhb@freebsd.org>
Subject: Re: rsync corrupted MAC
Message-ID: <20111010230207.GA85243@icarus.home.lan>
In-Reply-To: <4E93606D.8070306@lerctr.org>
References: <alpine.BSF.2.00.1110091604450.94525@lrosenman.dyndns.org> <201110101147.30558.jhb@freebsd.org> <4E933BBF.6070209@lerctr.org> <36C97D31-5D01-4AC2-8E48-9A8B04B98F91@transsys.com> <4E93606D.8070306@lerctr.org>
On Mon, Oct 10, 2011 at 04:15:25PM -0500, Larry Rosenman wrote:
> On 10/10/2011 3:57 PM, Louis Mamakos wrote:
> >On Oct 10, 2011, at 2:38 PM, Larry Rosenman wrote:
> >
> >>On 10/10/2011 10:47 AM, John Baldwin wrote:
> >>>On Sunday, October 09, 2011 5:06:26 pm Larry Rosenman wrote:
> >>>>Any ideas on which side or what might be broke here?
> >>>>
> >>>>ler/MAIL-ARCHIVE/2008/12/INBOX
> >>>>Corrupted MAC on input.
> >>>>Disconnecting: Packet corrupt
> >>>>rsync: connection unexpectedly closed (33845045 bytes received so far) [receiver]
> >>>>rsync error: error in rsync protocol data stream (code 12) at io.c(605) [receiver=3.0.9]
> >>>>rsync: connection unexpectedly closed (1450 bytes received so far) [generator]
> >>>>rsync error: unexplained error (code 255) at io.c(605) [generator=3.0.9]
> >>>I've had somewhat similar issues (ssh getting corruption in its data stream)
> >>>when a NIC in my netbook was corrupting packet data when it ran at 1G (it
> >>>worked fine at 10/100). Pyun eventually fixed the issue by applying enough
> >>>workarounds (it was likely a hardware bug in the NIC's chipset). However, it
> >>>wasn't easy to debug unfortunately. :(
> >>>
> >>Any ideas on where to start?
> >>
> >>from the 8.2 box (tbh.lerctr.org in the script):
> >>
> >>8.2->PIX->Provider->Internet->Motorola SBG6580 (Time-Warner)->Trendnet TEG-160WS Gig switch->9.0 box (borg.lerctr.org).
> >>
> >>So, where do I start?
> >I'd turn off IP / TCP / UDP checksum offloading on your NIC if it supports it, and see if you are getting network layer checksum errors. If the IP checksum is wrong, then it happened on the last hops between the NIC and memory or across the previous network hop.
> >
> >
> >
> Good idea, but, it didn't show ANY errors on EITHER side (both are
> em nics).
>
> Next?
> $ ifconfig em0
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
>         ether 00:30:48:2e:99:ba
>         inet 192.147.25.65 netmask 0xffffff00 broadcast 192.147.25.255
>         inet6 fe80::230:48ff:fe2e:99ba%em0 prefixlen 64 scopeid 0x1
>         inet 192.147.25.45 netmask 0xffffff00 broadcast 192.147.25.255
>         inet 192.147.25.11 netmask 0xffffff00 broadcast 192.147.25.255
>         nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
>         media: Ethernet autoselect (100baseTX <full-duplex>)
>         status: active
> $
> $ uname -a
> FreeBSD thebighonker.lerctr.org 8.2-STABLE FreeBSD 8.2-STABLE #45:
> Sat Oct 8 10:57:43 CDT 2011
> root@thebighonker.lerctr.org:/usr/obj/usr/src/sys/THEBIGHONKER
> amd64
> $
>
>
>
> $ ifconfig em0
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=2088<VLAN_MTU,VLAN_HWCSUM,WOL_MAGIC>
>         ether 00:30:48:8e:9f:f3
>         inet 192.168.200.4 netmask 0xffffff00 broadcast 192.168.200.255
>         inet6 fe80::230:48ff:fe8e:9ff3%em0 prefixlen 64 scopeid 0x1
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (1000baseT <full-duplex>)
>         status: active
> $ uname -a
> FreeBSD borg.lerctr.org 9.0-BETA3 FreeBSD 9.0-BETA3 #1: Sun Oct 9
> 10:03:42 CDT 2011
> root@borg.lerctr.org:/usr/obj/usr/src/sys/BORG-DTRACE amd64
> $

Can you please provide output from the following commands executed on
the machine showing the problem? The above commands show nothing
useful, other than the fact that one machine is at 100/full and the
other is at 1000/full (I don't know your network setup).

Commands:

* netstat -inbd -I em0
* sysctl -a dev.em.0
* Issue command "sysctl dev.em.0.debug=1", then type "dmesg" and provide
  all of the new output you will see at the bottom that pertains to the
  NIC

If you Google this problem, you will find that the majority of the time
it's caused by NIC drivers acting oddly.
Also, I believe the em(4) driver in 9.x is slightly different than on
8.x, so I'm CC'ing Jack Vogel here.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
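
As an illustration of the checksum-offload check suggested earlier in the thread, here is a minimal Python sketch that reads ifconfig output like that quoted above and reports whether RXCSUM or TXCSUM appear in the options list, along with the negotiated media. The helper names (parse_ifconfig, checksum_offload_enabled) and the trimmed SAMPLE text are hypothetical; the sketch assumes the FreeBSD ifconfig layout shown in this message and does not replace the netstat and sysctl output requested above.

```python
import re

def parse_ifconfig(text):
    """Return (capability flags, media string) from FreeBSD-style ifconfig output.

    Hypothetical helper: it only looks at the interface-level "options=" line
    and the "media:" line, and ignores everything else.
    """
    options = set()
    media = None
    for raw in text.splitlines():
        line = raw.strip()
        # Matches "options=2098<...>" but not "nd6 options=3<...>",
        # because re.match anchors at the start of the stripped line.
        m = re.match(r"options=[0-9a-fA-F]+<([^>]*)>", line)
        if m:
            options = set(m.group(1).split(","))
        elif line.startswith("media:"):
            media = line[len("media:"):].strip()
    return options, media

def checksum_offload_enabled(options):
    """True if receive or transmit checksum offload is listed as enabled."""
    return bool(options & {"RXCSUM", "TXCSUM"})

# Trimmed copy of the 8.2-STABLE machine's output quoted in this message.
SAMPLE = """\
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether 00:30:48:2e:99:ba
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
"""

options, media = parse_ifconfig(SAMPLE)
print("capabilities:", sorted(options))
print("checksum offload enabled:", checksum_offload_enabled(options))
print("media:", media)
```

On a live system the same functions could be fed the text captured from "ifconfig em0" on each end, which makes it easy to compare the two machines side by side.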