Date:      Mon, 10 Oct 2011 16:02:07 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        Larry Rosenman <ler@lerctr.org>
Cc:        freebsd-stable@freebsd.org, "Vogel, Jack" <jack.vogel@intel.com>, John Baldwin <jhb@freebsd.org>
Subject:   Re: rsync corrupted MAC
Message-ID:  <20111010230207.GA85243@icarus.home.lan>
In-Reply-To: <4E93606D.8070306@lerctr.org>
References:  <alpine.BSF.2.00.1110091604450.94525@lrosenman.dyndns.org> <201110101147.30558.jhb@freebsd.org> <4E933BBF.6070209@lerctr.org> <36C97D31-5D01-4AC2-8E48-9A8B04B98F91@transsys.com> <4E93606D.8070306@lerctr.org>

On Mon, Oct 10, 2011 at 04:15:25PM -0500, Larry Rosenman wrote:
> On 10/10/2011 3:57 PM, Louis Mamakos wrote:
> >On Oct 10, 2011, at 2:38 PM, Larry Rosenman wrote:
> >
> >>On 10/10/2011 10:47 AM, John Baldwin wrote:
> >>>On Sunday, October 09, 2011 5:06:26 pm Larry Rosenman wrote:
> >>>>Any ideas on which side or what might be broke here?
> >>>>
> >>>>ler/MAIL-ARCHIVE/2008/12/INBOX
> >>>>Corrupted MAC on input.
> >>>>Disconnecting: Packet corrupt
> >>>>rsync: connection unexpectedly closed (33845045 bytes received so far)
> >>>[receiver]
> >>>>rsync error: error in rsync protocol data stream (code 12) at io.c(605)
> >>>[receiver=3.0.9]
> >>>>rsync: connection unexpectedly closed (1450 bytes received so far)
> >>>[generator]
> >>>>rsync error: unexplained error (code 255) at io.c(605) [generator=3.0.9]
> >>>I've had somewhat similar issues (ssh getting corruption in its data stream)
> >>>when a NIC in my netbook was corrupting packet data when it ran at 1G (it
> >>>worked fine at 10/100).  Pyun eventually fixed the issue by applying enough
> >>>workarounds (it was likely a hardware bug in the NIC's chipset).  However, it
> >>>wasn't easy to debug unfortunately. :(
> >>>
> >>Any ideas on where to start?
> >>
> >>from the 8.2 box (tbh.lerctr.org in the script):
> >>
> >>8.2->PIX->Provider->Internet->Motorola SBG6580 (Time-Warner)->Trendnet TEG-160WS Gig switch->9.0 box (borg.lerctr.org).
> >>
> >>So, where do I start?
> >I'd turn off IP / TCP / UDP checksum offloading on your NIC if it supports it, and see if you are getting network layer checksum errors.  If the IP checksum is wrong, then it happened on the last hops between the NIC and memory or across the previous network hop.
> >
> >
> >
> Good idea, but, it didn't show ANY errors on EITHER side (both are
> em nics).
> 
> Next?
> $ ifconfig em0
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
>         ether 00:30:48:2e:99:ba
>         inet 192.147.25.65 netmask 0xffffff00 broadcast 192.147.25.255
>         inet6 fe80::230:48ff:fe2e:99ba%em0 prefixlen 64 scopeid 0x1
>         inet 192.147.25.45 netmask 0xffffff00 broadcast 192.147.25.255
>         inet 192.147.25.11 netmask 0xffffff00 broadcast 192.147.25.255
>         nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
>         media: Ethernet autoselect (100baseTX <full-duplex>)
>         status: active
> $
> $ uname -a
> FreeBSD thebighonker.lerctr.org 8.2-STABLE FreeBSD 8.2-STABLE #45:
> Sat Oct  8 10:57:43 CDT 2011
> root@thebighonker.lerctr.org:/usr/obj/usr/src/sys/THEBIGHONKER
> amd64
> $
> 
> 
> 
> $ ifconfig em0
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=2088<VLAN_MTU,VLAN_HWCSUM,WOL_MAGIC>
>         ether 00:30:48:8e:9f:f3
>         inet 192.168.200.4 netmask 0xffffff00 broadcast 192.168.200.255
>         inet6 fe80::230:48ff:fe8e:9ff3%em0 prefixlen 64 scopeid 0x1
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (1000baseT <full-duplex>)
>         status: active
> $ uname -a
> FreeBSD borg.lerctr.org 9.0-BETA3 FreeBSD 9.0-BETA3 #1: Sun Oct  9
> 10:03:42 CDT 2011
> root@borg.lerctr.org:/usr/obj/usr/src/sys/BORG-DTRACE  amd64
> $

Can you please provide output from the following commands executed on
the machine showing the problem?  The above commands show nothing
useful, other than the fact that one machine is at 100/full and the
other is at 1000/full (I don't know your network setup).  Commands:

* netstat -inbd -I em0
* sysctl -a dev.em.0
* Run "sysctl dev.em.0.debug=1", then run "dmesg" and provide all
  of the new output at the bottom that pertains to the NIC
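
As a rough sketch, the three requests above could be gathered into one
file with a small sh script along these lines. The function names
(em_diag_cmds, collect_em_diag), the unit-number argument, and the
default output file name are assumptions made for illustration; only
the commands themselves come from the list above. Setting the debug
sysctl requires root.

```shell
#!/bin/sh
# Hypothetical helper: print the diagnostic commands for one em(4) unit.
# The unit number defaults to 0 (i.e. em0 / dev.em.0).
em_diag_cmds() {
    unit="${1:-0}"
    printf 'netstat -inbd -I em%s\n' "$unit"
    printf 'sysctl -a dev.em.%s\n' "$unit"
    printf 'sysctl dev.em.%s.debug=1\n' "$unit"
    printf 'dmesg\n'
}

# Hypothetical helper: run each command in order and save everything,
# with a "### command" header before each command's output.
# The output file name is an assumed default, not something from the thread.
collect_em_diag() {
    unit="${1:-0}"
    out="${2:-em${unit}-diag.txt}"
    em_diag_cmds "$unit" | while IFS= read -r cmd; do
        printf '### %s\n' "$cmd"
        sh -c "$cmd" 2>&1
    done > "$out"
}
```

Run as root on the affected machine, "collect_em_diag 0 /tmp/em0-diag.txt"
would leave the output of all four commands in one file; the NIC debug
lines should be near the end of the dmesg section.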

If you Google this error ("Corrupted MAC on input"), you will find that
the majority of the time it's caused by a misbehaving NIC driver.

Also, I believe the em(4) driver in 9.x differs slightly from the one
in 8.x, so I'm CC'ing Jack Vogel here.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



