Date: Wed, 6 Jul 2011 17:50:26 -0700
From: Kevin Oberman <kob6558@gmail.com>
To: Chuck Swiger <cswiger@mac.com>
Cc: freebsd-net@freebsd.org, Charles Sprickman <spork@bway.net>
Subject: Re: bce packet loss
Message-ID: <CAN6yY1satrdKHkteL_-_YGEASPEf_%2BXr1kS6KLM7io8hd6Kuhw@mail.gmail.com>
In-Reply-To: <7575C8FD-4E99-4A27-833F-312230078E9E@mac.com>
References: <alpine.OSX.2.00.1107042113000.2407@freemac> <BE3848B9-96C4-4F67-9565-60382DA7D6DB@mac.com> <CAN6yY1u6%2Bh3qcM6KmASMBQqGE8H7GuCoPYt-5U_aLS=BHz313Q@mail.gmail.com> <7575C8FD-4E99-4A27-833F-312230078E9E@mac.com>
On Wed, Jul 6, 2011 at 1:04 PM, Chuck Swiger <cswiger@mac.com> wrote:
> On Jul 6, 2011, at 12:27 PM, Kevin Oberman wrote:
>> 1 in 10**6? That is totally excessive.
>
> It's high for a switched LAN, but I'd imagine you remember collision rates on hubs, which might well exceed 1% of the packets when the network is under load.
>
>> The Ethernet spec requires no worse than 10**13 and that is far worse than should ever be seen in the real world. At one in a million, any remotely high volume transfer will crawl, especially over a long path.
>
> 10 Gigabit ethernet wants cabling spec'ed to a BER of 10e-13; standard gigabit ethernet cabling (Cat 5e) supposedly is rated for 10e-10. However, the BER of the cabling doesn't translate directly into octet error count per the NIC statistics, since a bad bit anywhere in a packet causes the entire packet to be dropped with a failed checksum.
>
>> If dropped packets are being reported, the most common cause is fan-in. If two input ports are both trying to talk at line rate to a single output port, the buffer will fill and packets will be dropped. Most switches do tail drop, so queue management is terrible, compounding the effects.
>
> Yes, I agree with this as a likely cause.

Just to try to stamp out old Ethernet myths...

Any modern Ethernet should be running full-duplex. The only cases where half-duplex should ever happen are ancient hardware that does not support full-duplex and an old hub (not a switch) used to connect multiple systems. Neither is going to be the case for almost all users.

If you are running full-duplex, there are NO collisions, by definition of full-duplex. If you ever see a collision, your end of the link IS half-duplex, and the other end might well be running full-duplex, which produces many errors on the full-duplex end and lots of collisions as well as errors on the half-duplex end. Performance WILL suck, but this does not match the reported symptoms.

I am not aware of any switch, router, or NIC that counts FCS (checksum) errors as "drops". Drops are not errors according to 802. The term is normally reserved for clean packets which are thrown away due to the lack of resources to retain them or due to policy (policer, RED, or other queue management technique).

I erred in my statement that the Ethernet spec is a BER of 10**-13. It is 10**-12. There was a significant push by several committee members to raise it to 10**-13, but it failed. 10GbE is still 10**-12.

I assure you that any WAN link my former network (ESnet) used that exceeded 10**-15 was considered bad and unacceptable. We required continuous transmission for at least 24 hours at a data rate of over 50% of the link speed (e.g. 500 Mbps for a 1 Gbps circuit) before a link was accepted. 10**-10 would be rejected in a matter of minutes. Links (often over a thousand miles in length) were seldom rejected. These are, of course, fiber circuits, not twisted-pair.

Finally, collisions are simply not errors. They are not counted as errors and do not result in any packet loss. They are simply a normal part of half-duplex operation. Years ago, when coaxial Ethernet was the norm, Van Jacobson wrote a short article describing the lack of impact of collisions. He pointed out a common pathological case in which the ACK in an FTP transfer always collides with the transmission of the next packet. He measured goodput of over 9 Mbps with 100% collisions. (Collision rate is non-intuitive because the maximum collision rate is not 100%, but 1600%, because a maximum of 16 collisions are allowed before the transmission attempt stops and an error of excessive collisions is declared.)

Again, this is just background information, not relevant to the issue at hand.

-- 
R. Kevin Oberman, Network Engineer - Retired
E-mail: kob6558@gmail.com
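
As a rough check on the BER figures in this message, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not part of any acceptance procedure: it assumes bit errors are independent and occur at a fixed rate, and the function names and the 1518-byte frame size (the standard maximum untagged Ethernet frame) are choices made for this example.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60

def expected_errors(ber, bits_per_second, seconds):
    """Expected bit errors over a test run, assuming a fixed, independent BER."""
    return ber * bits_per_second * seconds

def seconds_per_error(ber, bits_per_second):
    """Mean time between bit errors at the given BER and data rate."""
    return 1.0 / (ber * bits_per_second)

def frame_error_rate(ber, frame_bytes=1518):
    """Probability that a frame contains at least one bad bit (fails its FCS).

    Computed as 1 - (1 - ber)**bits using log1p/expm1 so that very small
    BER values do not vanish in floating-point rounding.
    """
    bits = 8 * frame_bytes
    return -math.expm1(bits * math.log1p(-ber))

RATE = 500e6  # 500 Mbps, i.e. 50% of a 1 Gbps circuit (assumed test rate)

for ber in (1e-10, 1e-12, 1e-15):
    print(f"BER {ber:.0e}: "
          f"{expected_errors(ber, RATE, SECONDS_PER_DAY):.3g} errors in 24 h, "
          f"one error every {seconds_per_error(ber, RATE):.3g} s, "
          f"frame error rate {frame_error_rate(ber):.3g}")
```

Under these assumptions, a link at 10**-10 carrying 500 Mbps averages about one bit error every 20 seconds, which is consistent with such a link being rejected within minutes, while a link at 10**-15 would usually pass a 24-hour run without a single error.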
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAN6yY1satrdKHkteL_-_YGEASPEf_%2BXr1kS6KLM7io8hd6Kuhw>