Date: Wed, 3 Jun 2009 17:56:50 -0400
From: Alexander Sack <pisymbol@gmail.com>
To: freebsd-net@freebsd.org
Subject: bge(4) input errors and LINK_LOST condition problem
Message-ID: <3c0b01820906031456h6db0e2e0w1becc6835c11c723@mail.gmail.com>
Hello:

I'm running FreeBSD-6.1-amd64 on an Intel motherboard with an Intel Core 2 processor (6400 I believe) with 8GB of RAM. I'm using bge(4) as a monitoring port listening to traffic from a GIGE switch. The traffic is a pcap replay and it's running at 2% utilization through a switch. The card auto-negotiates to the right speed as shown below.

The problem is the following:

            input         (bge3)           output
   packets  errs      bytes    packets  errs      bytes colls
     32800     0    6933920          0     0          0     0
     32800     0    6933920          0     0          0     0
     32560     0    6883184          0     0          0     0
     32800     0    6933920          0     0          0     0
     32503     2    6871316          0     0          0     0
     32639     0    6899718          0     0          0     0
     32960     0    6967744          0     0          0     0
     32880     0    6950832          0     0          0     0
     32720     0    6917008          0     0          0     0
     32720     0    6917008          0     0          0     0
     32720     0    6917008          0     0          0     0
     32437     1    6857197          0     0          0     0
     32550     0    6881070          0     0          0     0
     32400     0    6849360          0     0          0     0
     32760     1    6925081          0     0          0     0
     32832     0    6940316          0     0          0     0
     32467     0    6863408          0     0          0     0
     32640     0    6900096          0     0          0     0
     32480     0    6866272          0     0          0     0
     32668     0    6905617          0     0          0     0
     32828     0    6939889          0     0          0     0

After some instrumentation, I found that I am seeing ifp->ierrors because these receive bd descriptors are marked with the LINK_LOST bit in the bd_error_flag.

# ifconfig bge3
bge3: flags=48943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,MONITOR> mtu 9000
        options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
        inet6 fe80::2e0:edff:fe11:90b3%bge3 prefixlen 64 scopeid 0x4
        ether 00:e0:ed:11:90:b3
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active

# pciconf -l | grep bge3
bge3@pci8:6:1: class=0x020000 card=0x164814e4 chip=0x164814e4 rev=0x10 hdr=0x00

I took the stats stuff off of CURRENT and backported it to my 6.1 kernel and I see:

# sysctl -a | grep bge.3
dev.bge.3.%desc: Broadcom BCM5704 B0, ASIC rev. 0x2100
dev.bge.3.%driver: bge
dev.bge.3.%location: slot=6 function=1
dev.bge.3.%pnpinfo: vendor=0x14e4 device=0x1648 subvendor=0x14e4 subdevice=0x1648 class=0x020000
dev.bge.3.%parent: pci8
dev.bge.3.rx_coal_ticks: 150
dev.bge.3.tx_coal_ticks: 1000000
dev.bge.3.rx_max_coal_bds: 16
dev.bge.3.tx_max_coal_bds: 32
dev.bge.3.debug_info: -1
dev.bge.3.reg_read: -1172242433
dev.bge.3.mem_read: -1172242433
dev.bge.3.stat_IfHcInOctets: 1824848515
dev.bge.3.stat_IfHcOutOctets: 0
dev.bge.3.stats.FramesDroppedDueToFilters: 0
dev.bge.3.stats.DmaWriteQueueFull: 0
dev.bge.3.stats.DmaWriteHighPriQueueFull: 0
dev.bge.3.stats.NoMoreRxBDs: 0
dev.bge.3.stats.InputDiscards: 0
dev.bge.3.stats.InputErrors: 3
dev.bge.3.stats.RecvThresholdHit: 1501751
dev.bge.3.stats.DmaReadQueueFull: 0
dev.bge.3.stats.DmaReadHighPriQueueFull: 0
dev.bge.3.stats.SendDataCompQueueFull: 0
dev.bge.3.stats.RingSetSendProdIndex: 0
dev.bge.3.stats.RingStatusUpdate: 1502233
dev.bge.3.stats.Interrupts: 544091
dev.bge.3.stats.AvoidedInterrupts: 958142
dev.bge.3.stats.SendThresholdHit: 0
dev.bge.3.stats.rx.Octets: 1825744579
dev.bge.3.stats.rx.Fragments: 0
dev.bge.3.stats.rx.UcastPkts: 8476045
dev.bge.3.stats.rx.MulticastPkts: 0
dev.bge.3.stats.rx.FCSErrors: 3
dev.bge.3.stats.rx.AlignmentErrors: 0
dev.bge.3.stats.rx.xonPauseFramesReceived: 0
dev.bge.3.stats.rx.xoffPauseFramesReceived: 0
dev.bge.3.stats.rx.ControlFramesReceived: 0
dev.bge.3.stats.rx.xoffStateEntered: 0
dev.bge.3.stats.rx.FramesTooLong: 0
dev.bge.3.stats.rx.Jabbers: 0
dev.bge.3.stats.rx.UndersizePkts: 0
dev.bge.3.stats.rx.inRangeLengthError: 0
dev.bge.3.stats.rx.outRangeLengthError: 0
dev.bge.3.stats.tx.Octets: 0
dev.bge.3.stats.tx.Collisions: 0
dev.bge.3.stats.tx.XonSent: 0
dev.bge.3.stats.tx.XoffSent: 0
dev.bge.3.stats.tx.flowControlDone: 0
dev.bge.3.stats.tx.InternalMacTransmitErrors: 0
dev.bge.3.stats.tx.SingleCollisionFrames: 0
dev.bge.3.stats.tx.MultipleCollisionFrames: 0
dev.bge.3.stats.tx.DeferredTransmissions: 0
dev.bge.3.stats.tx.ExcessiveCollisions: 0
dev.bge.3.stats.tx.LateCollisions: 0
dev.bge.3.stats.tx.UcastPkts: 0
dev.bge.3.stats.tx.MulticastPkts: 0
dev.bge.3.stats.tx.BroadcastPkts: 0
dev.bge.3.stats.tx.CarrierSenseErrors: 0
dev.bge.3.stats.tx.Discards: 0
dev.bge.3.stats.tx.Errors: 0

A colleague mentioned that because I am using bge(4) as a monitoring card, it is passively listening and unable to send back an Ethernet clock resync if the GIGE frame clock gets out of sync, which could cause micro drops during the window in which the clocks are trying to get back on track. I can believe that, though after trying multiple switches I find this very odd. Do other folks who use bge(4) see this same behavior?

I noticed that my FCSErrors == InputErrors, which makes sense since I had 3 packets with FCS errors (CRC32 check fail I believe). Yet my input errors via LINK_LOST are constant, tiny, and random.

What's more interesting is I don't even see a drop. If I record and dump the pcap, the traffic looks fine to me through Wireshark (I am going to look again, but I don't see sequence out of order or lost messages anyway; it's very simple TCP/IP traffic). Are these frames retried auto-magically? If so, then aren't LINK_LOST errors potentially not real drops in the monitoring case, and should they not be reported as such?

Can someone please define causes of the LINK_LOST condition (bd_error_flag = 0x4)?

Thanks!

-aps
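For anyone instrumenting the receive path the same way, a bd_error_flag value can be split into its individual bits with a small helper. This is only a sketch: the message confirms just LINK_LOST = 0x4; the other bit names and values are assumptions recalled from the BGE_RXERRFLAG_* definitions in sys/dev/bge/if_bgereg.h and should be checked against the tree in use. The function name decode_rx_error_flags is hypothetical.

```python
# Receive BD error bits. Only LINK_LOST = 0x4 is confirmed by the message;
# the remaining names/values are assumptions to verify against if_bgereg.h.
RX_ERR_FLAGS = {
    0x0001: "BAD_CRC",
    0x0002: "COLL_DETECT",
    0x0004: "LINK_LOST",
    0x0008: "PHY_DECODE_ERR",
    0x0010: "MAC_ABORT",
    0x0020: "RUNT",
    0x0040: "TRUNC_NO_RSRCS",
    0x0080: "GIANT",
}


def decode_rx_error_flags(flags):
    """Return the names of all bits set in a bd_error_flag value."""
    names = [name for bit, name in sorted(RX_ERR_FLAGS.items()) if flags & bit]
    # Report any bits that are not in the table instead of dropping them.
    unknown = flags & ~sum(RX_ERR_FLAGS)
    if unknown:
        names.append("UNKNOWN(0x%x)" % unknown)
    return names


if __name__ == "__main__":
    for value in (0x4, 0x5):
        print("0x%04x: %s" % (value, ", ".join(decode_rx_error_flags(value))))
```

Logging the decoded names next to each ierrors increment would show whether LINK_LOST ever appears together with another bit (for example a CRC failure) or always on its own.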