Date:      Wed, 3 Jun 2009 17:56:50 -0400
From:      Alexander Sack <pisymbol@gmail.com>
To:        freebsd-net@freebsd.org
Subject:   bge(4) input errors and LINK_LOST condition problem
Message-ID:  <3c0b01820906031456h6db0e2e0w1becc6835c11c723@mail.gmail.com>

Hello:

I'm running FreeBSD-6.1-amd64 on an Intel motherboard with an Intel
Core 2 processor (a 6400, I believe) and 8GB of RAM.  I'm using bge(4)
as a monitoring port listening to traffic from a GIGE switch.  The
traffic is a pcap replay, and it's running at 2% utilization through
the switch.  The card auto-negotiates to the right speed, as shown below.

The problem is the following:

     input         (bge3)           output
   packets  errs      bytes    packets  errs      bytes colls
     32800     0    6933920          0     0          0     0
     32800     0    6933920          0     0          0     0
     32560     0    6883184          0     0          0     0
     32800     0    6933920          0     0          0     0
     32503     2    6871316          0     0          0     0
     32639     0    6899718          0     0          0     0
     32960     0    6967744          0     0          0     0
     32880     0    6950832          0     0          0     0
     32720     0    6917008          0     0          0     0
     32720     0    6917008          0     0          0     0
     32720     0    6917008          0     0          0     0
     32437     1    6857197          0     0          0     0
     32550     0    6881070          0     0          0     0
     32400     0    6849360          0     0          0     0
     32760     1    6925081          0     0          0     0
     32832     0    6940316          0     0          0     0
     32467     0    6863408          0     0          0     0
     32640     0    6900096          0     0          0     0
     32480     0    6866272          0     0          0     0
     32668     0    6905617          0     0          0     0
     32828     0    6939889          0     0          0     0
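
To put a number on how rare these errors are, the per-interval rows above can be summed.  The following is a minimal sketch, assuming the text is interval output in the column layout shown above; the function name summarize_netstat is hypothetical, and only three of the rows are reused as sample input.

```python
def summarize_netstat(text):
    """Sum input packets and input errors from interval rows.

    Assumes each data row has 7 numeric columns in the order shown in
    the listing above: in-packets, in-errs, in-bytes, out-packets,
    out-errs, out-bytes, colls.
    """
    samples = packets = errs = 0
    for line in text.splitlines():
        fields = line.split()
        # Skip the two header rows and any blank lines.
        if len(fields) != 7 or not all(f.isdigit() for f in fields):
            continue
        packets += int(fields[0])
        errs += int(fields[1])
        samples += 1
    return samples, packets, errs


sample = """\
     input         (bge3)           output
   packets  errs      bytes    packets  errs      bytes colls
     32800     0    6933920          0     0          0     0
     32503     2    6871316          0     0          0     0
     32437     1    6857197          0     0          0     0
"""

samples, packets, errs = summarize_netstat(sample)
print("samples=%d packets=%d errs=%d rate=%.6f"
      % (samples, packets, errs, errs / packets))
```

Feeding the whole listing through the same function gives the error rate over the full observation window rather than over three rows.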

After some instrumentation, I found that ifp->ierrors is being
incremented because these receive buffer descriptors (BDs) are marked
with the LINK_LOST bit in the bd_error_flag.
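
When instrumenting the receive path, it can help to print the error field symbolically instead of as a raw number.  The following is a minimal sketch of such a decoder.  Only LINK_LOST = 0x4 is confirmed by this message; the other bit names and values are an assumption recalled from the driver's register header and should be checked against if_bgereg.h in your tree.  The function name decode_error_flag is hypothetical.

```python
# Receive BD error bits.  Only LINK_LOST (0x0004) comes from this
# message; every other entry is an ASSUMPTION to verify against
# if_bgereg.h before relying on it.
RX_ERR_FLAGS = {
    0x0001: "BAD_CRC",
    0x0002: "COLL_DETECT",
    0x0004: "LINK_LOST",
    0x0008: "PHY_DECODE_ERR",
    0x0010: "MAC_ABORT",
    0x0020: "RUNT",
    0x0040: "TRUNC_NO_RSRCS",
    0x0080: "GIANT",
}


def decode_error_flag(value):
    """Return the names of all bits set in a bd_error_flag value."""
    names = [name for bit, name in sorted(RX_ERR_FLAGS.items())
             if value & bit]
    # Report any bits that are not in the table instead of hiding them.
    unknown = value & ~sum(RX_ERR_FLAGS)
    if unknown:
        names.append("UNKNOWN(0x%x)" % unknown)
    return names


print(decode_error_flag(0x4))
```

A value with several bits set decodes to several names, which makes it easy to see whether LINK_LOST ever appears together with a CRC error on the same descriptor.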

# ifconfig bge3
bge3: flags=48943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,MONITOR> mtu 9000
	options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
	inet6 fe80::2e0:edff:fe11:90b3%bge3 prefixlen 64 scopeid 0x4
	ether 00:e0:ed:11:90:b3
	media: Ethernet autoselect (1000baseTX <full-duplex>)
	status: active
# pciconf -l | grep bge3
bge3@pci8:6:1:	class=0x020000 card=0x164814e4 chip=0x164814e4 rev=0x10 hdr=0x00

I backported the statistics sysctl code from CURRENT to my 6.1
kernel, and I see:

# sysctl -a | grep bge.3
dev.bge.3.%desc: Broadcom BCM5704 B0, ASIC rev. 0x2100
dev.bge.3.%driver: bge
dev.bge.3.%location: slot=6 function=1
dev.bge.3.%pnpinfo: vendor=0x14e4 device=0x1648 subvendor=0x14e4 subdevice=0x1648 class=0x020000
dev.bge.3.%parent: pci8
dev.bge.3.rx_coal_ticks: 150
dev.bge.3.tx_coal_ticks: 1000000
dev.bge.3.rx_max_coal_bds: 16
dev.bge.3.tx_max_coal_bds: 32
dev.bge.3.debug_info: -1
dev.bge.3.reg_read: -1172242433
dev.bge.3.mem_read: -1172242433
dev.bge.3.stat_IfHcInOctets: 1824848515
dev.bge.3.stat_IfHcOutOctets: 0
dev.bge.3.stats.FramesDroppedDueToFilters: 0
dev.bge.3.stats.DmaWriteQueueFull: 0
dev.bge.3.stats.DmaWriteHighPriQueueFull: 0
dev.bge.3.stats.NoMoreRxBDs: 0
dev.bge.3.stats.InputDiscards: 0
dev.bge.3.stats.InputErrors: 3
dev.bge.3.stats.RecvThresholdHit: 1501751
dev.bge.3.stats.DmaReadQueueFull: 0
dev.bge.3.stats.DmaReadHighPriQueueFull: 0
dev.bge.3.stats.SendDataCompQueueFull: 0
dev.bge.3.stats.RingSetSendProdIndex: 0
dev.bge.3.stats.RingStatusUpdate: 1502233
dev.bge.3.stats.Interrupts: 544091
dev.bge.3.stats.AvoidedInterrupts: 958142
dev.bge.3.stats.SendThresholdHit: 0
dev.bge.3.stats.rx.Octets: 1825744579
dev.bge.3.stats.rx.Fragments: 0
dev.bge.3.stats.rx.UcastPkts: 8476045
dev.bge.3.stats.rx.MulticastPkts: 0
dev.bge.3.stats.rx.FCSErrors: 3
dev.bge.3.stats.rx.AlignmentErrors: 0
dev.bge.3.stats.rx.xonPauseFramesReceived: 0
dev.bge.3.stats.rx.xoffPauseFramesReceived: 0
dev.bge.3.stats.rx.ControlFramesReceived: 0
dev.bge.3.stats.rx.xoffStateEntered: 0
dev.bge.3.stats.rx.FramesTooLong: 0
dev.bge.3.stats.rx.Jabbers: 0
dev.bge.3.stats.rx.UndersizePkts: 0
dev.bge.3.stats.rx.inRangeLengthError: 0
dev.bge.3.stats.rx.outRangeLengthError: 0
dev.bge.3.stats.tx.Octets: 0
dev.bge.3.stats.tx.Collisions: 0
dev.bge.3.stats.tx.XonSent: 0
dev.bge.3.stats.tx.XoffSent: 0
dev.bge.3.stats.tx.flowControlDone: 0
dev.bge.3.stats.tx.InternalMacTransmitErrors: 0
dev.bge.3.stats.tx.SingleCollisionFrames: 0
dev.bge.3.stats.tx.MultipleCollisionFrames: 0
dev.bge.3.stats.tx.DeferredTransmissions: 0
dev.bge.3.stats.tx.ExcessiveCollisions: 0
dev.bge.3.stats.tx.LateCollisions: 0
dev.bge.3.stats.tx.UcastPkts: 0
dev.bge.3.stats.tx.MulticastPkts: 0
dev.bge.3.stats.tx.BroadcastPkts: 0
dev.bge.3.stats.tx.CarrierSenseErrors: 0
dev.bge.3.stats.tx.Discards: 0
dev.bge.3.stats.tx.Errors: 0

A colleague mentioned that because I am using bge(4) as a monitoring
card, it is passively listening and unable to send back an Ethernet
clock resync if the GIGE frame clock gets out of sync, which could
cause micro drops during the window in which the clocks are trying to
get back on track.  I can believe that, though after trying multiple
switches I still find this very odd.  Do other folks who use bge(4)
see this same behavior?

I noticed that my FCSErrors == InputErrors, which makes sense since I
had 3 packets with FCS errors (a CRC32 check failure, I believe).  Yet my
input errors via LINK_LOST are constant, tiny, and random.
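
The FCSErrors/InputErrors comparison can be repeated mechanically on saved sysctl output.  The following is a minimal sketch, assuming the "name: value" layout shown in the listing above; the function names parse_sysctl and errors_not_explained_by_fcs are hypothetical, and the sample text reuses three lines from that listing.

```python
def parse_sysctl(text):
    """Map sysctl names to integer values, skipping non-numeric entries."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if not sep:
            continue
        try:
            counters[name.strip()] = int(value)
        except ValueError:
            pass  # e.g. %desc or %driver, which hold strings
    return counters


def errors_not_explained_by_fcs(counters, prefix="dev.bge.3.stats"):
    """Hardware input errors minus FCS errors for one interface."""
    return (counters[prefix + ".InputErrors"]
            - counters[prefix + ".rx.FCSErrors"])


sample = """\
dev.bge.3.%driver: bge
dev.bge.3.stats.InputErrors: 3
dev.bge.3.stats.rx.FCSErrors: 3
"""

counters = parse_sysctl(sample)
print(errors_not_explained_by_fcs(counters))
```

A result of zero means every error in the hardware statistics block is an FCS error, so any additional input errors counted by the driver must come from elsewhere, such as the per-descriptor error flags.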

What's more interesting is that I don't even see a drop.  If I record
and dump the pcap, the traffic looks fine to me in Wireshark (I am
going to look again, but I don't see out-of-order sequence numbers or
lost messages anyway; it's very simple TCP/IP traffic).  Are these
frames retried auto-magically?  If so, aren't LINK_LOST errors
potentially not real drops in the monitoring case, in which case they
should not be reported as such?  Can someone please define the causes
of the LINK_LOST condition (bd_error_flag = 0x4)?
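
One way to double-check the capture for real drops, rather than eyeballing it, is to walk one direction of a TCP flow and look for holes in the sequence space.  The following is a minimal sketch under stated assumptions: the (sequence number, payload length) pairs have already been extracted from the pcap by some other tool, SYN and FIN flags (which also consume a sequence number) are ignored, and the function name find_sequence_gaps is hypothetical.

```python
MOD = 2 ** 32    # TCP sequence numbers wrap at 32 bits
HALF = 2 ** 31   # anything less than this far "ahead" counts as forward


def find_sequence_gaps(segments):
    """Return (start_seq, missing_bytes) for each hole in one flow direction.

    segments: iterable of (seq, payload_len) in capture order.
    Retransmissions and overlaps are not reported as gaps.  A late,
    reordered segment is not matched back to an earlier hole.
    """
    gaps = []
    expected = None
    for seq, length in segments:
        if expected is not None:
            delta = (seq - expected) % MOD
            if 0 < delta < HALF:
                gaps.append((expected, delta))
        end = (seq + length) % MOD
        # Only ever move the expected sequence number forward.
        if expected is None or (end - expected) % MOD < HALF:
            expected = end
    return gaps


segments = [(1000, 100), (1100, 100), (1300, 100), (1400, 50)]
print(find_sequence_gaps(segments))
```

An empty result for every flow in the capture would support the observation that the frames flagged with LINK_LOST do not correspond to missing payload.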

Thanks!

-aps


