Date: Wed, 6 Jul 2011 13:15:09 -0700
From: YongHyeon PYUN <pyunyh@gmail.com>
To: Charles Sprickman <spork@bway.net>
Cc: freebsd-net@freebsd.org, David Christensen <davidch@freebsd.org>
Subject: Re: bce packet loss
Message-ID: <20110706201509.GA5559@michelle.cdnetworks.com>
In-Reply-To: <alpine.OSX.2.00.1107042113000.2407@freemac>
References: <alpine.OSX.2.00.1107042113000.2407@freemac>
On Mon, Jul 04, 2011 at 09:32:11PM -0400, Charles Sprickman wrote:
> Hello,
>
> We're running a few 8.1-R servers with Broadcom bce interfaces (Dell R510)
> and I'm seeing occasional packet loss on them (enough that it trips nagios
> now and then). Cabling seems fine as neither the switch nor the sysctl
> info for the device show any errors/collisions/etc, however there is one
> odd one, which is "dev.bce.1.stat_IfHCInBadOctets: 539369". See [1] below
> for full sysctl output. The switch shows no errors but for "Dropped
> packets 683868".
>
> pciconf output is also below. [2]
>
> By default, the switch had flow control set to "on". I also let it run
> with "auto". In both cases, the drops continued to increment. I'm now
> running with flow control off to see if that changes anything.
>
> I do see some correlation between cpu usage and drops - I have cpu usage
> graphed in nagios and cacti is graphing the drops on the dell switch.
> There's no signs of running out of mbufs or similar.
>
> So given that limited info, is there anything I can look at to track this
> down? Anything stand out in the stats sysctl exposes? Two things are
> standing out for me - the number of changes in bce regarding flow control
> that are not in 8.1, and the correlation between cpu load and the drops.
>
> What other information can I provide?
>

You had 282 RX buffer shortages and these frames were dropped. This
may explain why you see occasional packet loss. 'netstat -m' will
show which cluster sizes had failed allocations.
However, you have 0 com_no_buffers, which indicates the controller
was able to receive all packets destined for this host. You may have
lost some packets (i.e. non-zero mbuf_alloc_failed_count), but your
controller and system were still responsive to network traffic.

The data sheet says IfHCInBadOctets indicates the number of octets
received on the interface, including framing characters, for packets
that were dropped in the MAC for any reason.
I'm not sure whether this counter includes the packets counted by
IfInFramesL2FilterDiscards, which indicates the number of good frames
that have been dropped due to the L2 perfect match, broadcast,
multicast or MAC control frame filters. If your switch runs STP it
would periodically send BPDU packets to the STP multicast address
01:80:C2:00:00:00. Not sure this is the reason, though. David can
probably explain the IfHCInBadOctets counter in more detail (CCed).
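
For anyone who wants to check the same counters on their own box, the reasoning above can be sketched as a small script. This is only an illustration under stated assumptions: the sample text imitates `sysctl dev.bce.1` output, the exact OID names may differ between driver versions, and attributing the 282 shortages to mbuf_alloc_failed_count is inferred from the reply rather than taken from the full sysctl dump. The helper names `parse_counters` and `diagnose` are hypothetical.

```python
import re

# Sample text in "sysctl dev.bce.1" style. Only the IfHCInBadOctets line is
# quoted in the thread; the other two OID names and the pairing of 282 with
# mbuf_alloc_failed_count are assumptions made for this sketch.
SAMPLE = """\
dev.bce.1.mbuf_alloc_failed_count: 282
dev.bce.1.com_no_buffers: 0
dev.bce.1.stat_IfHCInBadOctets: 539369
"""


def parse_counters(text):
    """Map the last component of each 'name: value' line to an int."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(\S+):\s*(\d+)\s*$", line)
        if match:
            name = match.group(1).rsplit(".", 1)[-1]
            counters[name] = int(match.group(2))
    return counters


def diagnose(counters):
    """Return notes that follow the interpretation given in the reply."""
    notes = []
    if counters.get("mbuf_alloc_failed_count", 0) > 0:
        # Host could not allocate RX mbufs; frames were dropped.
        notes.append("host-side RX buffer shortage: check 'netstat -m'")
    if counters.get("com_no_buffers", 0) > 0:
        # Controller itself had no buffers for incoming frames.
        notes.append("controller ran out of RX buffers")
    if counters.get("stat_IfHCInBadOctets", 0) > 0:
        # Octets of frames dropped in the MAC; the cause is not established.
        notes.append("octets dropped in the MAC: cause not determined")
    return notes


if __name__ == "__main__":
    for note in diagnose(parse_counters(SAMPLE)):
        print(note)
```

On a real system the sample string would be replaced by the actual sysctl output; the notes are only hints for where to look next, not a diagnosis.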