Date: Thu, 7 Jul 2011 02:00:26 -0400 (EDT)
From: Charles Sprickman <spork@bway.net>
To: YongHyeon PYUN
Cc: freebsd-net@freebsd.org, David Christensen
Subject: Re: bce packet loss

More inline, including a bigger picture of what I'm seeing on some other
hosts, but I wanted to thank everyone for all the fascinating ethernet BER
info and the final explanation of what the "IfHCInBadOctets" counter
represents.  Interesting stuff.

On Wed, 6 Jul 2011, YongHyeon PYUN wrote:

> On Mon, Jul 04, 2011 at 09:32:11PM -0400, Charles Sprickman wrote:
>> Hello,
>>
>> We're running a few 8.1-R servers with Broadcom bce interfaces (Dell
>> R510) and I'm seeing occasional packet loss on them (enough that it
>> trips nagios now and then).  Cabling seems fine, as neither the switch
>> nor the sysctl info for the device shows any errors/collisions/etc.,
>> but there is one odd counter: "dev.bce.1.stat_IfHCInBadOctets: 539369".
>> See [1] below for full sysctl output.  The switch shows no errors
>> except for "Dropped packets 683868".
>>
>> pciconf output is also below. [2]
>>
>> By default, the switch had flow control set to "on".  I also let it
>> run with "auto".  In both cases, the drops continued to increment.
>> I'm now running with flow control off to see if that changes anything.
>>
>> I do see some correlation between CPU usage and drops - I have CPU
>> usage graphed in nagios, and cacti is graphing the drops on the Dell
>> switch.  There are no signs of running out of mbufs or similar.
>>
>> So given that limited info, is there anything I can look at to track
>> this down?  Does anything stand out in the stats sysctl exposes?  Two
>> things stand out for me - the number of changes in bce regarding flow
>> control that are not in 8.1, and the correlation between CPU load and
>> the drops.
>>
>> What other information can I provide?
>>
>
> You had 282 RX buffer shortages and these frames were dropped.  This
> may explain why you see occasional packet loss.  'netstat -m' will
> show which size of cluster allocation failed.
Nothing of note:

0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed

> However it seems you have 0 com_no_buffers, which indicates the
> controller was able to receive all packets destined for this host.
> Your host may have lost some packets (i.e. non-zero
> mbuf_alloc_failed_count) but your controller and system were still
> responsive to the network traffic.

OK.  I recall seeing a thread in the -net archives where some folks had
the "com_no_buffers" counter incrementing, but I'm not seeing that at all.

> The data sheet says IfHCInBadOctets indicates the number of octets
> received on the interface, including framing characters, for packets
> that were dropped in the MAC for any reason.  I'm not sure this counter
> includes packets counted by IfInFramesL2FilterDiscards, which indicates
> the number of good frames that have been dropped due to the L2 perfect
> match, broadcast, multicast or MAC control frame filters.  If your
> switch runs STP it will periodically send BPDU packets to the
> destination address of the STP multicast address 01:80:C2:00:00:00.
> Not sure this is the reason though.  Probably David can explain more
> details on the IfHCInBadOctets counter (CCed).

Again, thanks for that.  If I could just ask for a bit more assistance,
it would be greatly appreciated.  I've collected a fair bit of data, and
so far it has done nothing but complicate the issue for me.

-If I'm reading the switch stats correctly, most of my drops are
host->switch, although I'm not certain of that; these Dell 2848s have no
real CLI to speak of.

-I'm seeing similar drops, though not quite as bad, on other hosts.  They
all use the em interface except for one other host with bge.  This
particular host (with the bce interface) just seems to get bad enough to
trigger nagios alerts (a simple ping check from a host on the same
switch/subnet).  All these hosts are forced to 100/FD, as is the switch.
The switch is our external (internet-facing) switch with a 100Mb
connection to our upstream.  At *peak*, our aggregate bandwidth on this
switch is maybe 45Mb/s, most of it outbound.  We are nowhere near
saturating the switching fabric (I hope).

-There are three reasons I set the ports to 100baseTX: the old Cisco that
lost a few ports was a 10/100 switch and the hosts were already
hard-coded for 100/FD; I figured if the Dell craps out I can toss the
Cisco back in without changing the speed/duplex on all the hosts; and
lastly, our uplink is only 100/FD, so why bother.  There was also maybe
some vague notion that I'd avoid using up some kind of buffers in the
switch by matching the speed on all ports...

-We have an identical switch (same model, same hardware rev, same
firmware) for our internal network (lots of log analysis over NFS mounts,
a ton of internal DNS (upwards of 10K queries/sec at peak), and
occasional large file transfers).  On this host and all others, the
dropped packet count on the switch ports is at worst around 5000 packets.
The counters have not been reset on it and it's been up for 460 days.

-A bunch of legacy servers that have fxp interfaces on the external
switch and em on the internal switch show *no* significant drops, nor do
the switch ports they are connected to.

-To see if forcing the ports to 100/FD was causing a problem, I set the
host and switch to 1000/FD.  Over roughly 24 hours, the switch reported
197346 dropped packets out of 52166986 packets received.

-Tonight's change was to turn off spanning tree.
This is a long shot based on some Dell bug I saw discussed on their
forums.  Given our simple network layout, I don't really see spanning
tree as being at all necessary.

One of the first replies I got to my original post was private and
amounted to "Dell is garbage".  That may be true, but the excellent
performance on the more heavily loaded internal network makes me doubt
there's a fundamental shortcoming in the switch.  It would have to be
real garbage to fall over under a combined load of 45Mb/s.

I am somewhat curious whether some weird buffering issue is possible with
a mix of 100/FD and 1000/FD ports.  Any thoughts on that?  It's the only
thing that differs between the two switches.

Before replacing the switch I'm also going to cycle through turning off
TSO, rxcsum, and txcsum, since that seems to have been the fix for some
people with otherwise unexplained network issues.  I assume those
features all depend on the firmware of the NIC being bug-free, and I'm
not quite ready to accept that.

Thanks,

Charles
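For reference, the driver and mbuf counters discussed above can be polled
from the shell.  This is only a minimal sketch, assuming bce1 (the
dev.bce.1 sysctl tree quoted earlier) is the interface in question and
that the counter names match those mentioned in the thread:

    # list the bce1 driver statistics and pull out the drop-related
    # counters (exact OID names may vary between bce(4) revisions)
    sysctl dev.bce.1 | egrep 'no_buffers|mbuf_alloc_failed|IfHCInBadOctets'

    # system-wide mbuf/cluster allocation failures, as suggested above
    netstat -m | grep -i denied

Sampling these at the same interval as the switch's dropped-packet
counter should help show whether the loss is happening in the NIC/driver
or out on the wire.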
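Similarly, a minimal sketch of the TSO/checksum-offload test mentioned
above, again assuming bce1 is the interface to change; these are the
standard ifconfig(8) toggles, and each can be re-enabled later by
dropping the leading "-":

    # disable TSO and hardware RX/TX checksum offload on bce1
    ifconfig bce1 -tso -rxcsum -txcsum

    # verify the options no longer appear in the interface's options list
    ifconfig bce1

If one of these turns out to be the culprit, the same flags can be
carried into the existing ifconfig_bce1 line in /etc/rc.conf so the
setting survives a reboot.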