Date: Fri, 5 Mar 2010 10:40:46 -0800
From: Pyun YongHyeon <pyunyh@gmail.com>
To: Ian FREISLICH <ianf@clue.co.za>
Cc: current@freebsd.org
Subject: Re: dev.bce.X.com_no_buffers increasing and packet loss
Message-ID: <20100305184046.GD14818@michelle.cdnetworks.com>
In-Reply-To: <E1Nnc4d-0003mB-6e@clue.co.za>
References: <20100305175639.GB14818@michelle.cdnetworks.com> <E1NnVaT-0003Ft-3p@clue.co.za> <E1Nnc4d-0003mB-6e@clue.co.za>

On Fri, Mar 05, 2010 at 08:16:31PM +0200, Ian FREISLICH wrote:
> Pyun YongHyeon wrote:
> > On Fri, Mar 05, 2010 at 01:20:57PM +0200, Ian FREISLICH wrote:
> > > Hi
> > >
> > > I have a system that is experiencing mild to severe packet loss.
> > > The interfaces are configured as follows:
> > >
> > > lagg0: bce0, bce1, bce2, bce3 lagproto lacp
> > >
> > > lagg0 then is used as the hwdev for the vlan interfaces.
> > >
> > > I have pf with a few queues for bandwidth management.
> > >
> > > There isn't that much traffic on it (200-500Mbit/s).
> > >
> > > I see only the following suspect for packet loss:
> > >
> > > dev.bce.0.com_no_buffers: 140151466
> > > dev.bce.1.com_no_buffers: 514723247
> > > dev.bce.2.com_no_buffers: 10454050
> > > dev.bce.3.com_no_buffers: 369371
> > >
> > > Most of the time, these numbers are static, but every once in a
> > > while they increase massively by several thousand, but only on 2
> > > interfaces. The 1 minute average rate on those interfaces is 266/s
> > > and 123/s.
> > >
> > > Does anyone think this is related to the packet loss or are these
> > > counters just a red herring? Is there anything that can be done
> > > to reduce this count?
> > >
> >
> > I think this sysctl node indicates number of dropped frames in
> > completion processor of NetXtreme II. The counter is incremented
> > when the processor received a frame successfully but it couldn't
> > pass the frame to system as there are no available RX buffers so
> > completion processor dropped the received frame.
> > If you see mbuf shortage from netstat that would be normal. But if
> > system has a lot of free mbuf resources it may indicate other
> > issue. bce(4) may not be able to replenish controller with RX
> > buffer if system is suffering from high load.
>
> I don't think I've ever seen an mbuf shortage on this host, and
> load isn't that high, typically 12% CPU or 88% idle. That's just
> on 2 (of 16) cores busy. There's tons of free memory (~12G) if I
> need to increase the number of buffers available, but I'm not sure
> which tunable to use to do that. The routing table also isn't large
> at about 4000 prefixes.
>
> [firewall1.jnb1] ~ # netstat -m
> 4118/7147/11265 mbufs in use (current/cache/total)
> 3092/6850/9942/131072 mbuf clusters in use (current/cache/total/max)
> 2060/4212 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/678/678/65536 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/32768 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/16384 16k jumbo clusters in use (current/cache/total/max)
> 7214K/18198K/25412K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
>
> I currently set the following in loader.conf:
>
> net.isr.maxthreads="8"
> net.isr.direct=0
> if_igb_load="yes"
> kern.ipc.nmbclusters="131072"
> kern.maxusers="1024"
>

Would you show me the output of dmesg (bce(4)/brgphy(4) only) and
the output of "pciconf -lcbv" for the controller?
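
For anyone following the thread who wants to track how fast com_no_buffers grows per interface, here is a minimal sketch in Python. It only parses text in the "dev.bce.N.com_no_buffers: VALUE" form quoted above and divides the difference between two samples by the sampling interval. The function names and the second sample are hypothetical and made up for illustration; only the first set of counter values comes from this thread.

```python
import re

# Matches lines such as "dev.bce.0.com_no_buffers: 140151466".
_COUNTER_RE = re.compile(
    r"^dev\.bce\.(\d+)\.com_no_buffers:\s*(\d+)\s*$", re.MULTILINE
)


def parse_counters(text):
    """Return {unit number: counter value} parsed from sysctl-style text."""
    return {int(unit): int(value) for unit, value in _COUNTER_RE.findall(text)}


def drop_rates(before, after, interval_s):
    """Return drops per second for each unit present in both samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return {
        unit: (after[unit] - before[unit]) / interval_s
        for unit in before
        if unit in after
    }


# First sample: the values quoted in this thread.
first = parse_counters(
    "dev.bce.0.com_no_buffers: 140151466\n"
    "dev.bce.1.com_no_buffers: 514723247\n"
    "dev.bce.2.com_no_buffers: 10454050\n"
    "dev.bce.3.com_no_buffers: 369371\n"
)

# Second sample: HYPOTHETICAL values, invented for illustration only.
second = dict(first)
second[0] += 15960  # assumed growth on bce0 over one minute

rates = drop_rates(first, second, 60)
for unit in sorted(rates):
    print(f"bce{unit}: {rates[unit]:.1f} drops/s")
```

Feeding it two real "sysctl dev.bce" captures taken a known number of seconds apart would show which interfaces are actually dropping and when, which is easier to correlate with load than the raw totals.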