From: Vlad Galu <dudu@dudu.ro>
Date: Fri, 13 May 2011 13:16:42 +0200
To: freebsd-net@freebsd.org
Subject: Re: bge(4) on RELENG_8 mbuf cluster starvation

On Wed, Mar 30, 2011 at 7:17 PM, Vlad Galu wrote:
>
> On Wed, Mar 30, 2011 at 7:10 PM, YongHyeon PYUN wrote:
>
>> On Wed, Mar 30, 2011 at 05:55:47PM +0200, Vlad Galu wrote:
>> > On Sun, Mar 13, 2011 at 2:16 AM, YongHyeon PYUN wrote:
>> >
>> > > On Sat, Mar 12, 2011 at 09:17:28PM +0100, Vlad Galu wrote:
>> > > > On Sat, Mar 12, 2011 at 8:53 PM, Arnaud Lacombe wrote:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > On Sat, Mar 12, 2011 at 4:03 AM, Vlad Galu wrote:
>> > > > > > Hi folks,
>> > > > > >
>> > > > > > On a fairly busy recent (r219010) RELENG_8 machine I keep getting
>> > > > > > -- cut here --
>> > > > > > 1096/1454/2550 mbufs in use (current/cache/total)
>> > > > > > 1035/731/1766/262144 mbuf clusters in use (current/cache/total/max)
>> > > > > > 1035/202 mbuf+clusters out of packet secondary zone in use (current/cache)
>> > > > > > 0/117/117/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
>> > > > > > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
>> > > > > > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
>> > > > > > 2344K/2293K/4637K bytes allocated to network (current/cache/total)
>> > > > > > 0/70128196/37726935 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>> > > > > > ^^^^^^^^^^^^^^^^^^^^^
>> > > > > > -- and here --
>> > > > > >
>> > > > > > kern.ipc.nmbclusters is set to 131072. Other settings:
>> > > > >
>> > > > > no, netstat(8) says 262144.
>> > > > >
>> > > > Heh, you're right, I forgot I'd doubled it a while ago. Wrote that
>> > > > from the top of my head.
>> > > >
>> > > > > Maybe you can include $(sysctl dev.bge)? Might be useful.
>> > > > >
>> > > > > - Arnaud
>> > > > >
>> > > > Sure:
>> > >
>> > > [...]
>> > >
>> > > > dev.bge.1.%desc: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x004101
>> > > > dev.bge.1.%driver: bge
>> > > > dev.bge.1.%location: slot=0 function=0
>> > > > dev.bge.1.%pnpinfo: vendor=0x14e4 device=0x1659 subvendor=0x1014 subdevice=0x02c6 class=0x020000
>> > > > dev.bge.1.%parent: pci5
>> > > > dev.bge.1.forced_collapse: 2
>> > > > dev.bge.1.forced_udpcsum: 0
>> > > > dev.bge.1.stats.FramesDroppedDueToFilters: 0
>> > > > dev.bge.1.stats.DmaWriteQueueFull: 0
>> > > > dev.bge.1.stats.DmaWriteHighPriQueueFull: 0
>> > > > dev.bge.1.stats.NoMoreRxBDs: 680050
>> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> > > This indicates bge(4) encountered an RX buffer shortage. Perhaps
>> > > bge(4) couldn't fill new RX buffers for incoming frames due to
>> > > other system activity.
>> > >
>> > > > dev.bge.1.stats.InputDiscards: 228755931
>> > >
>> > > This counter indicates the number of frames discarded due to RX
>> > > buffer shortage. bge(4) discards a received frame if it fails to
>> > > allocate a new RX buffer, so InputDiscards is normally higher than
>> > > NoMoreRxBDs.
>> > >
>> > > > dev.bge.1.stats.InputErrors: 49080818
>> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> > > Something is wrong here. Too many frames were classified as error
>> > > frames. You may see poor RX performance.
>> > >
>> > > > dev.bge.1.stats.RecvThresholdHit: 0
>> > > > dev.bge.1.stats.rx.ifHCInOctets: 2095148839247
>> > > > dev.bge.1.stats.rx.Fragments: 47887706
>> > > > dev.bge.1.stats.rx.UnicastPkts: 32672557601
>> > > > dev.bge.1.stats.rx.MulticastPkts: 1218
>> > > > dev.bge.1.stats.rx.BroadcastPkts: 2
>> > > > dev.bge.1.stats.rx.FCSErrors: 2822217
>> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> > > FCS errors are too high. Please check the cabling again (I'm
>> > > assuming the controller is not broken here). I think you can use
>> > > the vendor's diagnostic tools to verify this.
>> > >
>> > > > dev.bge.1.stats.rx.AlignmentErrors: 0
>> > > > dev.bge.1.stats.rx.xonPauseFramesReceived: 0
>> > > > dev.bge.1.stats.rx.xoffPauseFramesReceived: 0
>> > > > dev.bge.1.stats.rx.ControlFramesReceived: 0
>> > > > dev.bge.1.stats.rx.xoffStateEntered: 0
>> > > > dev.bge.1.stats.rx.FramesTooLong: 0
>> > > > dev.bge.1.stats.rx.Jabbers: 0
>> > > > dev.bge.1.stats.rx.UndersizePkts: 0
>> > > > dev.bge.1.stats.tx.ifHCOutOctets: 48751515826
>> > > > dev.bge.1.stats.tx.Collisions: 0
>> > > > dev.bge.1.stats.tx.XonSent: 0
>> > > > dev.bge.1.stats.tx.XoffSent: 0
>> > > > dev.bge.1.stats.tx.InternalMacTransmitErrors: 0
>> > > > dev.bge.1.stats.tx.SingleCollisionFrames: 0
>> > > > dev.bge.1.stats.tx.MultipleCollisionFrames: 0
>> > > > dev.bge.1.stats.tx.DeferredTransmissions: 0
>> > > > dev.bge.1.stats.tx.ExcessiveCollisions: 0
>> > > > dev.bge.1.stats.tx.LateCollisions: 0
>> > > > dev.bge.1.stats.tx.UnicastPkts: 281039183
>> > > > dev.bge.1.stats.tx.MulticastPkts: 0
>> > > > dev.bge.1.stats.tx.BroadcastPkts: 1153
>> > > > -- and here --
>> > > >
>> > > > And, now that I remembered about it, this as well:
>> > > > -- cut here --
>> > > > Name  Mtu Network Address                 Ipkts     Ierrs    Idrop     Opkts Oerrs Coll
>> > > > bge1 1500         00:11:25:22:0d:ed 32321767025 278517070 37726837 281068216     0    0
>> > > > -- and here --
>> > > > The colo provider has changed my cable a couple of times, so I
>> > > > wouldn't blame it on that. Unfortunately, I don't have access to
>> > > > the port statistics on the switch. Running netstat with -w1
>> > > > yields between 0 and 4 errors/second.
>> > > >
>> > > Hardware MAC counters still show a high number of FCS errors.
The >> > > service provider should have to check possible cabling issues on >> > > the port of the switch. >> > > >> > >> > After swapping cables and moving the NIC into another switch, there are >> some >> > improvements. However: >> > -- cut here -- >> > dev.bge.1.%desc: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC >> rev. >> > 0x004101 >> > dev.bge.1.%driver: bge >> > dev.bge.1.%location: slot=0 function=0 >> > dev.bge.1.%pnpinfo: vendor=0x14e4 device=0x1659 subvendor=0x1014 >> > subdevice=0x02c6 class=0x020000 >> > dev.bge.1.%parent: pci5 >> > dev.bge.1.forced_collapse: 0 >> > dev.bge.1.forced_udpcsum: 0 >> > dev.bge.1.stats.FramesDroppedDueToFilters: 0 >> > dev.bge.1.stats.DmaWriteQueueFull: 0 >> > dev.bge.1.stats.DmaWriteHighPriQueueFull: 0 >> > dev.bge.1.stats.NoMoreRxBDs: 243248 <- this >> > dev.bge.1.stats.InputDiscards: 9945500 >> > dev.bge.1.stats.InputErrors: 0 >> >> There are still discarded frames but I believe it's not related >> with any cabling issues since you don't have FCS or alignment >> errors. >> >> > dev.bge.1.stats.RecvThresholdHit: 0 >> > dev.bge.1.stats.rx.ifHCInOctets: 36697296701 >> > dev.bge.1.stats.rx.Fragments: 0 >> > dev.bge.1.stats.rx.UnicastPkts: 549334370 >> > dev.bge.1.stats.rx.MulticastPkts: 113638 >> > dev.bge.1.stats.rx.BroadcastPkts: 0 >> > dev.bge.1.stats.rx.FCSErrors: 0 >> > dev.bge.1.stats.rx.AlignmentErrors: 0 >> > dev.bge.1.stats.rx.xonPauseFramesReceived: 0 >> > dev.bge.1.stats.rx.xoffPauseFramesReceived: 0 >> > dev.bge.1.stats.rx.ControlFramesReceived: 0 >> > dev.bge.1.stats.rx.xoffStateEntered: 0 >> > dev.bge.1.stats.rx.FramesTooLong: 0 >> > dev.bge.1.stats.rx.Jabbers: 0 >> > dev.bge.1.stats.rx.UndersizePkts: 0 >> > dev.bge.1.stats.tx.ifHCOutOctets: 10578000636 >> > dev.bge.1.stats.tx.Collisions: 0 >> > dev.bge.1.stats.tx.XonSent: 0 >> > dev.bge.1.stats.tx.XoffSent: 0 >> > dev.bge.1.stats.tx.InternalMacTransmitErrors: 0 >> > dev.bge.1.stats.tx.SingleCollisionFrames: 0 >> > dev.bge.1.stats.tx.MultipleCollisionFrames: 0 >> > dev.bge.1.stats.tx.DeferredTransmissions: 0 >> > dev.bge.1.stats.tx.ExcessiveCollisions: 0 >> > dev.bge.1.stats.tx.LateCollisions: 0 >> > dev.bge.1.stats.tx.UnicastPkts: 64545266 >> > dev.bge.1.stats.tx.MulticastPkts: 0 >> > dev.bge.1.stats.tx.BroadcastPkts: 313 >> > >> > and >> > 0/1710531/2006005 requests for mbufs denied >> (mbufs/clusters/mbuf+clusters) >> > -- and here -- >> > >> > I'll start gathering some stats/charts on this host to see if I can >> > correlate the starvation with other system events. >> > >> >> Now MAC statistics counter show no abnormal things which in turn >> indicates the mbuf starvation came from other issues. The next >> thing is to identify which process or kernel subsystem consumes a >> lot of mbuf clusters. >> >> > Thanks for the feedback. Oh, there is a BPF consumer listening on bge1. > After noticing > http://www.mail-archive.com/freebsd-net@freebsd.org/msg25685.html, I > decided to shut it down for a while. It's pretty weird, my BPF buffer size > is set to 4MB and traffic on that interface is nowhere near that high. I'll > get back as soon as I have new data. > > >> > >> > >> > > However this does not explain why you have large number of mbuf >> > > cluster allocation failure. The only wild guess I have at this >> > > moment is some process or kernel subsystems are too slow to release >> > > allocated mbuf clusters. Did you check various system activities >> > > while seeing the issue? >> > > >> > > > > -- > Good, fast & cheap. Pick any two. 
I've finally managed to see what triggers the symptom: it's a SYN flood.
Tweaking the syncache and disabling PF made no measurable difference.
What is odd is that the clock swi starts eating up more than 50% of the
CPU; I tried both ACPI-fast and TSC as the timecounter. The machine is
UP, so when the clock swi takes 50% of the CPU and the netisr swi takes
another 50%, there isn't much CPU time left for user processes.

--
Good, fast & cheap. Pick any two.
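P.S. For anyone who wants to poke at the same thing, the knobs and
counters in play during a SYN flood on a stock RELENG_8 box are roughly
the following (a sketch only; double-check the exact names with
sysctl -d on your system):

  sysctl net.inet.tcp.syncookies                 # 1 = send SYN cookies when a syncache bucket overflows
  sysctl net.inet.tcp.syncache                   # hashsize/bucketlimit/cachelimit/rexmtlimit/count
  netstat -sp tcp | egrep -i 'syncache|cookie'   # syncache drops/overflows and cookies sent
  top -SH                                        # watch the clock and netisr swi threads
  vmstat -i                                      # interrupt rates while the flood is running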