From owner-freebsd-net@FreeBSD.ORG Thu Sep 23 19:27:15 2010
Message-ID: <4C9BA9FD.50406@tomjudge.com>
Date: Thu, 23 Sep 2010 14:26:53 -0500
From: Tom Judge
To: David Christensen
Cc: "pyunyh@gmail.com", "freebsd-net@freebsd.org", "yongari@freebsd.org"
Subject: Re: bce(4) - com_no_buffers (Again)
References: <4C894A76.5040200@tomjudge.com>
 <20100910002439.GO7203@michelle.cdnetworks.com>
 <4C8E3D79.6090102@tomjudge.com>
 <20100913184833.GF1229@michelle.cdnetworks.com>
 <4C8E768E.7000003@tomjudge.com>
 <20100913193322.GG1229@michelle.cdnetworks.com>
 <4C8E8BD1.5090007@tomjudge.com>
 <20100913205348.GJ1229@michelle.cdnetworks.com>
 <4C9B6CBD.2030408@tomjudge.com>
 <5D267A3F22FD854F8F48B3D2B52381933B5A78B484@IRVEXCHCCR01.corp.ad.broadcom.com>
In-Reply-To: <5D267A3F22FD854F8F48B3D2B52381933B5A78B484@IRVEXCHCCR01.corp.ad.broadcom.com>
List-Id: Networking and TCP/IP with FreeBSD

On 09/23/2010 01:21 PM, David Christensen wrote:
>>>> Under testing I have yet to see a memory fragmentation issue with this
>>>> driver. I will follow up if/when I find a problem with this again.
>>>>
>> So here we are again. The system is locking up again because of 9k mbuf
>> allocation failures.
>>
> Failure to allocate a new buffer should cause the driver to
> drop the received frame and reuse the buffer, not lock up the
> system. Are you seeing the lockup come from bce(4) or does
> it come from somewhere else due to the dropped data?
>

The lockup is not from the NIC as such. The systems appear to lock up
because home directories are on NFS and user information is stored on a
remote LDAP server. When the system starts dropping frames for lack of 9k
memory regions, the condition tends to last a few minutes (when it is
really bad) and stops all traffic into the system. To the average user
this looks like a complete system pause.

>>>> Is there a way to fix the RX buffer shortage issues (when header
>>>> splitting is turned on) so that they are guarded by flow control? Maybe
>>>> change the low watermark for flow control when it's enabled?
>>>>
>>> I'm not sure how much it would help but try changing RX low
>>> watermark. Default value is 32 which seems to be reasonable value.
>>> But it's only for 5709/5716 controllers and Linux seems to use
>>> different default value.
>>>
>> These are: NetXtreme II BCM5709 Gigabit Ethernet
>>
>> So my next task is to turn the watermark related defines into sysctls
>> and turn on header splitting so that I can try to tune them without
>> having to reboot.
>>
> Do you have flow control enabled? There are arguments both for
> and against flow control. For bce(4), I haven't tested flow control
> for quite a while and its behavior may have changed since it is
> controlled by firmware.
> Keep an eye on the hardware statistics
> to see that it's actively generating pause frames.
>

At the moment I have a number of tests:

1) With flow control disabled and header splitting on or off, flood the
server with very small frames (200 bytes). This will trigger the
firmware to drop frames due to BD shortages (incrementing
dev.bce.X.com_no_buffers).

Traffic source:
    route change test-system -mtu 200
    dd if=/dev/zero bs=8000 | nc -l 1111

Test system:
    nc source 1111 > /dev/null

2) With flow control enabled and header splitting off, flood the server
with traffic while userland processing is very slow.

Traffic source:
    for I in 1 2 3 4 5 6 7 8; do ( dd if=/dev/zero bs=8000 | nc -l 1111$I & ); done

Test system (eight receivers, one per port):
    for I in 1 2 3 4 5 6 7 8; do ( nc source 1111$I | throttle -k 1 > /dev/null & ); done

On our systems this will reliably trigger denied 9k allocations.

3) With flow control enabled and header splitting on, flood the server
with very small frames (200 bytes), using the same test as in case 1.

My aim is to tune the watermark here so that no frames are dropped
due to BD shortages.

I am under the impression that the best solution is to tune the RX ring
so that flow control can be disabled, but I am not sure I can do this.

>> My next question is, is it possible to increase the size of the RX ring
>> without switching to RSS?
>>
> I have a change I've been working on to allow RX/TX ring size
> to be adjusted through a sysctl. Let me pretty it up a bit and
> send it to you for test. You should be able to adjust the ring
> size without enabling RSS.
>

If you can provide a patch I have hardware available to test on.

Thanks

Tom

-- 
TJU13-ARIN
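
For anyone reproducing the tests above, a minimal sketch of a monitor that could run on the test system while traffic is flowing. It is not part of the original message: the function names denied_9k and watch_bce are hypothetical, the unit number and polling interval are placeholders, and it assumes that `netstat -m` prints a line of the form "N/N/N requests for jumbo clusters denied (4k/9k/16k)". The counter name dev.bce.X.com_no_buffers is the one mentioned in test 1.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -m` output on stdin and print the 9k
# field of the "requests for jumbo clusters denied (4k/9k/16k)" line.
# Assumes the counts are the first whitespace-separated field, as "a/b/c".
denied_9k() {
    awk '/requests for jumbo clusters denied/ { split($1, f, "/"); print f[2] }'
}

# Hypothetical polling loop: print the firmware drop counter and the denied
# 9k allocation count every few seconds. Arguments: bce unit (default 0),
# interval in seconds (default 5). Stop it with Ctrl-C.
watch_bce() {
    unit="${1:-0}"
    interval="${2:-5}"
    while :; do
        drops=$(sysctl -n "dev.bce.${unit}.com_no_buffers")
        denied=$(netstat -m | denied_9k)
        echo "$(date +%T) com_no_buffers=${drops} denied_9k=${denied}"
        sleep "$interval"
    done
}
```

Sourcing the file and running `watch_bce 0 5` during test 1 or test 2 would show whether com_no_buffers and the denied 9k count climb together or independently, which may help separate BD shortages from allocation failures.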