From owner-freebsd-net@FreeBSD.ORG Mon Sep 13 15:04:39 2010
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 947AC1065672;
    Mon, 13 Sep 2010 15:04:39 +0000 (UTC)
    (envelope-from tom@tomjudge.com)
Received: from eu1sys200aog110.obsmtp.com (eu1sys200aog110.obsmtp.com [207.126.144.129])
    by mx1.freebsd.org (Postfix) with SMTP id 126548FC25;
    Mon, 13 Sep 2010 15:04:37 +0000 (UTC)
Received: from source ([63.174.175.251])
    by eu1sys200aob110.postini.com ([207.126.147.11]) with SMTP
    ID DSNKTI49g9JBjfe21ffOiFCziqNGiKg4k0id@postini.com;
    Mon, 13 Sep 2010 15:04:38 UTC
Received: from [172.17.10.53] (unknown [172.17.10.53])
    by bbbx3.usdmm.com (Postfix) with ESMTP id 133E3FD01A;
    Mon, 13 Sep 2010 15:04:34 +0000 (UTC)
Message-ID: <4C8E3D79.6090102@tomjudge.com>
Date: Mon, 13 Sep 2010 10:04:25 -0500
From: Tom Judge
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.12) Gecko/20100826 Lightning/1.0b1 Thunderbird/3.0.7
MIME-Version: 1.0
To: pyunyh@gmail.com
References: <4C894A76.5040200@tomjudge.com> <20100910002439.GO7203@michelle.cdnetworks.com>
In-Reply-To: <20100910002439.GO7203@michelle.cdnetworks.com>
X-Enigmail-Version: 1.0.1
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, davidch@broadcom.com, yongari@freebsd.org
Subject: Re: bce(4) - com_no_buffers (Again)
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 13 Sep 2010 15:04:39 -0000

On 09/09/2010 07:24 PM, Pyun YongHyeon wrote:
> On Thu, Sep 09, 2010 at 03:58:30PM -0500, Tom Judge wrote:
>
>> Hi,
>> I am just following up on the thread from March (I think) about this issue.
>>
>> We are seeing this issue on a number of systems running 7.1.
>>
>> The systems in question are all Dell:
>>
>> * R710 R610 R410
>> * PE2950
>>
>> The latter do not show the issue as much as the R series systems.
>>
>> The cards in one of the R610's that I am testing with are:
>>
>> bce0@pci0:1:0:0: class=0x020000 card=0x02361028 chip=0x163914e4
>> rev=0x20 hdr=0x00
>>     vendor = 'Broadcom Corporation'
>>     device = 'NetXtreme II BCM5709 Gigabit Ethernet'
>>     class = network
>>     subclass = ethernet
>>
>> They are connected to Dell PowerConnect 5424 switches.
>>
>> uname -a:
>> FreeBSD bandor.chi-dc.mintel.ad 7.1-RELEASE-p4 FreeBSD 7.1-RELEASE-p4
>> #3: Wed Sep 8 08:19:03 UTC 2010
>> tj@dev-tj-7-1-amd64.chicago.mintel.ad:/usr/obj/usr/src/sys/MINTELv10 amd64
>>
>> We are also using 8192 byte jumbo frames, if_lagg and if_vlan in the
>> configuration (the nics are in promisc as we are currently capturing
>> netflow data on another vlan for diagnostic purposes. ):
>>
>>
>>
>> I have updated the bce driver and the Broadcomm MII driver to the
>> version from stable/7 and am still seeing the issue.
>>
>> This morning I did a test with increasing the RX_PAGES to 8 but the
>> system just hung starting the network. The route command got stuck in a
>> zone state (Sorry can't remember exactly which).
>>
>> The real question is, how do we go about increasing the number of RX
>> BDs? I guess we have to bump more that just RX_PAGES...
>>
>>
>> The cause for us, from what we can see, is the openldap server sending
>> large group search results back to nss_ldap or pam_ldap. When it does
>> this it seems to send each of the 600 results in its own TCP segment
>> creating a small packet storm (600*~100byte PDU's) at the destination
>> host. The kernel then retransmits 2 blocks of 100 results each after
>> SACK kicks in for the data that was dropped by the NIC.
>>
>>
>> Thanks in advance
>>
>> Tom
>>
>>
>>
> FW may drop incoming frames when it does not see available RX
> buffers.
> Increasing number of RX buffers slightly reduce the
> possibility of dropping frames but it wouldn't completely fix it.
> Alternatively driver may tell available RX buffers in the middle
> of RX ring processing instead of giving updated buffers at the end
> of RX processing. This way FW may see available RX buffers while
> driver/upper stack is busy to process received frames. But this may
> introduce coherency issues because the RX ring is shared between
> host and FW. If FreeBSD has way to sync partial region of a DMA
> map, this could be implemented without fear of coherency issue.
> Another way to improve RX performance would be switching to
> multi-RX queue with RSS but that would require a lot of work and I
> had no time to implement it.
>

Does this mean that these cards are going to perform badly? This is
what I gathered from the previous thread.

> BTW, given that you've updated to bce(4)/mii(4) of stable/7, I
> wonder why TX/RX flow controls were not kicked in.
>

The working copy I used for grabbing the upstream source is at r212371.

Last changes for the directories in my working copy:

sys/dev/bce @ 211388
sys/dev/mii @ 212020

I discovered that flow control was disabled on the switches, so I set it
to auto and added a pair of BCE_PRINTF's in the code where it enables
and disables flow control, and now it gets enabled.

Without BCE_JUMBO_HDRSPLIT we see no errors. With it we see a number
of errors; however, the rate seems to be reduced compared to the
previous version of the driver.

Tom

-- 
TJU13-ARIN