From: Pyun YongHyeon
Date: Thu, 9 Sep 2010 17:24:39 -0700
To: Tom Judge
Message-ID: <20100910002439.GO7203@michelle.cdnetworks.com>
References: <4C894A76.5040200@tomjudge.com>
In-Reply-To: <4C894A76.5040200@tomjudge.com>
Cc: freebsd-net@freebsd.org, davidch@broadcom.com, yongari@freebsd.org
Subject: Re: bce(4) - com_no_buffers (Again)
Reply-To: pyunyh@gmail.com

On Thu, Sep 09, 2010 at 03:58:30PM -0500, Tom Judge wrote:
> Hi,
>
> I am just following up on the thread from March (I think) about this issue.
>
> We are seeing this issue on a number of systems running 7.1.
>
> The systems in question are all Dell:
>
> * R710 R610 R410
> * PE2950
>
> The latter do not show the issue as much as the R series systems.
>
> The cards in one of the R610's that I am testing with are:
>
> bce0@pci0:1:0:0: class=0x020000 card=0x02361028 chip=0x163914e4 rev=0x20 hdr=0x00
>     vendor = 'Broadcom Corporation'
>     device = 'NetXtreme II BCM5709 Gigabit Ethernet'
>     class = network
>     subclass = ethernet
>
> They are connected to Dell PowerConnect 5424 switches.
>
> uname -a:
> FreeBSD bandor.chi-dc.mintel.ad 7.1-RELEASE-p4 FreeBSD 7.1-RELEASE-p4
> #3: Wed Sep 8 08:19:03 UTC 2010
> tj@dev-tj-7-1-amd64.chicago.mintel.ad:/usr/obj/usr/src/sys/MINTELv10 amd64
>
> We are also using 8192 byte jumbo frames, if_lagg and if_vlan in the
> configuration (the nics are in promisc as we are currently capturing
> netflow data on another vlan for diagnostic purposes. ):
>
> tj@bandor '20:51:17' '~'
> > $ ifconfig bce0
> bce0: flags=8943 metric 0 mtu 8192
>     options=400bb
>     ether 00:21:9b:95:7a:b8
>     media: Ethernet autoselect (1000baseTX )
>     status: active
>     lagg: laggdev lagg0
> tj@bandor '20:51:22' '~'
> > $ ifconfig bce1
> bce1: flags=8943 metric 0 mtu 8192
>     options=400bb
>     ether 00:21:9b:95:7a:b8
>     media: Ethernet autoselect (1000baseTX )
>     status: active
>     lagg: laggdev lagg0
> tj@bandor '20:51:35' '~'
> > $ ifconfig lagg0
> lagg0: flags=8943 metric 0 mtu 8192
>     options=400bb
>     ether 00:21:9b:95:7a:b8
>     media: Ethernet autoselect
>     status: active
>     laggproto failover
>     laggport: bce1 flags=0<>
>     laggport: bce0 flags=5
> tj@bandor '20:51:40' '~'
> > $ ifconfig vlan2
> vlan2: flags=8943 metric 0 mtu 8192
>     options=3
>     ether 00:21:9b:95:7a:b8
>     inet 172.30.XX.XX netmask 0xfffffe00 broadcast 172.30.XX.XX
>     media: Ethernet autoselect
>     status: active
>     vlan: 2 parent interface: lagg0
>
>
> I have updated the bce driver and the Broadcomm MII driver to the
> version from stable/7 and am still seeing the issue.
>
> This morning I did a test with increasing the RX_PAGES to 8 but the
> system just hung starting the network. The route command got stuck in a
> zone state (Sorry can't remember exactly which).
>
> The real question is, how do we go about increasing the number of RX
> BDs? I guess we have to bump more that just RX_PAGES...
>
>
> The cause for us, from what we can see, is the openldap server sending
> large group search results back to nss_ldap or pam_ldap. When it does
> this it seems to send each of the 600 results in its own TCP segment
> creating a small packet storm (600*~100byte PDU's) at the destination
> host. The kernel then retransmits 2 blocks of 100 results each after
> SACK kicks in for the data that was dropped by the NIC.
>
>
> Thanks in advance
>
> Tom
>
> tj@bandor '20:57:33' '~'
> > $ sysctl -a dev.bce.0
> dev.bce.0.%desc: Broadcom NetXtreme II BCM5709 1000Base-T (C0)
> dev.bce.0.%driver: bce
> dev.bce.0.%location: slot=0 function=0
> dev.bce.0.%pnpinfo: vendor=0x14e4 device=0x1639 subvendor=0x1028 subdevice=0x0236 class=0x020000
> dev.bce.0.%parent: pci1
> dev.bce.0.l2fhdr_error_count: 0
> dev.bce.0.mbuf_alloc_failed_count: 0
> dev.bce.0.mbuf_frag_count: 0
> dev.bce.0.dma_map_addr_rx_failed_count: 0
> dev.bce.0.dma_map_addr_tx_failed_count: 0
> dev.bce.0.unexpected_attention_count: 0
> dev.bce.0.stat_IfHcInOctets: 439779802
> dev.bce.0.stat_IfHCInBadOctets: 0
> dev.bce.0.stat_IfHCOutOctets: 108341440
> dev.bce.0.stat_IfHCOutBadOctets: 0
> dev.bce.0.stat_IfHCInUcastPkts: 2341369
> dev.bce.0.stat_IfHCInMulticastPkts: 26065
> dev.bce.0.stat_IfHCInBroadcastPkts: 9191
> dev.bce.0.stat_IfHCOutUcastPkts: 1230052
> dev.bce.0.stat_IfHCOutMulticastPkts: 2870
> dev.bce.0.stat_IfHCOutBroadcastPkts: 45
> dev.bce.0.stat_emac_tx_stat_dot3statsinternalmactransmiterrors: 0
> dev.bce.0.stat_Dot3StatsCarrierSenseErrors: 0
> dev.bce.0.stat_Dot3StatsFCSErrors: 0
> dev.bce.0.stat_Dot3StatsAlignmentErrors: 0
> dev.bce.0.stat_Dot3StatsSingleCollisionFrames: 0
> dev.bce.0.stat_Dot3StatsMultipleCollisionFrames: 0
> dev.bce.0.stat_Dot3StatsDeferredTransmissions: 0
> dev.bce.0.stat_Dot3StatsExcessiveCollisions: 0
> dev.bce.0.stat_Dot3StatsLateCollisions: 0
> dev.bce.0.stat_EtherStatsCollisions: 0
> dev.bce.0.stat_EtherStatsFragments: 0
> dev.bce.0.stat_EtherStatsJabbers: 0
> dev.bce.0.stat_EtherStatsUndersizePkts: 0
> dev.bce.0.stat_EtherStatsOversizePkts: 0
> dev.bce.0.stat_EtherStatsPktsRx64Octets: 3381
> dev.bce.0.stat_EtherStatsPktsRx65Octetsto127Octets: 98883
> dev.bce.0.stat_EtherStatsPktsRx128Octetsto255Octets: 2255959
> dev.bce.0.stat_EtherStatsPktsRx256Octetsto511Octets: 12508
> dev.bce.0.stat_EtherStatsPktsRx512Octetsto1023Octets: 4247
> dev.bce.0.stat_EtherStatsPktsRx1024Octetsto1522Octets: 522
> dev.bce.0.stat_EtherStatsPktsRx1523Octetsto9022Octets: 1125
> dev.bce.0.stat_EtherStatsPktsTx64Octets: 496
> dev.bce.0.stat_EtherStatsPktsTx65Octetsto127Octets: 1176041
> dev.bce.0.stat_EtherStatsPktsTx128Octetsto255Octets: 29079
> dev.bce.0.stat_EtherStatsPktsTx256Octetsto511Octets: 2933
> dev.bce.0.stat_EtherStatsPktsTx512Octetsto1023Octets: 23898
> dev.bce.0.stat_EtherStatsPktsTx1024Octetsto1522Octets: 234
> dev.bce.0.stat_EtherStatsPktsTx1523Octetsto9022Octets: 286
> dev.bce.0.stat_XonPauseFramesReceived: 0
> dev.bce.0.stat_XoffPauseFramesReceived: 0
> dev.bce.0.stat_OutXonSent: 0
> dev.bce.0.stat_OutXoffSent: 0
> dev.bce.0.stat_FlowControlDone: 0
> dev.bce.0.stat_MacControlFramesReceived: 0
> dev.bce.0.stat_XoffStateEntered: 0
> dev.bce.0.stat_IfInFramesL2FilterDiscards: 0
> dev.bce.0.stat_IfInRuleCheckerDiscards: 0
> dev.bce.0.stat_IfInFTQDiscards: 0
> dev.bce.0.stat_IfInMBUFDiscards: 0
> dev.bce.0.stat_IfInRuleCheckerP4Hit: 35256
> dev.bce.0.stat_CatchupInRuleCheckerDiscards: 0
> dev.bce.0.stat_CatchupInFTQDiscards: 0
> dev.bce.0.stat_CatchupInMBUFDiscards: 0
> dev.bce.0.stat_CatchupInRuleCheckerP4Hit: 0
> dev.bce.0.com_no_buffers: 13021
>

The firmware may drop incoming frames when it does not see any available
RX buffers. Increasing the number of RX buffers slightly reduces the
chance of dropping frames, but it would not completely fix the problem.
Alternatively, the driver could tell the firmware about available RX
buffers in the middle of RX ring processing instead of handing over the
updated buffers at the end of RX processing. That way the firmware would
see available RX buffers while the driver/upper stack is still busy
processing received frames. However, this may introduce coherency issues,
because the RX ring is shared between the host and the firmware. If
FreeBSD had a way to sync a partial region of a DMA map, this could be
implemented without fear of coherency issues.

Another way to improve RX performance would be to switch to multiple RX
queues with RSS, but that would require a lot of work and I have had no
time to implement it.

BTW, given that you've updated to the stable/7 versions of bce(4) and
mii(4), I wonder why TX/RX flow control did not kick in.
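
To make the buffer hand-back idea above a bit more concrete, here is a
rough userland sketch. It is only a toy model, not bce(4) code: the
function simulate_burst() and all of its parameters are made up for
illustration, time is counted in abstract "ticks", and DMA syncing and
coherency (the hard part) are not modeled at all. It just counts how many
frames of a burst get dropped when RX buffers are handed back only at the
end of an RX pass versus every few frames during the pass.

```c
#include <assert.h>

/*
 * Toy model of an RX ring shared by firmware and driver. This is NOT
 * bce(4) code: every name and parameter here is hypothetical, and DMA
 * syncing and coherency are not modeled.
 *
 * refill_batch == 0: buffers are handed back to the firmware only at the
 *                    end of an RX pass (no completed frames left).
 * refill_batch  > 0: buffers are also handed back every refill_batch
 *                    frames while the pass is still running.
 *
 * Returns the number of frames dropped for lack of RX buffers.
 */
unsigned long
simulate_burst(unsigned ring_size, unsigned burst, unsigned arrive_per_tick,
    unsigned process_per_tick, unsigned refill_batch)
{
	unsigned avail = ring_size;	/* buffers the firmware may fill */
	unsigned completed = 0;		/* filled, waiting for the driver */
	unsigned pending = 0;		/* processed, not yet handed back */
	unsigned long drops = 0;
	unsigned i, n;

	/* Both rates must be non-zero or the loop below cannot finish. */
	assert(arrive_per_tick > 0 && process_per_tick > 0);

	while (burst > 0 || completed > 0) {
		/* Firmware side: place arriving frames or drop them. */
		n = burst < arrive_per_tick ? burst : arrive_per_tick;
		burst -= n;
		for (i = 0; i < n; i++) {
			if (avail > 0) {
				avail--;
				completed++;
			} else {
				drops++;
			}
		}

		/* Driver side: process a bounded number of frames per tick. */
		for (i = 0; i < process_per_tick && completed > 0; i++) {
			completed--;
			pending++;
			if (refill_batch != 0 && pending >= refill_batch) {
				avail += pending;	/* mid-pass hand-back */
				pending = 0;
			}
		}

		/* End of the RX pass: hand back whatever is still pending. */
		if (completed == 0) {
			avail += pending;
			pending = 0;
		}
	}
	return (drops);
}
```

With an 8-buffer ring, a 32-frame burst arriving at 4 frames per tick and
a driver that handles 2 frames per tick, simulate_burst(8, 32, 4, 2, 0)
drops 16 frames, while simulate_burst(8, 32, 4, 2, 2) drops 10. So in this
model the mid-pass hand-back helps, but it does not remove the drops when
the burst outruns the driver.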