Date: Fri, 14 Jan 2011 20:28:40 +1100 (EST)
From: Bruce Evans
To: sthaug@nethelp.no
Cc: freebsd-net@FreeBSD.org
Subject: Re: igb watchdog timeouts
Message-ID: <20110114195049.I28551@besplex.bde.org>
In-Reply-To: <20110114.093936.74681829.sthaug@nethelp.no>

On Fri, 14 Jan 2011 sthaug@nethelp.no wrote:

>>> They have enough buffers (128 for each of tx and rx IIRC). The only thing
>>> polling mode gave for them was lower latency, but this cost enabling
>>> polling in the idle loop, which wastes 100% of at least 1 CPU and some
>>> power.
>>> Without polling in idle, polling gives very high latency (even
>>> worse than low-quality interrupt moderation does).
>>
>> Sure-- there are circumstances where a machine would always have
>> traffic to process, for which idle polling was beneficial to enable.
>
> I have a couple of servers with Broadcom (bge) GigE interfaces. These
> servers became completely unresponsive/unusable at high network traffic
> (presumably due to the interrupt processing) but were able to handle the
> same traffic with no problems after switching to polling. This was in
> the 7.0 timeframe.
>
> I still have the same servers/interfaces running with polling, but now
> at 7.3.

I had the opposite experience with a Broadcom 5701 (old but not low-end
bge for PCI-X on PCI-33). Some mostly-uncommitted work on its interrupt
handling improved its latency from 100 us to 50 us (average for best
case) and its throughput from 240000 to 640000 tiny packets/second.
-current should be at least 2/3 as good. Its polling mode saturated at
400000 tiny packets/second for tx with poll-in-idle and at about half
that without. Rx started dropping packets at about the same thresholds
at which tx saturated.

The reasons for more than 240 kpps not working with polling are
especially easy to understand for rx. Most bge's have a 512-entry rx
ring, and FreeBSD bge only enables 256 entries in it. Poll this at
1 kHz and you are sure to drop packets above 256 kpps, and likely to
drop them above 240 kpps. Higher polling rates and polling in idle
allow higher packet rates without loss, but for some reason polling
saturates before interrupt handling does. Since I don't believe in
polling, I didn't try to fix this (except to use the whole rx ring). I
think polling consumes resources which are better used for doing work.
Always CPU resources, but here also NIC resources.

This bge works even better with larger packets (1500 bytes; not so
good with 9000).
Interrupt load is significant at 640 kpps but not at the 81 kpps which
is the maximum for 1500-byte packets.

OTOH, with a Broadcom 5705+ (not so old but low-end bge for PCI-33),
interrupt moderation (host coalescing in bge-speak) is broken. It
interrupts immediately (?) once per packet despite accepting the
programming to interrupt once per many packets or after many
microseconds. This results in about 1/6 of the performance of the 5701
(1/3 of the throughput while saturating twice as many CPUs). Polling
might help, but I tried it even less on this NIC than on the other.

Bruce
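
For reference, the arithmetic behind the 256 kpps and 81 kpps figures can be sketched roughly as follows. This is only a back-of-the-envelope sketch, not driver code: the function names are made up for illustration, and the framing overhead (8 bytes preamble/SFD, 14 bytes header, 4 bytes FCS, 12 bytes inter-frame gap) and the 1 Gbit/s link speed are assumptions based on standard Ethernet rather than anything stated in the message.

```python
# Back-of-the-envelope numbers only; names are hypothetical.

# Assumed per-frame overhead on the wire for standard Ethernet:
# preamble+SFD (8) + header (14) + FCS (4) + inter-frame gap (12).
ETH_OVERHEAD_BYTES = 8 + 14 + 4 + 12


def max_lossless_pps(ring_entries: int, poll_hz: int) -> int:
    """Upper bound on the rx rate a polled ring can absorb without drops.

    At most `ring_entries` packets can be buffered between two polls,
    so the sustained rate cannot exceed ring_entries * poll_hz.
    """
    return ring_entries * poll_hz


def line_rate_pps(payload_bytes: int, link_bps: int = 1_000_000_000) -> int:
    """Whole packets per second that fit on the link at this payload size."""
    wire_bits = (payload_bytes + ETH_OVERHEAD_BYTES) * 8
    return link_bps // wire_bits


print(max_lossless_pps(256, 1000))  # 256000: half the ring, polled at 1 kHz
print(max_lossless_pps(512, 1000))  # 512000: whole ring, polled at 1 kHz
print(line_rate_pps(1500))          # 81274: roughly the 81 kpps quoted above
```

The ring bound is optimistic, since it assumes every poll drains the ring completely and on time; any jitter in the polling interval pushes the practical limit lower, which fits drops becoming likely a little below 256 kpps.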