Date: Sun, 5 Apr 2009 18:40:18 +0100 (BST)
From: Robert Watson
To: Barney Cordoba
Cc: freebsd-net@freebsd.org, Ivan Voras
Subject: Re: Advice on a multithreaded netisr patch?
In-Reply-To: <285323.31546.qm@web63901.mail.re1.yahoo.com>

On Sun, 5 Apr 2009, Barney Cordoba wrote:

> I'm curious as to your assertion that hardware transmit queues are a big
> win. You're really just loading a transmit ring well ahead of actual
> transmission; there's no need to force a "start" for each packet queued. You
> then have more overhead managing the multiple queues; more memory used,
> more cpu cache needed, more interrupts (perhaps), overhead generating the
> flowid. It seems to me that a more efficient method of transmitting, such as
> offloading the transmit workload to a kernel task, would be more effective
> than using multiple transmit queues. All the source thread has to do is
> queue the packet and get out.

When using multiple cores, we've observed significant contention on the
transmit-side locks protecting a single output queue; when multiple queues are
used, that contention is avoided. The lock only covers the queue, but the
overhead of taking a single high-contention lock twice for every packet
(enqueue, later dequeue) is significant at high pps and with many cores.

> As an aside, why is Kip doing development on a Chelsio card rather than a
> more mainstream product such as Intel or Broadcom that would generate more
> widespread interest?

Because they paid him to write their driver? :-)

Robert N M Watson
Computer Laboratory
University of Cambridge
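
To illustrate the locking point above, here is a minimal user-space sketch. It is not taken from the FreeBSD sources. The names TxQueue, MultiQueueTx, select_queue and run are hypothetical, and the CRC32-based flow hash is only an assumed stand-in for whatever flowid the stack or hardware would supply. Every packet takes its queue's lock twice (enqueue, later dequeue); with one queue all senders share that lock, while with several queues the flow hash spreads senders over independent locks. Python's interpreter lock hides the real cost of contention, so this shows the structure only, not the performance effect.

```python
import threading
import zlib
from collections import deque


class TxQueue:
    """One transmit queue protected by its own lock (hypothetical model)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.packets = deque()
        self.lock_acquisitions = 0  # how many times this queue's lock was taken

    def enqueue(self, pkt):
        with self.lock:
            self.lock_acquisitions += 1
            self.packets.append(pkt)

    def dequeue(self):
        with self.lock:
            self.lock_acquisitions += 1
            return self.packets.popleft() if self.packets else None


class MultiQueueTx:
    """A set of transmit queues; a flow always maps to the same queue."""

    def __init__(self, nqueues):
        self.queues = [TxQueue() for _ in range(nqueues)]

    def select_queue(self, flow):
        # Assumed flow hash: CRC32 over the flow tuple's text form.
        flowid = zlib.crc32(repr(flow).encode())
        return self.queues[flowid % len(self.queues)]

    def enqueue(self, flow, pkt):
        self.select_queue(flow).enqueue(pkt)


def run(nqueues, nthreads=4, pkts_per_thread=1000):
    """Send packets from several threads, then drain every queue.

    Returns (packets drained, per-queue lock acquisition counts).
    """
    tx = MultiQueueTx(nqueues)

    def sender(tid):
        # Each sending thread models one flow.
        flow = ("10.0.0.%d" % tid, 1000 + tid, "10.0.1.1", 80)
        for i in range(pkts_per_thread):
            tx.enqueue(flow, (tid, i))

    threads = [threading.Thread(target=sender, args=(t,)) for t in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    drained = 0
    for q in tx.queues:
        # Senders have finished, so the length is stable here.
        for _ in range(len(q.packets)):
            if q.dequeue() is not None:
                drained += 1
    return drained, [q.lock_acquisitions for q in tx.queues]
```

Calling run(1) funnels every lock acquisition through one lock, while run(4) divides the same total among up to four locks, depending on how the assumed hash distributes the flows.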