From owner-freebsd-net@FreeBSD.ORG Thu Jun 15 11:04:35 2006
Date: Thu, 15 Jun 2006 12:04:33 +0100 (BST)
From: Robert Watson
To: John Polstra
Cc: freebsd-net@freebsd.org
Message-ID: <20060615115738.J2512@fledge.watson.org>
Subject: Re: IF_HANDOFF vs. IFQ_HANDOFF
List-Id: Networking and TCP/IP with FreeBSD

On Wed, 14 Jun 2006, John Polstra wrote:

> Can somebody explain why there is both an IF_HANDOFF macro and an
> IFQ_HANDOFF macro?  Except for a slight difference in parameters, they
> both seem to do roughly the same thing using completely distinct blocks
> of code.  Is IF_HANDOFF supposed to be used only when the target queue is
> not the interface's if_snd queue?  That seems likely, but a few pieces of
> code (e.g., ng_fec.c) use IF_HANDOFF to put a packet into the if_snd
> queue.

In essence, yes.  One is used when the default interface queue is wanted,
and the other when a specific queue is desired.
However, the names are, in fact, backwards from what you would expect, for
historical reasons.  You'll notice, also, that the behavior of an interface
queue in the presence of altq is quite different from the non-altq case,
also for historical reasons, and that the handoff routines are sometimes
used without any ifnet at all (e.g., the netisr handoff).  I've looked at
cleaning this up a few times, as well as a potential race in the use of the
IFF_OACTIVE flag, but have gotten side-tracked in the socket and protocol
code for the last 4-6 months.  I hope to get back to it later this summer,
but if someone else gets there first, that wouldn't be such a problem. :-)

On a number of occasions, we've discussed moving towards an if_startmbuf()
call, which pushes the selection of queueing logic into the network device
driver.  That is increasingly desirable as network interfaces support
multiple queues and significant queueing in hardware, and where we have
pseudo-interfaces wrapped around real ones and want to avoid multiple
enqueue operations (i.e., if_vlan).

A related question is whether or not the generic queueing primitives should
provide implicit locking (as they do now), and whether coalescing that
locking with existing driver locking makes sense.  For example, in the case
where oactive isn't currently set, we actually perform six lock operations
during mbuf handoff: lock/unlock the queue for the enqueue, then on entering
the device driver, lock the driver mutex, lock/unlock the queue for the
dequeue, and unlock the driver mutex.  In the case where oactive is set
(i.e., there's queued output), which is frequently the case under load, this
becomes significantly more efficient, because having a separate queue mutex
avoids contention on the driver mutex during send if the driver is busy,
etc.  Evaluating the benefits of the additional granularity there is
something that has never been done thoroughly, though.
In short, it's a bit of a mess, but it's non-trivial to fix beyond some
basics, because it will require us to figure out where we want to go with
interface queueing and handoff.  Moving to if_startmbuf is relatively easy
(I've prototyped it at least once, and have a relatively recent version in
p4 somewhere), as you can migrate drivers gradually.  On the other hand,
without a clear message to driver writers about how we want them to do
queueing in the general case, this also comes with risks.

Robert N M Watson
Computer Laboratory
University of Cambridge