From owner-freebsd-net@FreeBSD.ORG  Wed Oct 12 20:36:02 2005
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
X-Original-To: net@FreeBSD.org
Delivered-To: freebsd-net@FreeBSD.ORG
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E9F9D16A41F
	for <net@FreeBSD.org>; Wed, 12 Oct 2005 20:36:02 +0000 (GMT)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9C5B943D45
	for <net@FreeBSD.org>; Wed, 12 Oct 2005 20:36:02 +0000 (GMT)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [209.31.154.41])
	by cyrus.watson.org (Postfix) with ESMTP id 765DA46BDA;
	Wed, 12 Oct 2005 16:36:01 -0400 (EDT)
Date: Wed, 12 Oct 2005 21:36:01 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Andrew Gallatin <gallatin@cs.duke.edu>
In-Reply-To: <17229.29164.891534.200216@grasshopper.cs.duke.edu>
Message-ID: <20051012212915.E66014@fledge.watson.org>
References: <20051008143854.B84936@fledge.watson.org>
	<17229.29164.891534.200216@grasshopper.cs.duke.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: net@FreeBSD.org
Subject: Re: Call for performance evaluation: net.isr.direct (fwd)
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Oct 2005 20:36:03 -0000


On Wed, 12 Oct 2005, Andrew Gallatin wrote:

> Speaking of net.isr, is there any reason why if_simloop() calls 
> netisr_queue() rather than netisr_dispatch()?

Yes -- it's basically to prevent recursion for loopback traffic, which can 
result in both lock order problems and general concerns regarding 
reentrance.  To be specific: if you send a packet on a loopback TCP socket, 
it gets processed asynchronously in the netisr rather than immediately 
walking back into the TCP code again.  Right now WITNESS would warn about 
that recursion, but there were also quite bad things that could happen 
before we did the locking work -- for example, when connections are torn 
down.  It also avoids Really Deep Stacks.
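
A minimal user-space sketch of the difference, for illustration only.  
Nothing here is kernel code: struct loop_sim, sim_input(), 
sim_loop_output(), sim_drain() and sim_run() are hypothetical names, and 
the "packet" is just a counter of how many replies are still to be sent.  
With use_queue == 0 the output path calls straight back into the input 
path (roughly what netisr_dispatch() would do here), so every reply nests 
one level deeper.  With use_queue == 1 the packet is queued and handled 
later by a separate drain loop (roughly the netisr_queue() case), so the 
nesting depth stays at one.

```c
#include <stddef.h>

#define QUEUE_CAP 64

/* Toy model of a loopback interface; all names are hypothetical. */
struct loop_sim {
	int depth;		/* current nesting of input processing */
	int max_depth;		/* deepest nesting observed so far */
	int delivered;		/* packets fully processed */
	int queue[QUEUE_CAP];	/* deferred packets (queued mode only) */
	size_t head, tail;	/* ring buffer indices */
	int use_queue;		/* 1: queue for later, 0: dispatch directly */
};

static void sim_input(struct loop_sim *s, int replies_left);

/* Loopback "output": hand the packet back to the input path. */
static void
sim_loop_output(struct loop_sim *s, int replies_left)
{
	if (s->use_queue) {
		/* Deferred: remember the packet; drop it if the queue is full. */
		if (s->tail - s->head < QUEUE_CAP)
			s->queue[s->tail++ % QUEUE_CAP] = replies_left;
	} else {
		/* Direct: re-enter the input path on the caller's stack. */
		sim_input(s, replies_left);
	}
}

/* Protocol "input": process one packet and possibly send a reply. */
static void
sim_input(struct loop_sim *s, int replies_left)
{
	s->depth++;
	if (s->depth > s->max_depth)
		s->max_depth = s->depth;
	s->delivered++;
	if (replies_left > 0)
		sim_loop_output(s, replies_left - 1);
	s->depth--;
}

/* Stand-in for the netisr thread: process queued packets one at a time. */
static void
sim_drain(struct loop_sim *s)
{
	while (s->head != s->tail)
		sim_input(s, s->queue[s->head++ % QUEUE_CAP]);
}

/* Send one packet that triggers `exchanges` replies; return max nesting. */
int
sim_run(int use_queue, int exchanges)
{
	struct loop_sim s = { 0 };

	s.use_queue = use_queue;
	sim_loop_output(&s, exchanges);
	sim_drain(&s);
	return (s.max_depth);
}
```

In this model sim_run(0, 5) nests six levels deep, while sim_run(1, 5) 
never goes deeper than one.  In the real stack each nested level could 
also be entered with protocol locks still held, which is where the lock 
order and reentrance concerns come from.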

At some point, someone needs to look at some scheduler traces and make 
sure we're not seeing anything silly like the following:

- Socket output delivers to TCP, which outputs to loopback, which inserts
   the packet into the netisr queue, waking up the netisr thread.

- The netisr, running at a numerically lower (i.e. more urgent) priority,
   preempts the running thread, which may still hold TCP locks, causing
   the netisr to block on the lock and yield back to the user thread,
   which will now run briefly with numerically depressed priority due to
   priority propagation.

I.e., it may be that we're taking untimely context switches on UP 
(uniprocessor) systems for loopback traffic.  I've not actually seen this, 
but we should check the traces to make sure it isn't happening.
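
To make the worry concrete, here is a toy count of context switches for 
a single loopback send on one CPU.  It is a back-of-the-envelope model of 
the sequence described above, not a measurement and not scheduler code: 
enum wake_policy and count_switches() are hypothetical names, and the 
model assumes the sender holds the TCP lock at the moment it wakes the 
netisr.

```c
#include <stdbool.h>

/* Hypothetical policies for when the woken netisr gets the CPU. */
enum wake_policy {
	PREEMPT_ON_WAKEUP,	/* netisr runs as soon as it is woken */
	DEFER_UNTIL_UNLOCK	/* netisr runs only after the lock is dropped */
};

/*
 * Count context switches for one loopback send on a single CPU,
 * assuming the sender holds the TCP lock when it wakes the netisr.
 */
int
count_switches(enum wake_policy policy)
{
	int switches = 0;
	bool lock_held = true;	/* sender still owns the TCP lock */

	if (policy == PREEMPT_ON_WAKEUP) {
		switches++;		/* sender -> netisr (preemption) */
		if (lock_held) {
			switches++;	/* netisr blocks on lock -> sender */
			lock_held = false;	/* sender finishes, unlocks */
			switches++;	/* sender -> netisr, lock now free */
		}
	} else {
		lock_held = false;	/* sender finishes and unlocks first */
		switches++;		/* sender -> netisr */
	}
	return (switches);
}
```

Under these assumptions the preempting case costs three switches where 
one would do, which is the kind of pattern the scheduler traces should 
either confirm or rule out.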

Robert N M Watson