From: Bruce Evans
Date: Sun, 16 Oct 2005 14:06:32 +1000 (EST)
To: Robert Watson
Cc: Garrett Wollman, Poul-Henning Kamp, Andrew Gallatin, net@freebsd.org
Subject: Re: Call for performance evaluation: net.isr.direct (fwd)
Message-ID: <20051016135234.T86712@delplex.bde.org>
In-Reply-To: <20051015194738.C66245@fledge.watson.org>

On Sat, 15 Oct 2005, Robert Watson wrote:

> On Sat, 15 Oct 2005, Bruce Evans wrote:
>> ... However, for netisrs I think it is
>> common to process only 1 packet per context switch, at least in the
>> loopback case.
>
> The Mach scheduler allows deferred wakeups to be issued -- "wake up a thread
> in the sleep queue -- but when it's convenient" to avoid premature preemption.
> This helps avoid immediate preemption where it's unnecessary and/or
> undesirable: specifically, to avoid preemption when the preempting thread
> will immediately require a lock held by the signalling thread, as occurs with
> the netisr and TCP.

Hmm, does it still do that? A year or two ago, at least the loopback case
used to run into itself on Giant, and pushing down Giant made things worse
(scheduling the netisr would switch immediately because the netisr was
"MPSAFE", but then the netisr would immediately block on Giant and switch
back, so it took 2 expensive context switches instead of just 1 to get a
netisr to do anything).

> I've not yet investigated tweaking things, or even a
> scheduler trace to see for sure that this is happening, but it wouldn't
> surprise me at all if we're seeing extra context switches due to premature or
> untimely preemption following wakeup.

Probably the problem is largest for latency, especially in benchmarks.
Latency benchmarks probably have to start cold, so they have no chance of
queue lengths > 1, so there must be a context switch per packet, and there
may be 2.

Bruce
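
The arithmetic in the message above can be sketched as a toy model. This is only an illustration, not FreeBSD code: the function name `context_switches` and its parameters `queue_len` and `blocks_on_lock` are hypothetical, and the model assumes that each netisr run drains the whole queue and that a wakeup which preempts into a held lock costs exactly one extra switch.

```python
import math


def context_switches(packets, queue_len, blocks_on_lock):
    """Toy count of context switches needed to process `packets` packets.

    queue_len      -- packets waiting each time the netisr thread runs
                      (hypothetical parameter; 1 models a cold latency test)
    blocks_on_lock -- True if the woken netisr immediately blocks on a lock
                      held by the waker (the Giant case described above)
    """
    if packets < 0 or queue_len < 1:
        raise ValueError("packets must be >= 0 and queue_len must be >= 1")
    # One netisr run handles up to queue_len packets.
    runs = math.ceil(packets / queue_len)
    # Assumption: a run costs 1 switch, or 2 if the thread is switched to,
    # blocks on the lock, and has to be switched to again later.
    switches_per_run = 2 if blocks_on_lock else 1
    return runs * switches_per_run


if __name__ == "__main__":
    for queue_len in (1, 4, 16):
        for blocks in (False, True):
            n = context_switches(1000, queue_len, blocks)
            print(f"queue_len={queue_len:2d} blocks_on_lock={blocks!s:5} "
                  f"switches={n}")
```

With `queue_len=1`, the cold-start latency case, the model gives one switch per packet, or two when the netisr blocks on the lock. Longer queues amortize the switch cost over more packets, which is the effect a deferred wakeup would be trying to obtain.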