From owner-cvs-all@FreeBSD.ORG  Mon Jan 29 19:52:23 2007
Return-Path: <owner-cvs-all@FreeBSD.ORG>
X-Original-To: cvs-all@FreeBSD.org
Delivered-To: cvs-all@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 27E7016A404;
	Mon, 29 Jan 2007 19:52:23 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42])
	by mx1.freebsd.org (Postfix) with ESMTP id A39B413C48D;
	Mon, 29 Jan 2007 19:52:22 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [209.31.154.41])
	by cyrus.watson.org (Postfix) with ESMTP id 7630050BF2;
	Mon, 29 Jan 2007 14:52:21 -0500 (EST)
Date: Mon, 29 Jan 2007 19:52:20 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Pawel Jakub Dawidek <pjd@FreeBSD.org>
In-Reply-To: <20070129193205.GE87767@garage.freebsd.pl>
Message-ID: <20070129194158.N32458@fledge.watson.org>
References: <20070128202917.5B67916A5A6@hub.freebsd.org>
	<45BD82D2.20301@root.org>
	<20070129175222.GA87767@garage.freebsd.pl> <45BE37DC.6080509@root.org>
	<20070129184522.GD87767@garage.freebsd.pl>
	<45BE46B7.8000406@samsco.org>
	<20070129193205.GE87767@garage.freebsd.pl>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: cvs-src@FreeBSD.org, Scott Long <scottl@samsco.org>,
	src-committers@FreeBSD.org, cvs-all@FreeBSD.org,
	Nate Lawson <nate@root.org>
Subject: Re: cvs commit: src/sys/geom/eli g_eli.c
X-BeenThere: cvs-all@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: arch@FreeBSD.org
List-Id: CVS commit messages for the entire tree <cvs-all.freebsd.org>


On Mon, 29 Jan 2007, Pawel Jakub Dawidek wrote:

>> Why?  You're proposing yet another intrusive change to the kernel to handle 
>> yet another one-off requirement of your code.  Why not do what I suggested 
>> before with hooking the appropriate SYSINIT in your module? Or why not 
>> follow Robert's suggestion and implement a simple event mechanism so that 
>> any module can know when a CPU has come online or offline.  Heck, you 
>> probably don't even need to implement a new mechanism, just hook the 
>> existing EVENTHANDLER mechanism.  That's what it's designed for!!
>
> I'm afraid Scott that your proposals are hacks. As a GEOM class I should not 
> use SYSINIT, EVENTHANDLER, etc. I shouldn't bother if CPUs are online or 
> not. All events I need to implement a GEOM class I should receive from the 
> infrastructure. Also I shouldn't be called by the infrastructure when the 
> system is not yet ready for my activity, that's why I proposed to implement 
> this functionality in the infrastructure (i.e., delay the GEOM tasting 
> mechanism), rather than hack SYSINITs into every single GEOM class that 
> needs to bind to a CPU.

I guess I'm not sure I entirely agree.  I think that we lack some important 
infrastructure, which we've been talking about on and off for a dev summit or two 
now, for handling the arrival and departure of CPU resources ("dynamic 
reconfiguration").  While once this wasn't really an issue on PC hardware, it 
now is, with the advent of hypervisors, virtualization, not to mention more 
multiprocessing, etc.  We have quite a few algorithms and data structures that 
assume that the set of CPUs is static, and fail quite badly (e.g., memory 
leaks, lost work) if a CPU were to stop scheduling threads.  Geli is not 
alone in wanting to know what and when CPUs are available for concurrent work, 
and like other pieces of code (UMA is the piece I have particular familiarity 
with), finds our infrastructure lacking.  I'm also not entirely convinced I 
agree with you: by the same token that you might claim sysinits and event 
handlers shouldn't be used by GEOM modules, perhaps kthreads should also not 
be used :-).  Sysinits, eventhandlers, and kthreads are all ways of 
scheduling and dispatching work.
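
To make the "static set of CPUs" failure mode concrete, here is a minimal 
userspace sketch in C.  It is not UMA's actual code; the structure and the 
names item_cache, cache_free, cache_alloc and cache_drain_cpu are hypothetical 
and exist only for illustration.  Items freed on a CPU sit in that CPU's cache, 
so if the CPU stops scheduling threads they are stranded unless something 
drains them back to the global pool first.

```c
#include <assert.h>

#define NCPU 4

/* Hypothetical allocator state: a global pool plus one cache per CPU. */
struct item_cache {
	int global_free;        /* items any CPU may allocate */
	int percpu_free[NCPU];  /* items only the owning CPU will find */
};

/* Free an item on the given CPU: it lands in that CPU's private cache. */
static void
cache_free(struct item_cache *c, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	c->percpu_free[cpu]++;
}

/*
 * Allocate on the given CPU: try the local cache, then the global pool.
 * Returns 1 on success, 0 if nothing is reachable from this CPU.
 */
static int
cache_alloc(struct item_cache *c, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	if (c->percpu_free[cpu] > 0) {
		c->percpu_free[cpu]--;
		return (1);
	}
	if (c->global_free > 0) {
		c->global_free--;
		return (1);
	}
	return (0);
}

/* What would have to run before a CPU goes away: return its items. */
static void
cache_drain_cpu(struct item_cache *c, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	c->global_free += c->percpu_free[cpu];
	c->percpu_free[cpu] = 0;
}
```

If two items are freed on CPU 3 and CPU 3 then stops running threads, an 
allocation on CPU 0 fails even though two items are free; after 
cache_drain_cpu(&c, 3) the same allocation succeeds.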

So perhaps we need to start having the conversation about CPU events more 
seriously now.  What do you think of the following idea: two event 
handlers, a CPU start event and a CPU stop event, which are guaranteed to run 
on each CPU as the CPU comes online, and just before the CPU goes offline. 
Kernel subsystems could use these events to determine when CPU resources were 
arriving and departing in some serious sense (not just "busy") in order to 
initialize and tear down per-CPU data structures, rebalance workloads, start 
or stop per-CPU workers, etc.  The example I have in mind here is the network 
stack, which might reasonably wish to have per-CPU netisr (worker) threads. 
When the set of CPUs changes, it would like to increase or decrease the number 
of workers -- having the same number of workers compressed down to a smaller 
number of CPUs by migration would be a disaster for performance.
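
A rough sketch of how a start/stop event pair might be consumed, written as 
plain userspace C so it can be compiled and tried.  Everything here is an 
assumption for illustration: cpu_event_register, cpu_event_fire, the 
CPU_EVENT_* constants and struct worker_pool are hypothetical names, not an 
existing kernel interface, and a real implementation would run the handlers 
on the affected CPU rather than in the caller's context.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS     8
#define MAX_HANDLERS 4

/* Hypothetical event types; not an existing FreeBSD interface. */
enum cpu_event { CPU_EVENT_START, CPU_EVENT_STOP };

typedef void (*cpu_event_fn)(int cpu, void *arg);

struct cpu_handler {
	cpu_event_fn fn;
	void *arg;
};

static struct cpu_handler handlers[2][MAX_HANDLERS];
static size_t nhandlers[2];

/* Register a callback for one event type; 0 on success, -1 if full. */
static int
cpu_event_register(enum cpu_event ev, cpu_event_fn fn, void *arg)
{
	if (nhandlers[ev] == MAX_HANDLERS)
		return (-1);
	handlers[ev][nhandlers[ev]].fn = fn;
	handlers[ev][nhandlers[ev]].arg = arg;
	nhandlers[ev]++;
	return (0);
}

/* Call every handler registered for the event, in registration order. */
static void
cpu_event_fire(enum cpu_event ev, int cpu)
{
	assert(cpu >= 0 && cpu < MAX_CPUS);
	for (size_t i = 0; i < nhandlers[ev]; i++)
		handlers[ev][i].fn(cpu, handlers[ev][i].arg);
}

/* Example subscriber: one worker per online CPU, as in the netisr case. */
struct worker_pool {
	int active[MAX_CPUS];  /* 1 if a worker exists for that CPU */
	int nworkers;
};

/* CPU arrived: add a worker for it unless one already exists. */
static void
pool_cpu_start(int cpu, void *arg)
{
	struct worker_pool *p = arg;

	if (!p->active[cpu]) {
		p->active[cpu] = 1;
		p->nworkers++;
	}
}

/* CPU departing: retire its worker instead of migrating it elsewhere. */
static void
pool_cpu_stop(int cpu, void *arg)
{
	struct worker_pool *p = arg;

	if (p->active[cpu]) {
		p->active[cpu] = 0;
		p->nworkers--;
	}
}
```

A subsystem would register pool_cpu_start and pool_cpu_stop once at 
initialization; firing the start event for CPUs 0 and 1 and then the stop 
event for CPU 1 leaves a single worker, so the worker count tracks the 
CPU count rather than being squeezed onto fewer CPUs.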

How to handle the boot processor is an interesting question -- are we 
interested in configuring away the boot processor at run-time?  If not, we 
probably want to handle it as a special case via sysinit.  If all CPUs are 
equal and any may go away, then we might need to rework our notion of 
shutdown, and provide these same events for the boot CPU (which does sound 
desirable so as not to end up with lots of special casing in subsystems). 
Regardless, we are hardly the first OS to try to address these issues via a 
clean architectural solution, and my thinking is we should do a bit of 
research.  A first place to look would definitely be OpenSolaris.

Robert N M Watson
Computer Laboratory
University of Cambridge