From owner-freebsd-current@FreeBSD.ORG  Sat Oct 28 19:47:09 2006
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
X-Original-To: freebsd-current@freebsd.org
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9682716A407;
	Sat, 28 Oct 2006 19:47:09 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A0F3B43D46;
	Sat, 28 Oct 2006 19:47:06 +0000 (GMT)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [209.31.154.41])
	by cyrus.watson.org (Postfix) with ESMTP id B06EC46C7D;
	Sat, 28 Oct 2006 15:47:05 -0400 (EDT)
Date: Sat, 28 Oct 2006 20:47:05 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Paul Allen <nospam@ugcs.caltech.edu>
In-Reply-To: <20061028194125.GL30707@riyal.ugcs.caltech.edu>
Message-ID: <20061028204357.A83519@fledge.watson.org>
References: <45425D92.8060205@elischer.org>
	<200610281132.21466.davidxu@freebsd.org>
	<20061028105454.S69980@fledge.watson.org>
	<20061028194125.GL30707@riyal.ugcs.caltech.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-current@freebsd.org, David Xu <davidxu@freebsd.org>,
	Julian Elischer <julian@elischer.org>
Subject: Re: Comments on the  KSE option
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 28 Oct 2006 19:47:09 -0000


On Sat, 28 Oct 2006, Paul Allen wrote:

> From Robert Watson <rwatson@freebsd.org>, Sat, Oct 28, 2006 at 11:04:48AM +0100:
>> This is my single biggest concern: our scheduling, thread/process, and 
>> context management paths in the kernel are currently extremely complex. 
>> This has a number of impacts: it makes it extremely hard to read and 
>> understand, it adds significant overhead, and it makes it quite hard to 
>> modify and optimize for increasing numbers of processors.  We need to be 
>> planning on a world of 128 hardware threads/machine on commodity server 
>> hardware in the immediate future, which means that the current "giant 
>> sched_lock" cannot continue much longer. Kip's prototypes of breaking out 
>> sched_lock as part of the sun4v work have been able to benefit 
>> significantly from the reduced complexity of a KSE-free kernel, and it's 
>> fairly clear that the task of improving scheduler scalability is 
>> dramatically simpler when the kernel model for threading is simpler. 
>> Regardless of where the specific NO_KSE option in the kernel goes, reducing 
>> kernel scheduler/etc complexity should be a first order of business, 
>> because effective SMP work really depends on that happening.
>
> Let us suppose that this M:N business is important, perhaps something to 
> consider is why and whether the kernel has so much knowledge of it.
>
> If I read Matt Dillon's comment closely enough, I believe his precise 
> recommendation was not "something like kse as Julian read it" but rather 
> something where this M:N component was entirely part of the userland 
> threading support and therefore would just go away or not depending on which 
> library you linked with.
>
> I think posix might require a global priority space though...
>
> Anyways it remains dubious in my mind that the kernel should allow a user to 
> create many processes but penalize creating threads.
>
> The only reason I can think of is that you expect people to be sloppy with 
> their threads and careful with their processes.
>
> Still if I am ray-tracing why should I need to make a point of picking my 
> thread/process balance to get around your mechanism.  If fairness is the 
> goal why am I even allowed to do so?

I think the notion of fairness is orthogonal to M:N threading.  M:N is about 
efficiently representing user threads to the kernel, as well as avoiding 
kernel involvement in user context switches when not needed.  Fairness is 
about how the kernel allocates time slices to user processes/threads. 
Fairness can be implemented for both 1:1 and M:N, with the primary differences 
being in bookkeeping.
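
To make the bookkeeping point concrete, here is a rough sketch of two 
accounting policies a scheduler could apply regardless of the threading 
model.  It is not FreeBSD scheduler code: the function names and the workload 
are hypothetical, and real schedulers also weigh priority and sleep time.

```python
from fractions import Fraction

def shares_per_thread(procs):
    """Weight every runnable thread equally.

    A process's CPU share then grows with its thread count.
    """
    total_threads = sum(procs.values())
    return {name: Fraction(n, total_threads) for name, n in procs.items()}

def shares_per_process(procs):
    """Weight every process equally.

    The threads inside a process split that process's share, so adding
    threads does not increase the process's total.
    """
    return {name: Fraction(1, len(procs)) for name in procs}

# Hypothetical workload: process name -> number of runnable threads.
procs = {"raytracer": 8, "editor": 1, "compiler": 1}

per_thread = shares_per_thread(procs)
per_process = shares_per_process(procs)

for name in procs:
    print(name, per_thread[name], per_process[name])
```

Under the first policy the eight-thread process receives most of the CPU; 
under the second it receives the same share as each single-threaded process. 
Either policy can sit behind a 1:1 or an M:N implementation; what changes is 
where the accounting is kept.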

Programmers often use threading based on a misunderstanding about performance. 
Threading actually increases kernel lock contention when compared with using 
processes for parallelism, so if the benefits from address space sharing don't 
outweigh the added costs of contention, threaded applications can run 
significantly slower than multi-process ones.  Many programmers believe that 
threading is necessarily faster than using multiple processes, so I think 
we're fundamentally forced to deal with the way they do use threads, rather 
than how they should use them.

Robert N M Watson
Computer Laboratory
University of Cambridge