Date: Sat, 28 Oct 2006 12:41:25 -0700
From: Paul Allen
To: Robert Watson
Cc: freebsd-current@freebsd.org, David Xu, Julian Elischer
Subject: Re: Comments on the KSE option
Message-ID: <20061028194125.GL30707@riyal.ugcs.caltech.edu>
In-Reply-To: <20061028105454.S69980@fledge.watson.org>
References: <45425D92.8060205@elischer.org> <200610281132.21466.davidxu@freebsd.org> <20061028105454.S69980@fledge.watson.org>

From Robert Watson, Sat, Oct 28, 2006 at 11:04:48AM +0100:
> This is my single biggest concern: our scheduling, thread/process, and
> context management paths in the kernel are currently extremely complex.
> This has a number of impacts: it makes it extremely hard to read and
> understand, it adds significant overhead, and it makes it quite hard to
> modify and optimize for increasing numbers of processors.
> We need to be planning on a world of 128 hardware threads/machine on
> commodity server hardware in the immediate future, which means that the
> current "giant sched_lock" cannot continue much longer. Kip's prototypes
> of breaking out sched_lock as part of the sun4v work have been able to
> benefit significantly from the reduced complexity of a KSE-free kernel,
> and it's fairly clear that the task of improving schedule scalability is
> dramatically simpler when the kernel model for threading is more simple.
> Regardless of where the specific NO_KSE option in the kernel goes,
> reducing kernel scheduler/etc complexity should be a first order of
> business, because effective SMP work really depends on that happening.

Let us suppose that this M:N business is important. Perhaps something
to consider is whether, and why, the kernel needs so much knowledge of
it.

If I read Matt Dillon's comment closely enough, I believe his precise
recommendation was not "something like KSE", as Julian read it, but
rather something where this M:N component was entirely part of the
userland threading support and therefore would just go away or not,
depending on which library you linked with.

I think POSIX might require a global priority space, though...

Anyway, it remains dubious in my mind that the kernel should allow a
user to create many processes but penalize creating threads. The only
reason I can think of is that you expect people to be sloppy with their
threads and careful with their processes.

Still, if I am ray-tracing, why should I need to make a point of picking
my thread/process balance to get around your mechanism? And if fairness
is the goal, why am I even allowed to do so?