From owner-freebsd-hackers Thu Jul 2 16:16:41 1998
From: Terry Lambert
Message-Id: <199807022316.QAA12148@usr09.primenet.com>
Subject: Re: pthreads
To: rotel@indigo.ie
Date: Thu, 2 Jul 1998 23:16:21 +0000 (GMT)
Cc: tlambert@primenet.com, jabley@clear.co.nz, freebsd-hackers@FreeBSD.ORG
In-Reply-To: <199807021321.OAA00589@indigo.ie> from "Niall Smart" at Jul 2, 98 02:21:53 pm

> > John Birrell rewrote lots of it in -current, with an eye toward
> > bringing the code up to the Draft 10 standard (the ratified
> > standard), and he and John Dyson did a lot of work to support a
> > kernel implementation, also in -current, using rfork() and some
> > rather complicated stack management.
>
> This is basically sharing a number of kernel processes among a set
> of threads, right?  Do you know if any progress was made towards
> an LWP scheme?  If John Dyson's async I/O code is in place, that
> would help a lot in that area, I think.

The async I/O is easily used to implement a call conversion
scheduler (a sketch follows below); it's not much help in an LWP
scheme (in the Solaris, not the SunOS, sense of LWP).  What it buys
you is overlapped I/O, which you don't really get with the current
pthreads implementation (it's more of a "just-in-time" I/O).

Far better than simple async I/O would be an async call gate.  This
would let you make blocking calls that were unrelated to I/O in an
async fashion as well (for example, acquisition of a semaphore).
Alas, according to POSIX, async I/O is the future (though it could
be implemented on top of an async call gate in a library, and then
ignored).

The rfork()-based kernel threading is for SMP scalability.  It is
generally limited to one kernel thread per user space thread.  This
can be multiplexed down to N kernel threads for M user space
threads, M > N, in two ways.  The first is to allow only N blocking
calls to be outstanding, and to starve those threads that are ready
to run but are waiting on a kernel scheduling context (a kernel
thread) in which to run.  The second is to create a new kernel
thread when the blocking threshold is exceeded (generally by
cooperative scheduling: the kernel signals a user space scheduler
thread, which wakes up and spawns a kernel thread to add to the
thread group).

Both of these approaches have problems, but the second has the
highest scalability without starvation of its own threads (though
it can't be throttled in its competition for quantum without some
hard limit that turns it into the first approach when the limit is
enforced).
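For concreteness, here is roughly what the thread-spawning step of
that second approach looks like.  This is a sketch, not working
code: rfork() with RFMEM returns in the child on the parent's
stack, which is exactly the "complicated stack management" problem
mentioned above, so I'm hiding it behind a hypothetical helper,
rfork_thread(), that does the stack-switching glue and calls an
entry point on a fresh stack.

/*
 * Sketch: grow the kernel thread group when every kernel thread
 * is tied up in a blocking call and runnable user threads are
 * starving.  NOT working code; rfork_thread() is an assumed
 * helper, and the library's bookkeeping is elided.
 */
#include <sys/types.h>
#include <stdlib.h>
#include <unistd.h>		/* RFPROC, RFMEM on FreeBSD */

#define	STACK_SIZE	(64 * 1024)

/* Assumed helper: rfork() the caller onto its own stack. */
extern pid_t	rfork_thread(int flags, void *stack_top,
		    int (*fn)(void *), void *arg);

extern int	user_scheduler(void *);	/* runs user threads */

static int	nkthreads;	/* kernel threads in the group */
static int	nblocked;	/* user threads blocked in kernel */

void
maybe_grow_thread_group(void)
{
	char *stk;

	if (nblocked < nkthreads)
		return;		/* a kernel thread is still free */
	stk = malloc(STACK_SIZE);
	if (stk == NULL)
		return;
	/* New kernel schedulable entity sharing the address space;
	   the stack grows down, so pass the top of the block. */
	if (rfork_thread(RFPROC | RFMEM, stk + STACK_SIZE,
	    user_scheduler, NULL) != -1)
		nkthreads++;
}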
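And backing up to the call conversion point above: here is the
shape of how a userland threads library converts a blocking read(2)
into an async I/O request plus a thread switch, so the process
keeps its quantum while the I/O overlaps with computation.  Again a
sketch only; thread_sched() is a stand-in for the library's
internal scheduler, not a real interface.

/*
 * Sketch: call conversion of read(2) using POSIX aio.  The
 * calling user thread waits, but the process does not; another
 * user thread runs on the same quantum while the I/O proceeds.
 */
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

extern void	thread_sched(void);	/* assumed: run another ready thread */

ssize_t
threaded_read(int fd, void *buf, size_t nbytes)
{
	struct aiocb cb;

	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = fd;
	cb.aio_buf = buf;
	cb.aio_nbytes = nbytes;

	if (aio_read(&cb) == -1)
		return (-1);

	/* Instead of blocking the whole process in the kernel,
	   park this user thread and give the remainder of the
	   quantum to another ready thread; re-poll when run. */
	while (aio_error(&cb) == EINPROGRESS)
		thread_sched();

	return (aio_return(&cb));
}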
> > John Dyson did a number of patches for CPU affinity
>
> CPU affinity?  You mean the threading library can pass scheduling
> hints to the kernel for a set of processes?

No.  CPU affinity is for protection of the L1 and L2 cache
contents, by making threads "prefer" one CPU over the other(s).  It
is an important precondition for SMP scaling of multithreaded
applications.  Without it, your effective cache is reduced by the
Nth root of the cache size, for N processors.

Each kernel thread or process is, in the abstract, a kernel
schedulable entity.  You want to minimize the context switching
between kernel schedulable entities.

If you make a blocking call on a kernel thread, that kernel thread
is preempted, and the competition is thrown open to all other
threads/processes to be the next scheduled.  This is far from
optimal.

An optimal implementation would combine an async call gate (meaning
that once you got the quantum, the threaded process would get to
use all of it) with kernel threads for SMP scalability.

There is some merit to kernel threads in terms of saying: there are
a total of E kernel schedulable entities on the system; I want my
threaded process to get N quanta out of every E quanta (N << E).
Basically, this establishes that a threaded process competes as N
processes.

The merit in this approach is very small, however, and is
predicated on two ideas: (1) that a threaded process will be
competing with conventional processes for quantum, and (2) that the
process priority system is not sufficient to make the competition
"fair".  Fairness arguments are always arguments about
transitioning from an old system to a new one.

> Was this threading model an interim measure until someone wrote
> one based on LWP, or intended to be the way that it would always
> be done?

Well, opinions are varied; I've presented mine, above.  If I had to
boil it down to one (long) sentence, I'd say: once the scheduler
gives me a quantum, it's *my* quantum, and I shouldn't be penalized
with context switch overhead and the loss of the remainder of my
partially used quantum just because I want to make a system call.

> There are a number of problems with this approach (outlined in a
> paper called "Scheduler Activations: Effective Kernel Support for
> the User-Level Management of Parallelism"; ask me for a copy if
> you want one), although it is much easier to implement than an
> LWP-based model.

I've read the "activations" paper.  I don't like them; they imply a
message passing architecture.  There are also unaddressed
starvation issues that occur when you are ready to block and some
other thread in your group is ready to run.  Without an overall
accounting of quanta outside your program's virtual machine, there
are problems.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my
present or previous employers.
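P.S.: Since the activations paper came up, here is a sketch of the
message-passing shape I'm objecting to.  Under scheduler
activations, the kernel upcalls into a user space scheduler every
time a thread blocks, unblocks, or is preempted, so every blocking
event becomes a message to the process.  The structure and entry
point names below are invented for illustration; the real interface
is described in the Anderson et al. paper.

/*
 * Sketch of a scheduler-activations style upcall dispatcher.
 * All names are invented; mark_blocked(), mark_runnable(), and
 * run_next_user_thread() stand in for the user level thread
 * library's internals.
 */
struct activation_msg {
	enum { SA_BLOCKED, SA_UNBLOCKED, SA_PREEMPTED } sa_event;
	int	sa_thread_id;	/* user thread the event is about */
};

extern void	mark_blocked(int);
extern void	mark_runnable(int);
extern void	run_next_user_thread(void);	/* never returns */

/*
 * The kernel creates a fresh activation (the vessel for the
 * quantum) and enters the address space here on each event.
 * Note the shape: every blocking system call costs a message
 * plus a trip through this dispatcher, and the quanta consumed
 * here are accounted to nobody in particular -- which is the
 * starvation/accounting complaint above.
 */
void
sa_upcall(struct activation_msg *msg)
{
	switch (msg->sa_event) {
	case SA_BLOCKED:
		mark_blocked(msg->sa_thread_id);
		break;
	case SA_UNBLOCKED:
	case SA_PREEMPTED:
		mark_runnable(msg->sa_thread_id);
		break;
	}
	run_next_user_thread();
}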