Date:      Wed, 4 Dec 96 13:47:57 +0100
From:      Martin Cracauer <cracauer@wavehh.hanse.de>
To:        Terry Lambert <terry@lambert.org>
Cc:        julian@whistle.com (Julian Elischer), cracauer@wavehh.hanse.de, nawaz921@cs.uidaho.EDU, freebsd-hackers@freebsd.org
Subject:   Re: clone()/rfork()/threads (Re: Inferno for FreeBSD)
Message-ID:  <9612041247.AA21652@wavehh.hanse.de>
In-Reply-To: <199612021933.MAA11060@phaeton.artisoft.com>
References:  <32A27CB2.59E2B600@whistle.com> <199612021933.MAA11060@phaeton.artisoft.com>

Terry Lambert writes:
> > > The additional options are needed to produce a POSIX-compatible thread
> > > interface that has no userlevel threads anymore. Linus claims Linux
> > > syscalls are fast enough to be acceptable even in applications with
> > > heavy use of locking (and therefore rescheduling by the kernel).
> > 
> > He might be correct.
> > sharing memory spaces makes for a much smaller context switch.
> 
> Assuming you switch from one context in a thread group to another.

> In which case, it is possible for a threaded process to starve all
> other processes, depending on if its resource requests are satisfied
> before all the remaining threads in the thread group have also made
> blocking requests (otherwise, you are not prioritizing by being in
> the thread group, and there are virtually no context switch overhead
> wins from doing the threading -- only the win that a single process
> can compete for N quantums instead of 1 quantum when there are N
> kernel threads in the thread group).

If I understand you right, you say that the scheduler (the one in the
kernel; we don't have a userlevel scheduler in this model) must be
changed to prefer switching to a thread/process that has the same VM
bits. Otherwise, the cases where the VM space stays the same across
the switch are too infrequent, and the theoretical advantage of
threads, the faster context switch, would be lost in practice.

Your concern is that this required change could lead to situations
where one thread group takes more resources than intended. Why is
that? Why can't the kernel keep track of the resources spent on a
process/thread group and act on that basis?

In any case, changing the scheduler to favor thread switches over
process switches complicates things enough that the implementation
effort advantage of a kernel-only thread solution is at least
partially lost.

> A good thread scheduler requires async system calls (not just I/O)
> and conversion of blocking calls to non-blocking calls plus a context

It does? [You probably know the following, but for others.]

None of the thread implementations I know details of, apart from pure
userlevel ones, changes system calls into non-blocking equivalents.

These systems either have one kernel thread for each userlevel thread
(Win32) or manage some communication between the kernel and the
userlevel thread scheduler to make sure the process doesn't stall
when all threads are blocking.

The Solaris kernel informs the userlevel thread scheduler when the
last kernel thread blocks, and the scheduler creates more kernel
threads. The programmer should plan ahead and set the number of
initial kernel threads high enough.

In Digital Unix 4.0, userlevel threads are assigned anew to a pool of
kernel threads each time they are scheduled. The kernel reports each
blocking syscall to the userlevel scheduler, which immediately
schedules another userlevel thread onto that kernel thread.

The people I have listened to so far were all convinced that turning
all system calls into non-blocking versions will lead to serious
implementation difficulties, especially if you take further changes
into account (those would have to be made in two versions of the
library). Another concern is that most existing async I/O interfaces
don't work reliably.

I still like the simplicity of a kernel-only thread solution. If that
way turns out to be too inefficient, the DEC way seems to be a
solution that doesn't need async system calls and has no efficiency
disadvantage I can see (compared to a system with async syscalls
only).

I hope to get further details on the DEC implementation.

> switch in user space (quasi-cooperative scheduling, like SunOS 4.1.3
> liblwp).  This would result in a kernel thread consuming its full
> quantum, potentially on several threads, before going on.  One of

I still don't know why we can't make the kernel keep track of the
timeslices spent on thread groups and schedule on that basis.

> the consequences of this is that sleep intervals below the quantum
> interval, which will work now, without a high degree of reliability,
> will now be guaranteed to *not* work at all.  Timing on most X games
> using a select() with a timeout to run background processing, for
> instance, will fail on systems that use this, unless a kernel preempt
> (a "real time" interrupt) is generated as a result of time expiration,
> causing the select()-held process to run at the time the event occurs,
> instead of simply scheduling the process to run.  This leads to buzz
> loop starvation unless you limit the number of times in an interval
> that you allow a process to preempt (ie: drop virtual priority on
> a process each time it preempts this way, and rest on quantum interval).

Another reason why I'd like to have only one scheduler (in the kernel).

Martin
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer <cracauer@wavehh.hanse.de>
http://cracauer.cons.org
Fax +49 40 522 85 36 


