From owner-freebsd-arch  Mon Nov 29  9: 6:31 1999
Date: Mon, 29 Nov 1999 09:05:55 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199911291705.JAA06592@apollo.backplane.com>
To: Nate Williams <nate@mt.sri.com>,
	Julian Elischer <julian@whistle.com>,
	Jason Evans <jasone@canonware.com>,
	"Daniel M. Eischen" <eischen@vigrid.com>, freebsd-arch@freebsd.org
Subject: Re: Threads
References: <19991124220406.X301@sturm.canonware.com>
	<Pine.BSF.4.10.9911250109290.12692-100000@current1.whistle.com>
	<199911291611.JAA19058@mt.sri.com>
	<199911291621.IAA06301@apollo.backplane.com> <199911291629.JAA19154@mt.sri.com>

:>     The terminology I have been using, which I thought was the same as 
:>     Julian's but may not be, is:
:> 
:>     Thread
:> 
:> 	Two entities.  A kernel structure 'Thread' and also a similarly
:> 	named but independent user structure within the UTS.
:
:So far so good.  However, we need to differentiate between a 'Userland'
:thread, and a 'kernel thread' somehow.  Also, how does a Userland thread
:'become' a Kernel thread?  (We need a hybrid of Userland/Kernel threads
:in order for this to be a viable/efficient solution.)

    In my scheme the kernel controls the ultimate scheduling of threads so
    'Thread' is a kernel thread.  The UTS may implement its own thread
    structure for userland in order to tie the kernel Thread into the userland
    libraries, but there would have to be a 1:1 correspondence (since userland
    needs to make a system call to schedule runnability of a thread).

    The idea here is that the kernel Thread structure is very small, holding
    only the userland %esp, a reference to the governing struct proc, and
    a reference to the KSE (which need only be present in certain situations,
    see below).  The kernel will store the userland register state on the
    userland stack instead of in the Thread structure, keeping the Thread
    structure *very* small.  If we take a stack fault and no memory is
    available to allocate a new page, the situation is simply treated the
    same as any other time the kernel blocks while in the supervisor
    (see below).  No special case or extra code is required at all, really.
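
    A minimal sketch of how small such a Thread structure could be.  The
    field and function names here are hypothetical, chosen only for
    illustration; struct proc and struct kse are left opaque.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct proc;   /* governing process, opaque in this sketch */
struct kse;    /* kernel context (kernel stack etc.), opaque in this sketch */

/* Hypothetical layout: three words, nothing else. */
struct thread {
    uintptr_t    td_user_sp;  /* saved userland stack pointer (%esp) */
    struct proc *td_proc;     /* reference to the governing process */
    struct kse  *td_kse;      /* NULL unless running or blocked in the kernel */
};

/* A thread without a KSE holds no kernel context at all; its register
 * state lives on its own user stack, reachable through td_user_sp. */
static int thread_has_kernel_context(const struct thread *td)
{
    return td->td_kse != NULL;
}
```

    On common 32-bit and 64-bit ABIs this comes to three machine words
    per thread.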

    Since the case only occurs under extreme conditions we don't have to
    worry about it except to ensure that we do not enter a low-memory
    deadlock when the case occurs.  It is precisely the ability to block
    in supervisor mode that prevents the low memory deadlock from occurring.

:>     KSE
:> 
:> 	A kernel scheduleable entity.  I was using this to mean the
:> 	contextual information (such as a kernel stack) required for
:> 	the kernel to be able to run a thread.  Not required for 
:> 	runnability, only required to actually run the thread and
:> 	also held over of the thread blocks while in the kernel.
:                       ^^
:if?  Can you expound on this more?  Is this transferrable to another
:'thread' in the kernel?  If so, what is left?  If not, what is the
:'thing' that we are transferring across?

    Yes, the KSE is transferable.  There are two situations in my
    scheme where a KSE is required:

	* The thread is currently running on a physical cpu (not just
	  runnable, but actually *running*; just as you might have
	  dozens of processes in the 'R'un (runnable) state, but on a
	  UP system only one can actually be *running* at a time).

	* The thread has gone to sleep while in the kernel.  i.e. the
	  thread has blocked while in the kernel.

    Taking them one at a time:

    (1) The thread is currently running - a KSE is required in case the
	thread wishes to make a system call, an interrupt occurs, or
	a fault occurs.  The thread is not actually *using* the KSE
	except for those occurrences, so if the thread is switched out
	without the above occurring, the KSE can be reused for the next
	runnable thread that is switched in by the kernel scheduler.

	We CANNOT allocate a KSE for a thread upon entering the supervisor
	without getting into potential deadlock situations, thus the KSE
	must already exist so entry into the supervisor can occur without
	having to allocate memory.

    (2) The thread blocks while in the kernel.  For example, the thread
	makes a synchronous system call such as a read() which blocks.
	The thread utilizes the KSE that was previously idle and now
	blocks.  The kernel cannot migrate that KSE to another thread
	because it is needed to hold state for the current thread.

	Another example: the kernel decides to switch away from a thread
	currently running in usermode (not a system call, or a special
	synchronous give-up-the-cpu call).  The kernel will attempt to save
	the register & FP state onto the thread's own user stack.  If this
	succeeds the kernel can switch away from the thread and reuse
	the KSE for the next thread (assuming it doesn't already have one).

	On the other hand, if the act of saving the registers onto the
	user stack blocks due to a fault and a low-memory condition, swap-in,
	or other state, the kernel is forced to use the existing KSE to
	block in the supervisor and saves the register state there (just like
	normal).  When memory becomes available the kernel can complete the
	switchout to the user stack and destroy the KSE.
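
	The switch-out decision described above might be sketched roughly
	as follows.  This is a user-space model, not kernel code: every
	name is hypothetical, and the flag user_stack_resident merely
	stands in for "pushing state onto the user stack would not fault
	or block".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct kse {
    bool holds_saved_regs;     /* register state parked in kernel context */
};

struct thread {
    struct kse *td_kse;        /* kernel context, if any */
    bool user_stack_resident;  /* stand-in: the push would not block */
    bool regs_on_user_stack;   /* register and FP state saved in userland */
};

/* Stand-in for pushing register and FP state onto the user stack. */
static bool save_regs_to_user_stack(struct thread *td)
{
    if (!td->user_stack_resident)
        return false;          /* would fault and block */
    td->regs_on_user_stack = true;
    return true;
}

/* Switch a thread running in usermode off the cpu.  Returns the KSE if
 * it is free for the next thread, or NULL if the thread must keep it. */
static struct kse *thread_switch_out(struct thread *td)
{
    struct kse *ke = td->td_kse;

    assert(ke != NULL);        /* a running thread always has a KSE */
    if (save_regs_to_user_stack(td)) {
        td->td_kse = NULL;     /* thread is now KSE-less but runnable */
        return ke;
    }
    ke->holds_saved_regs = true;  /* block in the supervisor as usual */
    return NULL;
}
```

	In the common case the KSE comes back for reuse immediately; only
	the rare blocking case pins it to the thread.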

	As you might have gathered, in the case where the thread being 
	switched in by the kernel has no KSE, the kernel simply resumes
	the user thread with an iret.  The previous state that was pushed
	onto the user thread's stack is restored by a restore vector that
	was also pushed onto the user thread's stack.  The kernel can use
	the per-cpu supervisor stack to temporarily hold the interrupt return
	stack to properly restore the user ring.

	This may sound complex, but it is very close to what our kernel
	already does currently and would not require significant programming
	to implement.  It would allow us to reduce the kernel memory footprint
	required to manage threads to a size that is so small that we 
	could scale to hundreds of thousands of threads if we needed to.

:>     Process
:> 
:> 	Our good old process.
:
:I think this is probably the *only* thing we all agree upon the
:definition of. :)

     Kinda hard to screw this one up, yup!

:...
:>     into the kernel's scheduler but if you have a two-cpu system, only 2 of
:>     those 10 will actually be running at any given moment and require KSE's.
:
:So far so good.
:
:>     With my system we change the kernel scheduling entity from a 'Process'
:>     to a 'Thread' and a Thread can then optionally (dynamically) be assigned
:>     a KSE as required to actually run.
:
:I think the term you are using for 'Thread' would be an SA, but I'm not
:sure everyone else would agree.
:
:>     The KSE is a kernel abstraction and
:>     essentially *invisible* to user mode.  The Thread is a kernel abstraction
:>     that is visible to user mode.
:
:I see KSE as being 'kernel context', and that *everytime* a 'thread'
:(userland) makes a system call, it gets assigned (created, whatever) a
:KSE.  However, in order to do proper thread priorities, who determines
:which threads get a 'SA' in this context?  There could be lots of
:threads vying for a SA (or kernel 'Thread') in this context, and the
:only entity with enough context to decide correctly is the UTS.
:
:Nate

    I see a KSE as being a 'kernel context' as well.  The only difference
    between your description and mine is that a thread currently
    running on a cpu (not merely runnable, but actually running) requires
    a KSE to be assigned to it so the kernel context is available to switch
    into when the system call is made.  The KSE *cannot* be allocated after
    the fact without getting into low-memory deadlock situations.

    A runnable thread (or a stopped thread) which has *no* kernel context
    does not need a KSE while not being run by a cpu.

    Thread priorities are controlled by the UTS.  The UTS schedules and
    deschedules kernel Threads.  For example, if you have 10 user mode
    threads which are all runnable, and the UTS is implementing a 
    serialized FIFO scheduling class, the UTS simply schedules one of those
    threads at a time with the kernel.  If the UTS wants to run all of
    them simultaneously, the UTS schedules all of the threads with the
    kernel.  The kernel's 'run' and 'runnability' state for a thread
    is entirely separate from the UTS's 'run' and 'runnability' state.
    This difference is how the UTS imposes the scheduling class on the
    thread.
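
    A rough model of that split, assuming two hypothetical policies and
    hypothetical names throughout.  In a real system the assignment to
    kernel_scheduled would be a system call into the kernel scheduler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum uts_policy { UTS_FIFO_SERIAL, UTS_RUN_ALL };

struct uts_thread {
    bool uts_runnable;      /* runnable as far as the UTS is concerned */
    bool kernel_scheduled;  /* currently handed to the kernel scheduler */
};

/* Decide which UTS-runnable threads the kernel gets to see.  Returns
 * the number of threads scheduled with the kernel. */
static size_t uts_apply_policy(struct uts_thread *t, size_t n,
                               enum uts_policy policy)
{
    size_t scheduled = 0;

    for (size_t i = 0; i < n; i++) {
        /* FIFO: only the first runnable thread; RUN_ALL: every one. */
        bool want = t[i].uts_runnable &&
                    (policy == UTS_RUN_ALL || scheduled == 0);

        t[i].kernel_scheduled = want;
        if (want)
            scheduled++;
    }
    return scheduled;
}
```

    The number of threads handed to the kernel at once is also the
    number of virtual cpus the UTS is effectively simulating.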

    The kernel aggregates the cpu use for all running kernel threads
    simply by having the cpu use counter be in the governing process
    structure.  In other words, you could have 50 threads running in 
    the kernel associated with a single process but they will only get
    the same aggregate cpu as that single process would get.  This is
    trivial to do - essentially two lines of code.  The purpose of this
    is not to impose a scheduling class on the threads, that is what
    the UTS does when the UTS schedules and deschedules the threads with
    the kernel.  The purpose is to place all of a process's threads under
    the umbrella of the governing process's scheduling class and priority.
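
    The aggregation could look something like this sketch.  The counter
    name p_cpu_ticks and the function charge_tick are hypothetical; the
    point is only that the counter lives in the process, not the thread.

```c
#include <assert.h>

struct proc {
    unsigned p_cpu_ticks;   /* one cpu-use counter for the whole process */
};

struct thread {
    struct proc *td_proc;   /* governing process */
};

/* Called once per clock tick for whichever thread is on the cpu:
 * the tick is charged to the governing process, never to the thread. */
static void charge_tick(struct thread *td)
{
    td->td_proc->p_cpu_ticks++;
}
```

    Two threads of the same process that each run for one tick leave the
    process charged with two ticks, exactly as if a single process had
    run for both.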

    Did I explain that well enough?  I know it's confusing.   We are talking
    about two totally unassociated pieces of the scheduling puzzle here.  The
    kernel only deals with the piece that governs cpu resource sharing 
    between processes (even though the scheduling entity is a 'Thread'),
    the UTS deals with the scheduling of threads within the slice of cpu(s)
    (note the plural) the kernel gives the process. 

    The UTS does not have to know how many real cpu's exist in the 
    system.  It simulates N virtual cpu's simply by scheduling N threads
    with the kernel at any given moment.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

