From owner-freebsd-arch  Tue Nov 28 12:48:43 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from mail.interware.hu (mail.interware.hu [195.70.32.130])
	by hub.freebsd.org (Postfix) with ESMTP id 35DD837B402
	for <arch@freebsd.org>; Tue, 28 Nov 2000 12:48:37 -0800 (PST)
Received: from luanda-25.budapest.interware.hu ([195.70.51.25] helo=elischer.org)
	by mail.interware.hu with esmtp (Exim 3.16 #1 (Debian))
	id 140rg1-0004xc-00; Tue, 28 Nov 2000 21:48:34 +0100
Message-ID: <3A24064C.A3DF52A8@elischer.org>
Date: Tue, 28 Nov 2000 11:23:56 -0800
From: Julian Elischer <julian@elischer.org>
X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 5.0-CURRENT i386)
X-Accept-Language: en, hu
MIME-Version: 1.0
To: Jason Evans <jasone@canonware.com>
Cc: arch@FreeBSD.ORG
Subject: Re: Threads (KSE etc) comments
References: <Pine.SUN.3.91.1001121160717.7102A-100000@pcnet1.pcnet.com> <3A1B0B64.6D694248@elischer.org> <3A211C82.2464D07E@elischer.org> <20001127154800.M4140@canonware.com>
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Jason Evans wrote:
> 
>
> I don't understand how we're usurping the UTS's scheduling decisions.  If a
> KSE is preempted, then an upcall (resulting in yet another preemption, if
> necessary) must be done right away in order to give the UTS enough
> information to correctly schedule threads on the KSEs that are still
> running.  This is one of the basic tenets of scheduler activations, which
> we really have to follow.

I don't think it's practical to upcall to the UTS when you preempt one of 
it's KSEs. Would you hold off servicing a higher priority process while 
you are chatting with the UTS? Of course not.. If the UTS wanted a thread 
to have a priority high enough to avoid being pre-empted by  a process of 
priority X then it should have put it into a KSEG running at priority X+1.
What you DO want to do is notify the UTS at the next possible convenient 
moment of the fact. This would be at the completion of a syscall in some
other KSE, or the resumption of a previously suspended KSE.
(these are the same times when signals are delivered).  You COULD
pre-empt another KSE (on another processor)
 (if it was in a thread) and do an upcall to it's UTS
but I don't know that the complexity (including inter-CPU communications
within the kernel and delivery of an inter-CPU interupt) is worth it.
You really can't do anything about the period of time between the
pre-emption and the first available notification moment. We put up with 
a delay equal to, or greater than this with signal delivery today.
You cannot hold up the higher priority process. (As I said, we give a 
method by which the UTS can ensure that some threads are harder to 
pre-empt than others(KSEGs. It should use this to avoid the pre-emption 
in the first place..

> 
> > If the Former (All KSEs report all events) then there is no real
> > advantage to having more than N KSEs (N processors), because that means
> > that the UTS will probably keep swapping the threads it thinks are most
> > important to the KSEs which means that the thread that was pre-empted on
> > KSE-A will probably be re-scheduled on KSE-B when it preempts KSE-A. So
> > why have KSE-B at all? All it does is massively confuse things, and
> > creates a whole new class of scheduling problems.
> 
> The main advantage I see of allowing more KSEs than processors (total
> across all KSEGs) is that it simplifies the implementation considerably.
> Very little has to be changed about how things currently work, which also
> means that single-threaded applications work the same as they do now,
> without a lot of extra work.

We are agreed that the TOTAL number of KSEs may be greater than the number
of processes.

> 
> I agree that there's no good reason to have more KSEs in a KSEG than there
> are processors, but it doesn't actually break anything to allow this, and
> simply using process resource limits to control the number of KSEs is
> simpler than enforcing limits on the number of KSEs per KSEG.

I think that if you think of a KSEG as a contention domain it gives
a different viewpoint.
Threads are assigned to a KSEG and will not contend against each other
in the system-scope. All threads in the same KSEG share the same 
CPU resources and can migrate pretty easily around the KSEs assigned to
the KSEG. (the only reason to keep them on a cpu would be for cache effects).

KSEs within a KSEG do not directly contend with each other, and KSEGs
DO contend with each other (potentially), so you can see that it would 
simplify some things if you make the following rules..

1/ Only one KSE from any given KSEG may be on any single processor at any
given time. Maybe you only shift a KSE when it's idle, or in the kernel, and
only onto a processor which doesn;t already have a KSE from that KSEG in it.

2/ You only allow the number of KSEs to be <= N where N is the number of 
processors available.

I can't prove it yet, but thinking about the implementation, I can't help
but feel in my gut that making these rules will allow some solutions to just
"fall out" that otherwise may require a lot more work.

I'm particularly worried about a KSE being pre-empted while in the UTS.
The kernel isn't going to hold off the pre-emption just because the  
process thinks it's a bad idea... it has a higher priority process 
screaming in its ear wanting cycles. If, later, we then come in on 
another KSE, on the same processor we can't really guarantee that 
we will not have a locking collision within our resourses, with 
the UTS that is presently swapped out. It's not really a 
deadlock but we will take a big hit in time as we have to 
wait for the other KSE to be run again before we can get what 
would otherwise have been a very short term lock.

I'm not explaining it very well, because it sort of relies on a lot 
of other stuff in my head. 

maybe I need to write that down.. ok here goes.

In my imagination of the implementation, KSEs each have a user upcall
stack and a mailbox. The UTS is run on those stacks. Threads are 
assigned to a KSEG and the UTS will prefer to run them on a single KSE
for as long as it's easy to do so, but will often and easily switch them
between KSEs (if there are several) for load balancing. i.e. there
is very little binding of threads to KSEs within a single KSEG. If there
are only N KSEs (N=numProcessors) then locking between the UTS instances
in the same KSEG is rather trivial, and can be pretty much limitted 
to brief spinlocks. Since these contentions will be more common than 
contentions between UTS agents in other KSEGs (threads won't often be 
migrating between KSEGs), keeping the locks simple is important. 
If we have N>numProcessors then we need to take into account 
(by my thinking) the potential serialisation of KSEs and pre-emptions
such as that I mentionned above. If we don't then we can allow
communications between threads in the same KSEG to use a much simpler
locking and synchronisation scheme. You use a more heavyweight scheme
between threads in different KSEGs. if you wnat to make a program that has 
many KSEs, just put them all in different KSEGs all with the same
scheduler priority, but be prepared to pay the price of heavier 
weight communications and synchronisations. With a limit of N KSEs 
we can also experiment with such things as gang scheduling, where 
we might ask that all KSEs in a  KSEG are iff possible sschedlued 
across all the processors at once. This can give massive throughput 
imporvvements in some applications. Particularly ones where threads
are communicating with each other using a ping-pong protocol.

>From the kernel point of view we need not limit the number. but I think
that it is foolish to not do so.


> 
> One example of why enforcing KSE/KSEG limits could become hard in the
> future is if the number of processors is dynamic (i.e. processors can be
> added and removed).  In discussions I've had with Mike Smith, this is a
> very real possibility, and is something we should keep in mind.

OK I agree that this is possible. And I see that this would require that
we can pre-empt a KSE in user mode, while it is in the UTS, and
allow it to try run to completion (get out of the UTS) somewher else.
yuk.
> 

> The UTS doesn't need to be any more complex.  It would simply get more
> upcalls if there were more preemptions as a result of excessive KSEs, which
> I don't think would happen anyway.

As I said, I'm worried about the UTS itslef.

> 
> > The reason for having KSEGs is simply as an entity that competes for CPU
> > to assure fairness.
> > It may not even exist as a separate structure in the case where there
> > are separate per-CPU scheduling queues, (though I think it would for
> > efficiency's sake). It would PROBABLY have a analogous partner in the
> > UTS that represents the virtual machine that runs all the threads that
> > are competing at the same scope.
> 
> I agree with everything you say here.
> 
> > On a single scheduling queue system, I
> > think I would have the KSEG in the queue rather than the independent
> > KSEs. When it get's to the head, you schedule
> > KSEs on all the CPUs. This allows the threads to communicate quickly
> > using shared memory should they want. The UTS has the entire quantum
> > across as many CPUs as it has.
> 
> As I mentioned in another email, I don't think we should plan on having a
> production release that is implemented with only a single scheduling queue.

fair enough

> 
> Jason

-- 
      __--_|\  Julian Elischer
     /       \ julian@elischer.org
    (   OZ    ) World tour 2000
---> X_.---._/  presently in:  Budapest
            v


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message