Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 20 Sep 2002 14:36:29 -0700
From:      Bill Huey (Hui) <billh@gnuppy.monkey.org>
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        Rik van Riel <riel@conectiva.com.br>, Julian Elischer <julian@elischer.org>, freebsd-arch@freebsd.org
Subject:   Re: New Linux threading model
Message-ID:  <20020920213629.GA1527@gnuppy.monkey.org>
In-Reply-To: <3D8B8E35.EDAF4450@mindspring.com>
References:  <Pine.LNX.4.44L.0209201652330.1857-100000@imladris.surriel.com> <3D8B8E35.EDAF4450@mindspring.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Sep 20, 2002 at 02:08:05PM -0700, Terry Lambert wrote:
> > 1) per-cpu runqueues instead of a global one, which wants ...
> 
> Yes.
> 
> > 2) ... load balancer between these per-cpu queues
> 
> No.
> 
> This is the problem with the Linux model.  With an active load
> balancing algorithm, you end up with global contention for
> scheduler queue locks.

That's not true, they're all per CPU run queue locks.

> In reality, you can get away from having *any* scheduler queue
> locks *at all*, for the normal case, and then only contend at
> the inter-CPU level for the CPUs using a push migration model.

That's what the Mingo's scheduler does. It only does a double lock
when the load balance computation makes it do a CPU migrate, otherwise
there's no contention and no per CPU locks are aquired.

> So the migrate away is one-behind the head of the scheduling queue,
> and the migrate-to is one ahead.  On migration, an artificial
> inflation of the figure of merit can be handled -- or a the value
> of the pointer can be examined, and assumed to add a constant
> weighting to the target CPU's figure of merit, if the pointer value
> is non-NULL.
> 
> Thus, it is only ever required to lock when you are actively doing
> process migration, and process migration is rare (as it should be,
> particularly if one of  your target architectures is NUMA, but in
> the general case on small CPU count shared memory multiprocessors,
> as well).
> 
> This also permits preference weighting based on locality, for
> affinity on hyper-threaded CPUs, and negaffinity in CPU sets under
> the same circumstances.

I don't normally ask this, but what the hell are you saying here ?

> > 3) two runqueue arrays (current and expired) instead of
> >    just one, which enables ...
> 
> Not required.

That's part of mingo's algorithm to avoid recalculation if I understand
it correctly. Not exactly sure, I'm stretching my knowledge of his
algorithm here.

> > 4) ... event-driver priority recalculation, instead of
> >    recalculating the priority of each task separately
> 
> This actually doesn't work.  The worst case failure is under
> overload, which is exactly where you don't want it to be.

What kind of overload ? he does a number of things to make sure that
all processes behave properly by demoting priorities.

> The scheduling for the BSD scheduler, as was pointed out, takes
> time not run into the priority weighting.  A granularity of 3
> seconds until the disctinction between the two queues for
> enqueueing delayed jobs is realized is really gross.  8-(.

bill


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20020920213629.GA1527>