Date: Sat, 25 Jan 2003 17:23:34 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: Steve Kargl
Cc: Robert Watson, Gary Jennejohn, freebsd-arch@FreeBSD.ORG
Subject: Re: New scheduler

On Sat, 25 Jan 2003, Steve Kargl wrote:

> On Sat, Jan 25, 2003 at 01:58:33PM -0500, Robert Watson wrote:
> >
> > Part of the problem is that the load average is a poor measure of
> > system utilization.  Jeff's new scheduler may defer scheduling a
> > process that's ready to run in order to improve throughput, waiting
> > for a "better" CPU to run it on based on affinity.  The result might
> > be that the run queues are (on average) deeper, and what the load
> > average measures is exactly the depth of the run queue over time.
> > So for the moment it's probably best to disregard load average as a
> > measure of performance.  On the other hand, actual interactivity
> > regressions and performance changes are very relevant.  Load average
> > is intended to capture the degree of contention for CPU resources,
> > but what exactly that means is always an interesting question.
>
> Robert, I'm sure your analysis is correct.  All I can say is that
> Jeff's experimental scheduler will bring a UP system to its knees.
> The system I tested on runs NTP to sync the clock, and the system
> clock lost 6 minutes of wall-clock time in 45 minutes.  The two
> possible causes of the problem (that I can think of) are (1) a
> deadlock in the scheduler, or (2) processes ping-ponging between run
> queues without actually getting a time slice.  Unfortunately, I was
> running X at the time and could not break into the debugger.  I'll
> try again later today to see what ddb says.

A process will not leave its current run queue until it has exhausted
its slice; if it does, that is a bug.  You can see that this is the
case by looking at sched_clock(), which resets the kse's runq pointer
to NULL when the slice is exhausted.  Then, in sched_add(), we pick a
new queue if the pointer is NULL; otherwise we reuse the queue the kse
is already assigned to.  (A rough sketch of this logic is below.)

There are a few potential problems.  One is that processes deemed
interactive are ALWAYS put on the current queue.  They will hold the
current queue in place until they sleep, which will starve any process
on the alternate queue.  The code that decides interactivity is far
too simple; it could be that non-interactive tasks are being marked as
interactive and holding up the queues.

Another potential problem is that tasks which are interactive are not
getting reassigned to the front queue when they wake up.  I believe
the runq should be reassigned in sched_wakeup(); missing that
reassignment would cause horrible interactivity.  I can come up with
patches for this second problem, but probably not for another day.  If
someone else wants to experiment, you can look at the code in
sched_add() that reassigns the runq and do the same thing in
sched_wakeup() when we think the process is interactive (see the
second sketch below).
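To make the slice/queue invariant concrete, here is a rough sketch in
C of the logic described above.  This is illustrative only, not the
actual sched_ule.c source: the names (ke_slice, ke_runq, kseq_curr,
kseq_next, sched_interactive()) and the SCHED_SLICE value are
stand-ins for whatever the real code uses.

    struct runq;                    /* opaque for this sketch */

    struct kse {
        struct runq *ke_runq;       /* queue this kse is assigned to */
        int ke_slice;               /* remaining time slice, in ticks */
        int ke_slptime;             /* accumulated voluntary sleep time */
        int ke_runtime;             /* accumulated run time */
    };

    #define SCHED_SLICE 10          /* recharge value; illustrative */

    extern struct runq *kseq_curr;  /* queue being drained now */
    extern struct runq *kseq_next;  /* queue that runs after the swap */
    void runq_add(struct runq *rq, struct kse *ke);

    static int
    sched_interactive(struct kse *ke)
    {
        /*
         * Placeholder heuristic: a task that sleeps more than it
         * runs is treated as interactive.  The real test is the
         * part described above as "far too simple".
         */
        return (ke->ke_slptime > ke->ke_runtime);
    }

    /* Called from the clock interrupt for the running kse. */
    void
    sched_clock(struct kse *ke)
    {
        if (--ke->ke_slice > 0)
            return;
        /*
         * Slice exhausted: drop the queue assignment so the next
         * sched_add() picks a fresh queue, and recharge the slice.
         */
        ke->ke_runq = NULL;
        ke->ke_slice = SCHED_SLICE;
    }

    /* Put a runnable kse on a queue. */
    void
    sched_add(struct kse *ke)
    {
        if (ke->ke_runq == NULL) {
            /*
             * Fresh slice: interactive tasks go on the queue that
             * is draining now; batch tasks wait on the alternate
             * queue until the swap.
             */
            ke->ke_runq = sched_interactive(ke) ? kseq_curr : kseq_next;
        }
        runq_add(ke->ke_runq, ke);
    }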
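A first cut at the sched_wakeup() change might then look like the
following, reusing the illustrative declarations from the sketch
above.  This is a guess at the shape of the patch, not the patch
itself:

    void
    sched_wakeup(struct kse *ke)
    {
        /*
         * Proposed fix, sketched: a task that looks interactive when
         * it wakes up is moved to the current queue so it runs
         * promptly, rather than keeping a stale assignment to the
         * alternate queue from before it slept.
         */
        if (sched_interactive(ke))
            ke->ke_runq = kseq_curr;
        sched_add(ke);  /* picks a new queue only if ke_runq is NULL */
    }

Locking is omitted here; a real patch would run under sched_lock and
would also have to handle a kse that is already sitting on a queue.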
I appreciate the testing.  I must admit that the interactivity testing
I had done was with parallel buildworlds and vi; I haven't done much
with GUIs, other than running this on my laptop, which is 2GHz.  I
think these problems should be relatively quick to address.

Thanks,
Jeff