From: Jerome Flesch <jerome.flesch@netasq.com>
Date: Thu, 24 Feb 2011 15:18:31 +0100
To: Ryan Stone
Cc: John-Mark Gurney, freebsd-current@freebsd.org
Subject: Re: Process timing issue

Thanks for your explanations, they helped greatly. Using ktrdump and
schedgraph.py, and after modifying our test program to set and unset
debug.ktr.mask automatically, I've been able to get useful information.

First, it made me realize that with default settings and 2 active
processes, task switching only occurs every 100ms. Knowing that,
expecting a latency of around 100ms was kind of silly :)

Next, it seems most of the latency spikes are due to a process starting
or waking up. For instance, it usually happens when the openssl speed
test is started (
http://jflesch.kwain.net/~jflesch/sys_latence/sched/sched_graph_openssl_start.png
) or when pagedaemon wakes up (I forgot to disable swap and my test
program used too much memory to store the result values ...). I'm not
sure why, but when we start openssl, it is often allowed to run for >=
300ms, even with our test program set to real-time priority. My
intuition is that it is at first considered an interactive process,
until the scheduler realizes it's not. But then, does anyone know why it
would take more than 300ms for the scheduler to realize that?

Anyway, by setting kern.sched.interact=5 (so openssl isn't considered an
interactive process), kern.sched.slice=3 (to get a high enough
scheduling resolution), and running our program at real-time priority,
we got rid of both problems. I'm just a little bit worried about
kern.sched.slice=3: are there any known side effects of reducing the
slice size?
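For reference, this is roughly what we run before launching the test now
(just a sketch; "latency_test" below is a placeholder for our test binary):

  # tune ULE so openssl doesn't get scored as interactive and slices stay short
  sysctl kern.sched.interact=5
  sysctl kern.sched.slice=3
  # run the test program at real-time priority 0 (see rtprio(1))
  rtprio 0 ./latency_test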
Also, another issue remains: we were hoping to keep our program at a
normal priority. However, when we set our test program to a normal
priority (but still a higher priority than openssl), both get 50% of
the CPU (I guess this is to be expected), and from time to time we get
a "hiccup" in the scheduling:
http://jflesch.kwain.net/~jflesch/sys_latence/sched/sched_graph_hicups.png
. Is there any way to avoid these hiccups? In other words, is it
possible to make sure that the low-priority process never gets more CPU
time than the high-priority one?

On 23.02.2011 02:08, Ryan Stone wrote:
> To debug weird scheduling issues I find it helpful to start by looking
> at a schedgraph. schedgraph is a tool that can display a graphical
> representation of what the scheduler was doing over a small slice of
> time. The one downside is that you have to recompile your kernel to
> get the hooks that collect the necessary data, but it sounds like your
> problem is pretty easy to reproduce and so that shouldn't be a big
> problem.
>
> To enable the hooks, put the following in your kernel conf:
>
> options KTR
> options KTR_COMPILE=KTR_SCHED
> options KTR_MASK=KTR_SCHED
> options KTR_ENTRIES=(128*1024)
>
> Then rebuild and install the new kernel. Next, run your test. The
> instant that your test has detected the long delay, set the sysctl
> debug.ktr.mask to 0. The scheduler hooks record data into a ring
> buffer, so the interesting data can be flushed out within seconds.
> Clearing that sysctl will stop any further events from being logged,
> which means that the interesting data will stay there until you go and
> collect it.
>
> You can get the raw data by redirecting the output of "ktrdump -ct"
> into a file. Then from any machine with a graphical interface (FreeBSD
> with X installed, Windows, Mac, whatever) and python installed, run:
> $ python schedgraph.py /path/to/ktrdump/output
>
> You can get schedgraph.py from /usr/src/tools/sched.
>
> If you want to collect the data again, set the sysctl debug.ktr.mask
> back to 0x20000000 to restart gathering scheduler data.
>
> Ryan
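For completeness, the capture procedure quoted above condenses to
something like this (run as root; "/tmp/ktr.out" is just an example path):

  # stop KTR logging as soon as the latency spike has been observed
  sysctl debug.ktr.mask=0
  # dump the scheduler events (with CPU numbers and timestamps) to a file
  ktrdump -ct > /tmp/ktr.out
  # plot it on any machine with python, using /usr/src/tools/sched/schedgraph.py
  python schedgraph.py /tmp/ktr.out
  # re-arm the trace buffer for another run
  sysctl debug.ktr.mask=0x20000000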