From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 6 16:29:20 2012
Sender: Alexander Motin
Message-ID: <4F2FFFDA.2080608@FreeBSD.org>
Date: Mon, 06 Feb 2012 18:29:14 +0200
From: Alexander Motin
To: Alexander Best
Cc: freebsd-hackers@freebsd.org
Subject: Re: [RFT][patch] Scheduling for HTT and not only
In-Reply-To: <20120206160136.GA35918@freebsd.org>
References: <4F2F7B7F.40508@FreeBSD.org> <20120206160136.GA35918@freebsd.org>
List-Id: Technical Discussions relating to FreeBSD

On 02/06/12 18:01, Alexander Best wrote:
> On Mon Feb 6 12, Alexander Motin wrote:
>> I've analyzed scheduler behavior and think I've found the problem with
>> HTT. SCHED_ULE knows about HTT, and when doing load balancing once a
>> second it does the right things. Unluckily, if some other thread gets in
>> the way, a process can easily be pushed out to another CPU, where it
>> will stay for another second because of CPU affinity, possibly sharing a
>> physical core with something else without need.
>>
>> I've made a patch reworking the SCHED_ULE affinity code to fix that:
>> http://people.freebsd.org/~mav/sched.htt.patch
>>
>> This patch does three things:
>> - Disables the strict affinity optimization when HTT is detected, to
>> let the more sophisticated code take the load of the other logical
>> core(s) into account.
>> - Adds affinity support to the sched_lowest() function to prefer the
>> specified (last used) CPU, and the CPU groups it belongs to, in case of
>> equal load. The previous code always selected the first valid CPU among
>> equals.
>> This caused threads to migrate to lower-numbered CPUs without need.
>> - If the current CPU group has no CPU where the process can run now at
>> its priority, sequentially check the parent CPU groups before doing a
>> global search. That should improve affinity for the next cache levels.
>>
>> Who wants to do independent testing to verify my results or run some
>> more interesting benchmarks? :)
>
> i don't have any benchmarks to offer, but i'm seeing a massive increase
> in responsiveness with your patch. with an unpatched kernel, opening
> xterm while unrar'ing some huge archive could take up to 3 minutes!!!
> with your patch the time it takes for xterm to start is never > 10
> seconds!!!

Thank you for the report. I can suggest an explanation for this. The
original code does only one pass, looking for a CPU where the thread can
run immediately. That pass is limited to the first level of the CPU
topology (for HTT systems, that is one physical core). If it sees no good
candidate, it just looks for the CPU with minimal load, ignoring thread
priority. I suppose that may lead to a priority violation: the thread is
scheduled onto a CPU where a higher-priority thread is running, where it
may wait for a very long time, while some other CPU is running a
minimal-priority thread. My patch does more searches, which allows it to
handle priorities better. Unluckily, in newer tests of
context-switch-intensive workloads (such as serving 40K MySQL requests
per second) I've found about a 3% slowdown because of these additional
searches. I'll finish some more tests and try to find a compromise
solution.

-- 
Alexander Motin