From owner-freebsd-hackers@FreeBSD.ORG  Thu Apr  5 05:08:02 2012
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 27FC11065672
	for <freebsd-hackers@freebsd.org>; Thu,  5 Apr 2012 05:08:02 +0000 (UTC)
	(envelope-from listlog2011@gmail.com)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id EB5F68FC12
	for <freebsd-hackers@freebsd.org>; Thu,  5 Apr 2012 05:08:01 +0000 (UTC)
Received: from [127.0.0.1] (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q3557u9u088159
	for <freebsd-hackers@freebsd.org>; Thu, 5 Apr 2012 05:07:58 GMT
	(envelope-from listlog2011@gmail.com)
Message-ID: <4F7D28AB.605@gmail.com>
Date: Thu, 05 Apr 2012 13:07:55 +0800
From: David Xu <listlog2011@gmail.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1;
	rv:11.0) Gecko/20120327 Thunderbird/11.0.1
MIME-Version: 1.0
To: freebsd-hackers@freebsd.org
References: <1333590846.58474.YahooMailClassic@web180011.mail.gq1.yahoo.com>
	<20120405035645.GO2358@deviant.kiev.zoral.com.ua>
In-Reply-To: <20120405035645.GO2358@deviant.kiev.zoral.com.ua>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Startvation of realtime piority threads
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: davidxu@freebsd.org
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Apr 2012 05:08:02 -0000

On 2012/4/5 11:56, Konstantin Belousov wrote:
> On Wed, Apr 04, 2012 at 06:54:06PM -0700, Sushanth Rai wrote:
>> I have a multithreaded user space program that basically runs at realtime priority. Synchronization between threads is done using a spinlock. When running this program on an SMP system under heavy memory pressure, I see that the thread holding the spinlock is starved of CPU. The CPUs are effectively consumed by other threads that are spinning, waiting for the lock to become available.
>>
>> After instrumenting the kernel a little, I found that under memory pressure, when the user thread holding the spinlock traps into the kernel due to a page fault, that thread sleeps until free pages are available. The thread sleeps at PUSER priority (within vm_waitpfault()). When it is ready to run, it is queued at PUSER priority even though its base priority is realtime. The sibling threads that are spinning at realtime priority to acquire the spinlock starve the owner of the spinlock.
>>
>> I was wondering if the sleep in vm_waitpfault() should be at MAX(td_user_pri, PUSER) instead of just PUSER. I'm running on 7.2 and it looks like this logic is the same in the trunk.
> It just so happens that your program stumbles upon a single sleep point in
> the kernel. If for whatever reason the thread in the kernel is put off CPU
> due to failure to acquire any resource without priority propagation,
> you would get the same effect. Only blockable primitives do priority
> propagation, namely mutexes and rwlocks, AFAIR. In other words, any
> sx/lockmgr/sleep points are vulnerable to the same issue.
This is why I suggested that POSIX realtime priority should not be
boosted: it should only be higher than PRI_MIN_TIMESHARE but lower than
any priority that msleep() callers provide.  The problem is that a
userland realtime thread's busy-looping code can starve a thread in the
kernel that is holding a critical resource.  In the kernel we can avoid
writing busy-loop code, but userland code cannot be trusted.
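
As an illustration of the userland side only (this is not something
proposed in this thread), a lock that spins a bounded number of times
and then blocks gives the CPU back, so a lock owner that was requeued at
a lower priority can run again.  A minimal sketch, assuming a plain
pthread mutex is an acceptable replacement for the spinlock; SPIN_LIMIT
and lock_bounded_spin() are hypothetical names:

```c
#include <pthread.h>

/* Hypothetical tuning knob: how many times to poll before blocking. */
#define SPIN_LIMIT 1000

/*
 * Try to take the mutex by polling a bounded number of times, then
 * fall back to a blocking acquire so the caller stops consuming CPU.
 * Returns 0 on success or the error number from pthread_mutex_lock().
 */
static int
lock_bounded_spin(pthread_mutex_t *m)
{
	int i;

	for (i = 0; i < SPIN_LIMIT; i++) {
		if (pthread_mutex_trylock(m) == 0)
			return (0);	/* acquired while polling */
	}
	/* Still contended: sleep in the kernel instead of spinning. */
	return (pthread_mutex_lock(m));
}
```

The lock is released with the usual pthread_mutex_unlock().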

If you search for "Realtime thread priorities" in the December 2010
archives of the @arch list, you may find the argument.


> Speaking of exactly your problem, did you consider wiring the memory
> of your realtime process? This is a common practice, used e.g. by ntpd.
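
For reference, wiring the memory of a process is usually done with
mlockall(2).  A minimal sketch, assuming it is called once early in
startup and that the process has sufficient privilege and a large enough
RLIMIT_MEMLOCK; wire_process_memory() is a hypothetical helper name:

```c
#include <sys/mman.h>
#include <errno.h>

/*
 * Wire all pages currently mapped by the calling process, and all
 * pages it maps in the future, into physical memory.
 * Returns 0 on success, or the errno value set by mlockall(2).
 */
static int
wire_process_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
		return (errno);
	return (0);
}
```

With its pages wired, the thread holding the spinlock should not have to
wait for free pages on a page fault, so it would not reach the sleep in
vm_waitpfault() described above.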