Date:      Thu, 5 Apr 2012 12:01:58 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-hackers@freebsd.org, davidxu@freebsd.org
Subject:   Re: Startvation of realtime piority threads
Message-ID:  <201204051201.58651.jhb@freebsd.org>
In-Reply-To: <4F7D28AB.605@gmail.com>
References:  <1333590846.58474.YahooMailClassic@web180011.mail.gq1.yahoo.com> <20120405035645.GO2358@deviant.kiev.zoral.com.ua> <4F7D28AB.605@gmail.com>

On Thursday, April 05, 2012 1:07:55 am David Xu wrote:
> On 2012/4/5 11:56, Konstantin Belousov wrote:
> > On Wed, Apr 04, 2012 at 06:54:06PM -0700, Sushanth Rai wrote:
> >> I have a multithreaded user space program that basically runs at realtime
> >> priority. Synchronization between threads is done using a spinlock. When
> >> running this program on an SMP system under heavy memory pressure, I see
> >> that the thread holding the spinlock is starved of CPU. The CPUs are
> >> effectively consumed by other threads that are spinning, waiting for the
> >> lock to become available.
> >>
> >> After instrumenting the kernel a little, I found that under memory
> >> pressure, when the user thread holding the spinlock traps into the kernel
> >> due to a page fault, that thread sleeps until free pages are available.
> >> The thread sleeps at PUSER priority (within vm_waitpfault()). When it is
> >> ready to run, it is queued at PUSER priority even though its base priority
> >> is realtime. The sibling threads that are spinning at realtime priority
> >> to acquire the spinlock starve the owner of the spinlock.
> >>
> >> I was wondering if the sleep in vm_waitpfault() should be at
> >> MAX(td_user_pri, PUSER) instead of just PUSER. I'm running on 7.2 and it
> >> looks like this logic is the same in the trunk.
> > It just so happens that your program stumbles upon a single sleep point in
> > the kernel. If for whatever reason the thread in the kernel is put off CPU
> > due to failure to acquire any resource without priority propagation,
> > you would get the same effect. Only blockable primitives do priority
> > propagation, namely mutexes and rwlocks, AFAIR. In other words, any
> > sx/lockmgr/sleep points are vulnerable to the same issue.
> This is why I suggested that POSIX realtime priority should not be boosted;
> it should only be higher than PRI_MIN_TIMESHARE but lower than any priority
> that msleep() callers provide.  The problem is that a userland realtime
> thread's busy-looping code can starve a thread in the kernel that is holding
> a critical resource.  In the kernel we can avoid writing dead-loop code, but
> userland code is not trustable.

Note that you have to be root to be rtprio, and that there is trustable
userland code (just because you haven't used any doesn't mean it doesn't
exist).

> If you search for "Realtime thread priorities" in the December 2010 archives
> of the arch@ list, you may find the argument.

I think the bug here is that sched_sleep() should not lower the priority of
an rtprio process.  It should arguably not raise the priority of an idprio
process either, but sched_sleep() should probably only apply to timesharing
threads.
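
To make the idea concrete, here is a minimal userland sketch of the rule
described above. It is not the kernel's code: the enum, the function name
wakeup_priority(), and the numeric priorities in the usage note are all
hypothetical and exist only for illustration. The one thing carried over from
FreeBSD is the convention that a lower number means a higher priority.

```c
#include <assert.h>

/* Hypothetical scheduling classes for this sketch; not the kernel's own. */
enum sched_class { CLASS_REALTIME, CLASS_TIMESHARE, CLASS_IDLE };

/*
 * Choose the priority a thread is queued at after a sleep.
 * Lower numbers mean higher priority.
 * Only timesharing threads take on the sleep priority; realtime and
 * idle threads keep their base priority, so a sleep neither lowers an
 * rtprio thread nor raises an idprio one.
 */
int
wakeup_priority(enum sched_class cls, int base_pri, int sleep_pri)
{
	assert(base_pri >= 0 && sleep_pri >= 0);
	if (cls != CLASS_TIMESHARE)
		return (base_pri);
	return (sleep_pri);
}
```

With made-up numbers, a realtime thread with base priority 10 that sleeps at
priority 200 would still be queued at 10 under this rule, so spinning siblings
at the same realtime priority could no longer starve it.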

All that said, userland rtprio code is going to have to be careful.  It should
be using things like wired memory as Kostik suggested, and probably avoiding
most system calls.  You can definitely blow your foot off quite easily in lots 
of ways with rtprio.
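
As a rough illustration of the wired-memory suggestion, a realtime program
could wire its address space at startup with the POSIX mlockall() call, so the
lock holder should not have to wait for free pages in a page fault. The helper
name wire_all_memory() is hypothetical, and this is only a sketch; whether the
call succeeds depends on privileges and the locked-memory resource limit.

```c
#include <stdio.h>
#include <sys/mman.h>

/*
 * Wire all current and future mappings of this process into RAM.
 * Returns 0 on success and -1 on failure (for example, when the
 * locked-memory limit is too low or privileges are missing).
 */
int
wire_all_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		perror("mlockall");
		return (-1);
	}
	return (0);
}
```

Calling this once, before any realtime threads start spinning on the lock,
keeps the failure handling in one place.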

-- 
John Baldwin


