Date: Thu, 5 Apr 2012 18:08:24 -0700 (PDT)
From: Sushanth Rai <sushanth_rai@yahoo.com>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: Startvation of realtime piority threads
Message-ID: <1333674504.97862.YahooMailClassic@web180016.mail.gq1.yahoo.com>
In-Reply-To: <201204051201.58651.jhb@freebsd.org>

I understand the downside of a badly written realtime app. In my case the
application runs in userspace without making many syscalls, and by all
means it is a well-behaved application. Yes, I can wire memory and change
the application to use a mutex instead of a spinlock, and those changes
should help, but they are still working around the problem. I still
believe the kernel should not lower the realtime priority when blocking on
resources. This can lead to priority inversion, especially since these
threads run at fixed priorities and the kernel doesn't muck with them.

As you suggested, _sleep() should not adjust the priorities for realtime
threads.

Thanks,
Sushanth

--- On Thu, 4/5/12, John Baldwin <jhb@freebsd.org> wrote:

> From: John Baldwin <jhb@freebsd.org>
> Subject: Re: Startvation of realtime piority threads
> To: freebsd-hackers@freebsd.org, davidxu@freebsd.org
> Date: Thursday, April 5, 2012, 9:01 AM
>
> On Thursday, April 05, 2012 1:07:55 am David Xu wrote:
> > On 2012/4/5 11:56, Konstantin Belousov wrote:
> > > On Wed, Apr 04, 2012 at 06:54:06PM -0700, Sushanth Rai wrote:
> > > > I have a multithreaded user space program that basically runs
> > > > at realtime priority. Synchronization between threads is done
> > > > using a spinlock. When running this program on an SMP system
> > > > under heavy memory pressure I see that the thread holding the
> > > > spinlock is starved of CPU. The CPUs are effectively consumed
> > > > by other threads that are spinning for the lock to become
> > > > available.
> > > >
> > > > After instrumenting the kernel a little bit, what I found was
> > > > that under memory pressure, when the user thread holding the
> > > > spinlock traps into the kernel due to a page fault, that thread
> > > > sleeps until free pages are available. The thread sleeps at
> > > > PUSER priority (within vm_waitpfault()). When it is ready to
> > > > run, it is queued at PUSER priority even though its base
> > > > priority is realtime. The sibling threads that are spinning at
> > > > realtime priority to acquire the spinlock starve the owner of
> > > > the spinlock.
> > > >
> > > > I was wondering if the sleep in vm_waitpfault() should be a
> > > > MAX(td_user_pri, PUSER) instead of just PUSER. I'm running on
> > > > 7.2 and it looks like this logic is the same in the trunk.
> > > It just so happens that your program stumbles upon a single sleep
> > > point in the kernel. If for whatever reason the thread in the
> > > kernel is put off CPU due to failure to acquire any resource
> > > without priority propagation, you would get the same effect. Only
> > > blockable primitives do priority propagation, that is mutexes and
> > > rwlocks, AFAIR. In other words, any sx/lockmgr/sleep points are
> > > vulnerable to the same issue.
> > This is why I suggested that POSIX realtime priority should not be
> > boosted; it should be only higher than PRI_MIN_TIMESHARE but lower
> > than any priority all msleep() callers provided. The problem is
> > that a userland realtime thread's busy-looping code can starve a
> > thread in the kernel which is holding a critical resource.
> > In the kernel we can avoid writing dead-loop code, but userland
> > code is not trustable.
>
> Note that you have to be root to be rtprio, and that there is trustable
> userland code (just because you haven't used any doesn't mean it doesn't
> exist).
>
> > If you search for "Realtime thread priorities" in the 2010-December
> > archives of the @arch list, you may find the argument.
>
> I think the bug here is that sched_sleep() should not lower the priority
> of an rtprio process.  It should arguably not raise the priority of an
> idprio process either, but sched_sleep() should probably only apply to
> timesharing threads.
>
> All that said, userland rtprio code is going to have to be careful.  It
> should be using things like wired memory as Kostik suggested, and
> probably avoiding most system calls.  You can definitely blow your foot
> off quite easily in lots of ways with rtprio.
>
> --
> John Baldwin