From: David Xu <davidxu@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Cc: arch@freebsd.org
Date: Sat, 11 Dec 2010 09:51:08 +0800
Subject: Re: Realtime thread priorities

John Baldwin wrote:
> So I finally had a case today where I wanted to use rtprio, but it doesn't
> seem very useful in its current state.  Specifically, I want to be able to
> tag certain user processes as being more important than any other user
> processes, even to the point that if one of my important processes blocks
> on a mutex, the owner of that mutex should be more important than sshd
> being woken up from sbwait by new data (for example).  This doesn't work
> currently with rtprio due to the way the priorities are laid out (and I
> believe I probably argued for the current layout back when it was
> proposed).
>
> The current layout breaks up the global thread priority space (0 - 255)
> into a couple of bands:
>
>     0 -  63 : interrupt threads
>    64 - 127 : kernel sleep priorities (PSOCK, etc.)
>   128 - 159 : real-time user threads (rtprio)
>   160 - 223 : time-sharing user threads
>   224 - 255 : idle threads (idprio and kernel idle procs)
>
> The problem I am running into is that when a time-sharing thread goes to
> sleep in the kernel (waiting on select, socket data, tty, etc.) it
> actually ends up in the kernel priorities range (64 - 127).  This means
> that when it wakes up it will trump (and preempt) a real-time user thread
> even though these processes nominally have a priority down in the
> 160 - 223 range.  We do drop the kernel sleep priority during userret(),
> but we don't recheck the scheduler queues to see if we should preempt the
> thread during userret(), so it effectively runs with the kernel sleep
> priority for the rest of the quantum while it is in userland.
>
> My first question is whether this is the desired behavior.  Originally I
> think I preferred the current layout because I thought a thread in the
> kernel should always have priority so it can release locks, etc.  However,
> priority propagation should actually handle the case of some very
> important thread needing a lock.  In my use case today where I actually
> want to use rtprio, I think I want different behavior where the rtprio
> thread is more important than the thread waking up with PSOCK, etc.
>
> If we decide to change the behavior I see two possible fixes:
>
> 1) (easy) just move the real-time priority range above the kernel sleep
>    priority range.

This is not always correct: a userland real-time process is not
necessarily more urgent than normal time-sharing code that is backing up a
file system or doing something important such as receiving financial
account data from a socket.  A process sleeping in the kernel is often
doing genuinely important work, for example pulling data out of a device
after an interrupt or writing to a device, while a real-time thread eating
100% of the CPU may simply be stuck in an endless loop.
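(For concreteness, the tagging John describes is what rtprio(2) already
provides from userland.  A minimal sketch, with the priority value chosen
arbitrarily and only trivial error handling, would look roughly like this:

#include <sys/types.h>
#include <sys/rtprio.h>

#include <err.h>

int
main(void)
{
	struct rtprio rtp;

	/* Ask for the real-time class; prio 0 is the most urgent level. */
	rtp.type = RTP_PRIO_REALTIME;
	rtp.prio = 0;			/* arbitrary choice for this sketch */

	/* pid 0 means "apply to the calling process". */
	if (rtprio(RTP_SET, 0, &rtp) == -1)
		err(1, "rtprio");

	/* ... latency-sensitive work runs here in the 128 - 159 band ... */
	return (0);
}

The complaint in this thread is not about that interface itself, but about
where the band it maps to (128 - 159) sits relative to the kernel sleep
priorities (64 - 127).)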
> 2) (harder) make sched_userret() check the run queue to see if it should
>    preempt when dropping the kernel sleep priority.  I think bde@ has
>    suggested that we should do this for correctness previously (and I've
>    had some old, unfinished patches to do this in a branch in p4 for
>    several years).

This adds too much overhead.  Try it and benchmark it with real-world
applications first.
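(To make the cost of (2) concrete, here is a rough sketch of the shape such
a check might take in sched_userret().  The body is approximated from
memory, and higher_prio_runnable() is a hypothetical placeholder for
whatever ULE- or 4BSD-specific run-queue probe would really be needed; that
probe, taken on every return to userland, is exactly the overhead in
question:

/*
 * Sketch only, not a patch: the existing part of sched_userret() is
 * approximated, and higher_prio_runnable() is a made-up helper standing
 * in for a scheduler-specific run-queue check.
 */
void
sched_userret(struct thread *td)
{
	thread_lock(td);
	/* Existing behavior: drop the borrowed kernel sleep priority. */
	td->td_priority = td->td_user_pri;
	td->td_base_pri = td->td_user_pri;
	/*
	 * Proposed addition for (2): after the drop, see whether some
	 * runnable thread now has a better priority and, if so, switch
	 * immediately instead of finishing the quantum in userland at
	 * the old level.
	 */
	if (higher_prio_runnable(td->td_priority))
		mi_switch(SW_INVOL | SWT_PREEMPT, NULL);
	thread_unlock(td);
}
)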