Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 28 Jan 1996 15:47:44 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        jau@jau.csc.fi (Jukka Ukkonen)
Cc:        hackers@FreeBSD.org
Subject:   Re: POSIX.4 scheduler interface for FreeBSD-2.1
Message-ID:  <199601282247.PAA01758@phaeton.artisoft.com>
In-Reply-To: <199601281231.OAA00704@jau.csc.fi> from "Jukka Ukkonen" at Jan 28, 96 02:31:40 pm

next in thread | previous in thread | raw e-mail | index | archive | help
> 	What is the benefit of having separate realtime policies at all,
> 	if there still is the possibility of a problem known as priority
> 	inversion?
> 	E.g. a process with relatively high (realtime) priority wishes to
> 	communicate with another process with worse priority using a pipe
> 	in which there is no free buffer space available. Now the process
> 	with higher priority would have to wait for the process with lower
> 	priority to read from the pipe to make room for the data to be
> 	written by the higher priority process. While waiting for the
> 	buffer to drain the process with higher priority could become
> 	pre-empted by a process which has its priority somewhere between
> 	the priorities of the communicating processes.
> 	And so we see the grand surprise, the higher priority process waits
> 	for the lower priority process, which in turn waits for the middle
> 	priority process. The realtime model has just been broken because
> 	the process with the best priority lost its place in the scheduling
> 	queue to a process with lower priority.

The typical soloution for this is to have a real and effective priority
and/or priority class for each process.

When a high priority process is blocked on a resource, the resource
holder has his effective priority advanced from his real priority to
the effective priority of the high priority process.  The effective
rather than the real priority of the high priority process is used
because it may, in turn, be blocking a resource for an even higher
priority process.

Scheduling policy is set by real priority and implemented by effective
priority.

This soloution is called "priority lending".


Lending introduces another potential problem, since the low priority
process may only acquire and unacquire the scarce resource.  That
means that the low priority process could buzz at the lent priority,
preventing other processes from running.

This means that resource release mechanisms musth have the capability
of causing a context switch of a process releasing a resource if there
is a higher priority wait on the resource.

This is accomplished by setting a priority field on the resource equal
to the max of the former (or default, if no former) and the real priority
of each blocker.  This is the "release priority" and is used in two ways:

1)	If the release priority is higher than the real priority of
	the releasing process, that process is involuntarily context
	switched.  This prevents buzz-loop monopolization of a resource.

2)	If a context switch takes place, the context switch will be a
	"switch and release".  On the release, the wakeup will be via
	priority preference instead of thundering herd.  There are a
	number of associated issues, dealing with FIFO ordering of
	equaly priority processes for wakeup() vs. wakeone() processing
	that I will gloss over at this time.

When a context switch takes place, it must be determined if the released
resource should cause the lent priority to be retacted.

This still leaves the possiblity of deadlock, as in any IPC situation,
but allows the high priority.  This is, I think, a problem we can ignore,
though if we wish to deal with this, internally, the kernel resource
locking must be a hiearchical graph over which we compute transitive
closure to prevent deadlock (as opposed to detecting it).  This is
similar to the resource issues in an SMP or kernel multithreading
situation.  A full soloution based on this approach would require a
FILO (stack) based pushing of priority, since low priority process
A may block medium priority process B's resource and be lent B's
priority, only to later block high priority process C's resource.  So
A is lent B's priority, then C's.  If A then releases C's resource,
his effective priroity is returned to A's real priority instead of
the priority lent by B (and B is screwed).  To handle this, the
resource would have to own the queue, and the entry would have to
have the desiring process as well as the lent priority (or just the
desiring process, from which you could look up the lent priority).
The FIFO ordering in #2 above would then need to be able to have
insertionby priority, so there would be multiple entries into the
FIFO (these could be an AVL tree, or bsearched for initial priority
cahnge and linearly traversed to find the actual change point for
the insertion).

Priority inversion is a non-trivial problem.  8-).

I recommend both "UNIX For Modern Architectures" and "UNIX Internals:
The New Frontiers" for advanced discussion of priority inversion.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199601282247.PAA01758>