Date: Sat, 19 Dec 1998 01:53:06 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Don Lewis <Don.Lewis@tsc.tdk.com>
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: asleep()/await(), M_AWAIT, etc...
Message-ID: <199812190953.BAA07138@apollo.backplane.com>
References: <199812190844.AAA11936@salsa.gv.tsc.tdk.com>
:On Dec 17, 12:05am, Matthew Dillon wrote:
:} Subject: asleep()/await(), M_AWAIT, etc...
:
:} We add an await() kernel function. This function initiates any timeout
:} and puts the process to sleep, but only if it is still on a sleep queue.
:} If someone (i.e. an interrupt) wakes up the sleep address after the
:} process calls asleep() but before it calls await(), the slpque is
:} cleared and the await() winds up being a NOP.
:
:How likely is this to happen if the process doesn't go to sleep for some
:other reason in between the asleep() and the await()? The CPU can execute
:a *lot* of code in the time it takes for physical I/O to happen.
Well, the idea is for asleep() to not interfere with a normal sleep.
If a process does an asleep() and then, for some reason, does a
normal sleep or another asleep() without waiting for the prior event
to occur, the original asleep() condition is lost and an await() later
on that, code-wise, was expecting to wait for the condition earmarked
by the original asleep() will not wait for it, instead causing an
immediate return and thus an immediate retry. This shouldn't cause
a problem, though.
The chance of a condition being signalled after an asleep() but before
the associated await(), assuming no blocking in between, is not very
high, but I expect it would happen under normal operating conditions
maybe 1 out of every 5000 or so uses.
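The two behaviours described above (an early wakeup turning await() into a NOP, and a second asleep() replacing the first) can be sketched with a toy, single-threaded model. This is only an illustration under stated assumptions: the toy_ names and struct toy_proc are hypothetical, nothing actually sleeps, and this is not the proposed kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not the kernel implementation. */
struct toy_proc {
    const void *wchan;          /* event armed by toy_asleep(), or NULL */
};

/* Arm a wait on chan. A new call replaces any earlier armed event. */
static void toy_asleep(struct toy_proc *p, const void *chan)
{
    p->wchan = chan;
}

/* Signal chan: disarm the process only if it is armed on that event. */
static void toy_wakeup(struct toy_proc *p, const void *chan)
{
    if (p->wchan == chan)
        p->wchan = NULL;
}

/*
 * Returns 1 if the caller would really block, 0 if the call is a NOP
 * because nothing is armed any more. Either way the wait is consumed.
 */
static int toy_await(struct toy_proc *p)
{
    int would_block = (p->wchan != NULL);

    p->wchan = NULL;
    return would_block;
}

/* The wakeup lands between asleep and await: await is a NOP. */
static int demo_early_wakeup(void)
{
    struct toy_proc p = { NULL };
    int event = 0;

    toy_asleep(&p, &event);
    toy_wakeup(&p, &event);
    return toy_await(&p);
}

/* A second asleep replaces the first; the later await does not wait. */
static int demo_replaced_wait(void)
{
    struct toy_proc p = { NULL };
    int first = 0, second = 0;

    toy_asleep(&p, &first);
    toy_asleep(&p, &second);    /* 'first' is forgotten */
    assert(p.wchan == &second);
    toy_wakeup(&p, &second);
    (void)toy_await(&p);        /* the wait on 'second' ends */
    return toy_await(&p);       /* meant for 'first': nothing is armed */
}

/* No wakeup before await: the caller would block. */
static int demo_no_wakeup(void)
{
    struct toy_proc p = { NULL };
    int event = 0;

    toy_asleep(&p, &event);
    return toy_await(&p);
}
```

In this model the first two demo functions return 0 (immediate return, so the caller retries) and the third returns 1 (a real sleep).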
The situation becomes more interesting when you get into SMP situations,
especially once we start allowing all N processors to enter into supervisor
mode and run mainstream supervisor code simultaneously. It should be
noted that event interlocks can be done very easily with asleep()/await()
without having to mess with the ipl mask. Since the ipl mask doesn't
work when SMP supervisor operation is allowed on > 1 cpu at a time,
it is just as well that another mechanism exists.
:} The purpose of the new routines is to allow blocking conditions to
:} propagate up a subroutine chain and get handled at a higher level rather
:} than at a lower level in those areas of code that cannot afford to
:} leave exclusive locks sitting around. For example, if bread() blocks
:} waiting for a low level disk I/O on a block device, the vnode remains
:} locked throughout which badly mars potential parallelism when multiple
:} programs are accessing the same file. There is no reason to leave the
:} high level vnode locked while bringing a page into the VM buffer cache!
:
:What happens if some other process decides to truncate the file while
:another process is in the middle of paging in a piece of it? If there
:is no reason to care about this sort of thing, then there is no reason
:to hold the lock across the bread(), which would probably be a simple
Well, in this particular case we don't care because it isn't the pagein
into the process's VM space that we are waiting on, it's the bringing
of the page from the underlying block device into the filesystem cache,
which is independent of the overlaid filesystem structure and was
queued to the disk device on the original attempt.
In the case of a truncate, this higher level operation will not affect
the lower level I/O in progress (or, if it does abort it, will wake up
anybody waiting for that page anyway). The wakeup occurs and the
original requesting task retries its vm fault. On this attempt it
notices the fact that the file has been truncated and does the right
thing. Effectively we are retrying an operation 'from scratch', so
the fact that the truncate occurred is handled properly.
Another indirect use for asleep() would be to unwind locks when an inner
lock cannot be obtained and to then retry the entire sequence later when
the inner lock 'might' become attainable. You do this by asleep()ing on
the event of the inner lock getting unlocked, then popping back through
the call stack and unwinding the locks you were able to get, then
sleeping (calling await()) at the top level (holding no locks) and retrying
when you wake up again. This wouldn't work very well for complex locking
(4 or more levels), but I would guess that it would work quite nicely
for the 2-layer locking that we typically do in the kernel.
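The unwind-and-retry pattern for 2-layer locking might look roughly like the sketch below. It is a single-threaded toy under loud assumptions: struct toy_lock, the toy_ functions, TOY_EAGAIN and locks_held_by_us are all hypothetical, and toy_await() simply pretends that the other holder released the lock while we slept.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_EAGAIN (-1)         /* hypothetical "temporary failure" code */

struct toy_lock { int held; };

static struct toy_lock *armed;  /* lock whose release we asked to wait for */
static int locks_held_by_us;    /* bookkeeping for the demo only */

static int toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return 0;
    l->held = 1;
    locks_held_by_us++;
    return 1;
}

static void toy_unlock(struct toy_lock *l)
{
    l->held = 0;
    locks_held_by_us--;
}

/* Arm a wait on the lock being released; does not block. */
static void toy_asleep(struct toy_lock *l)
{
    armed = l;
}

/* Stand-in for await(): model the other holder releasing the lock. */
static void toy_await(void)
{
    if (armed != NULL) {
        armed->held = 0;
        armed = NULL;
    }
}

static int inner_step(struct toy_lock *inner)
{
    if (!toy_trylock(inner)) {
        toy_asleep(inner);      /* note the event, but do not sleep here */
        return TOY_EAGAIN;
    }
    /* ... work under both locks would go here ... */
    toy_unlock(inner);
    return 0;
}

static int outer_step(struct toy_lock *outer, struct toy_lock *inner)
{
    int error;

    if (!toy_trylock(outer)) {
        toy_asleep(outer);
        return TOY_EAGAIN;
    }
    error = inner_step(inner);
    toy_unlock(outer);          /* unwind on success and on failure */
    return error;
}

/* Top level: sleep holding no locks, then retry the whole sequence. */
static int top_level(struct toy_lock *outer, struct toy_lock *inner)
{
    int attempts = 0;

    for (;;) {
        attempts++;
        if (outer_step(outer, inner) == 0)
            return attempts;
        assert(locks_held_by_us == 0);
        toy_await();
    }
}
```

With the inner lock initially held by someone else, top_level() in this model succeeds on its second attempt; with both locks free it succeeds on the first.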
:} allocation fails would be able to unwind the lock(s), await(), and retry.
:} This is something the current code cannot do at all.
:
:Most things that allocate memory want to scribble on it right after they
:allocate it. Using M_AWAIT would take a fair amount of rewriting. You
:can already do something similar without M_AWAIT by using M_NOWAIT. If
:that fails, unwind the lock, use M_WAITOK, and relock the object. However,
:it would probably be cleaner to just do MALLOC(..., M_WAITOK) before
:grabbing the lock, if possible.
The point here is that if you cannot afford to block in the procedure
that is doing the memory allocation, you may be able to block in a
higher level procedure. M_NOWAIT and M_WAITOK cannot cover that
situation at all. M_AWAIT (which is like M_NOWAIT, except that it also
calls asleep() before returning NULL) *can*. The only implementation
requirement is that the procedure call chain being implemented with
asleep() understand a temporary failure condition and do the right
thing with it (eventually await() and retry from the top level).
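As a rough illustration of that call chain, here is a user-space toy in which an M_AWAIT-style allocation fails, the caller unlocks its object and returns a temporary failure, and the top level awaits and retries. Everything here is an assumption made for the sketch: toy_malloc(), TOY_AWAIT, the one-slot "pool" and struct toy_obj are hypothetical, and toy_await() just pretends memory was freed while we slept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum toy_flag { TOY_NOWAIT, TOY_AWAIT };    /* hypothetical flags */

static int free_slots;      /* toy memory pool: allocations still allowed */
static int wait_armed;      /* set when a TOY_AWAIT allocation fails */

/* Never blocks. With TOY_AWAIT a failure also arms a wait for memory. */
static void *toy_malloc(size_t size, enum toy_flag flag)
{
    if (free_slots == 0) {
        if (flag == TOY_AWAIT)
            wait_armed = 1;
        return NULL;
    }
    free_slots--;
    return malloc(size);
}

/* Stand-in for await(): model memory being freed while we sleep. */
static void toy_await(void)
{
    if (wait_armed) {
        free_slots++;
        wait_armed = 0;
    }
}

struct toy_obj {
    int locked;
    size_t nitems;
    int *items;
};

/* Low level: cannot afford to block while the object is locked. */
static int fill_object(struct toy_obj *o)
{
    int *buf;

    o->locked = 1;
    /* The size is read while the object is locked. */
    buf = toy_malloc(o->nitems * sizeof(*buf), TOY_AWAIT);
    if (buf == NULL) {
        o->locked = 0;      /* unwind before reporting failure */
        return -1;
    }
    o->items = buf;
    o->locked = 0;
    return 0;
}

/* Top level: block with nothing locked, then redo the operation. */
static int fill_with_retry(struct toy_obj *o)
{
    int attempts = 1;

    while (fill_object(o) != 0) {
        assert(!o->locked);
        toy_await();
        attempts++;
    }
    return attempts;
}
```

Because each retry starts over and re-reads nitems under the lock, the allocation size in this sketch always matches the object as it is at the time of the successful attempt.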
:There may be cases where this is not possible. For example, the amount of
:the memory you need to allocate depends on the object that you have locked.
Oh, certainly, but asleep/await do not have to be implemented everywhere,
only in those places where it makes sense to. We aren't removing any
of the prior functionality, we are adding new functionality to allow
us to solve deadlock situations that occur with the old functionality.
:If you have the object unlocked while the memory is being allocated, another
:process may touch the object while it is unlocked and you'll end up allocating
:the wrong amount of memory. The only scheme that works in this case is
:locking the object first and leaving it locked across MALLOC(..., M_WAITOK).
:
:NOTE: some of the softupdates panics before 3.0-RELEASE were caused by
:vnodes inadvertently being unlocked and then relocked in some low level
:routines, which allowed files to be fiddled with by one process while
:another process thought it had exclusive access.
I think you missed the primary point of asleep()/await(). The idea
is that you pop back through subroutine levels, undoing the entire
operation (or a good portion of it), then 'retry later'. What you
describe is precisely the already-existing situation that asleep() and
await() can be used to fix. This might sound expensive, but most
of the places where we would need to use asleep()/await() would not
actually have to pop back more than a few subroutine levels to be
effective.
-Matt
Matthew Dillon Engineering, HiWay Technologies, Inc. & BEST Internet
Communications & God knows what else.
<dillon@backplane.com> (Please include original email in any response)
