Date:      Wed, 07 Jul 1999 20:06:33 -0700
From:      Jason Thorpe <thorpej@nas.nasa.gov>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        Julian Elischer <julian@whistle.com>, David Greenman <dg@root.com>, freebsd-hackers@FreeBSD.ORG, freebsd-current@FreeBSD.ORG
Subject:   Re: Heh heh, humorous lockup 
Message-ID:  <199907080306.UAA21280@lestat.nas.nasa.gov>

On Wed, 7 Jul 1999 18:21:03 -0700 (PDT) 
 Matthew Dillon <dillon@apollo.backplane.com> wrote:

 >     Now, I also believe that when UVM maps those pages, it makes them 
 >     copy-on-write so I/O can be initiated on the data without having to
 >     stall anyone attempting to make further modifications to the VM object.
 >     Is this correct?  This is something I would like to throw into FreeBSD
 >     at some point.  It would get rid of all the freeze/bogus-page hacks
 >     already in there and avoid a number of I/O blocking conditions that we 
 >     currently face. 

Um...

In UVM+UBC, VOP_GETPAGES() and VOP_PUTPAGES() operate on pages marked w/
PG_BUSY.  In the case of faulting a page in, the mapping isn't yet entered
into the VA at which it will be accessed, and in the case of a page being
paged out, the page has been deactivated (and thus has had all mappings
removed) and marked PG_BUSY.  Thus, if a fault which would reactivate the
page occurs, the fault handler waits for PG_BUSY to clear before reentering
the mapping at the VA where the page will be accessed.

Now, while the fault handler is waiting for PG_BUSY to clear, something else
can certainly modify the object... But in the case of faulting on the same
page, the second thread will wait for PG_BUSY to clear, too, since the page
has already been inserted into the object.
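To make the ordering concrete, here is a toy, single-threaded model of
the PG_BUSY protocol described above.  It is only a sketch: the toy_*
names, the struct layout and the PG_BUSY value are made up for
illustration and are not UVM's actual definitions, and a real kernel
would sleep on the page and be woken up rather than return a "wait"
result.

```c
#include <assert.h>

#define PG_BUSY 0x01u   /* illustrative value: page is owned by in-flight I/O */

/* Hypothetical stand-in for a VM page; not the real struct vm_page. */
struct toy_page {
    unsigned flags;     /* PG_BUSY or 0 */
    int      mappings;  /* number of VA mappings currently entered */
};

enum fault_result { FAULT_WAIT, FAULT_MAPPED };

/* Start of pageout: deactivate (drop all mappings), then mark the page busy. */
void toy_pageout_start(struct toy_page *pg)
{
    assert(!(pg->flags & PG_BUSY));  /* only one I/O may own the page */
    pg->mappings = 0;
    pg->flags |= PG_BUSY;
}

/* I/O completion: clear PG_BUSY (a real kernel would also wake any waiters). */
void toy_io_done(struct toy_page *pg)
{
    pg->flags &= ~PG_BUSY;
}

/*
 * One fault attempt on the page.  While PG_BUSY is set the mapping is not
 * entered and the caller is told to wait; once it is clear, the mapping is
 * (re)entered.
 */
enum fault_result toy_fault(struct toy_page *pg)
{
    if (pg->flags & PG_BUSY)
        return FAULT_WAIT;
    pg->mappings++;
    return FAULT_MAPPED;
}
```

In this model, any number of faults arriving between toy_pageout_start()
and toy_io_done() get FAULT_WAIT and leave the page unmapped; the first
fault after toy_io_done() enters the mapping.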

The pager's KVA for the page is read/write, for obvious reasons[*].

[*] Mapping a faulting page *at all* is suboptimal, of course.  You'd
prefer to see if the device can handle a physical address (or a uio with
physical addresses), and if so, use that.  This is faster, and eliminates
bad cache interactions on VAC systems.  You really only want to map the
page if you have to do PIO to/from it.

This is all handled via the ubc_pager (which has a special fault routine,
part of UVM's basic infrastructure).  VOP_GETPAGES() and VOP_PUTPAGES()
are basically helper routines for the ubc_pager (effectively turning the
file systems themselves into "pagers").

 >     However, I do not like the idea of taking page faults in kernel mode, 
 >     which I believe UVM also does -- but I think the above could be 
 >     implemented in FreeBSD without taking page faults.

Taking page faults in kernel mode is a perfectly reasonable thing to
do, if you don't need to access those addresses in interrupt context.

What is the reason for your aversion to pageable mappings in the kernel?

 >     Well, I do not like the "nuke the object chains" part of UVM.  From what
 >     I can tell UVM is doing a considerable amount of extra work to avoid the 
 >     object chain stuff, but only saving a small amount of overhead on
 >     vm_fault's ( though, compared to the original Mach stuff the UVM stuff is
 >     much, much better ).  We've made a considerable number of improvements
 >     to our vm_object's in the last few months.  But I do like the idea 
 >     of a VM-specific substructure for vnodes and I do agree that embedding
 >     the master VM object in the vnode is a good idea.

Nuking object chains actually made things *simpler*.  The locking protocol,
in the face of object chains, is nightmarish.  With amap-on-top-object-on-
bottom, it's simple, makes fault handling quite fast, and eliminates all
the complexity otherwise necessary in collapsing those nasty object chains
(where the various objects in the chain may be referenced by more than one
map entry).

...and, in some cases, it's NOT a "small amount of overhead".  Whereas the
Mach object chains may be of arbitrary length (think about a program which
forks often), potentially involving a hash computation for each object
in the chain, the UVM case is always, at the most, two layers (in the
current amap implementation, the amap lookup is always a direct index into
an array, and the object underneath is a hash lookup).
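The difference in lookup depth can be sketched with a toy model.  This
is an illustration under simplifying assumptions, not UVM or Mach code:
the toy_* names are hypothetical, each object's page lookup is modelled
as a plain array index rather than a hash, and "probes" simply counts
how many layers were consulted.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NPAGES 8

/* Hypothetical object: pages[i] == 0 means "not resident in this object". */
struct toy_object {
    int pages[TOY_NPAGES];
    struct toy_object *shadow;  /* next object in a Mach-style chain, or NULL */
};

/* Mach-style lookup: walk the shadow chain until some object has the page. */
int toy_chain_lookup(const struct toy_object *obj, size_t idx, int *probes)
{
    assert(idx < TOY_NPAGES);
    *probes = 0;
    for (; obj != NULL; obj = obj->shadow) {
        (*probes)++;            /* one (potentially hashed) lookup per object */
        if (obj->pages[idx] != 0)
            return obj->pages[idx];
    }
    return 0;
}

/* UVM-style map entry: an amap layer on top, at most one object underneath. */
struct toy_entry {
    int anons[TOY_NPAGES];          /* amap layer: direct array index */
    const struct toy_object *obj;   /* bottom layer; its shadow is never followed */
};

/* Two-layer lookup: check the amap slot first, then the single backing object. */
int toy_two_layer_lookup(const struct toy_entry *e, size_t idx, int *probes)
{
    assert(idx < TOY_NPAGES);
    *probes = 1;
    if (e->anons[idx] != 0)
        return e->anons[idx];
    if (e->obj != NULL) {
        *probes = 2;
        return e->obj->pages[idx];
    }
    return 0;
}
```

With a three-object chain whose page lives in the bottom object, the
chain walk consults three objects, and that count grows with every
additional shadow object.  The two-layer lookup never consults more
than two layers, however many times the process has forked.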

        -- Jason R. Thorpe <thorpej@nas.nasa.gov>






