Date: Thu, 7 Jan 1999 18:19:34 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Alfred Perlstein <bright@hotjobs.com>
Cc: Terry Lambert <tlambert@primenet.com>, dyson@iquest.net, pfgiffun@bachue.usc.unal.edu.co, freebsd-hackers@FreeBSD.ORG
Subject: Re: questions/problems with vm_fault() in Stable
Message-ID: <199901080219.SAA36593@apollo.backplane.com>
:...
:..
:[snip]
:
:MFS is just laziness, if you want it to grow/work right, you rewrite it
:instead of hacking FFS on top of it.
:
:or you design it in such a way that it's a device that FFS is somewhat
:aware of. this way when a block is asked to be filled what really happens
:is that the block passed in is put on the free block list and FFS is given
:a page of the MFS to use, when FFS pushes the block back to MFS the
:replaced page is put back under the vnode.
:
:MFS then becomes a device instead of a filesystem.
:
:although i think it violates some abstraction, does this make sense?
:
:-Alfred
This actually does make some sense. What you are basically saying
is that it should be possible for the MFS device to rename a
VM page cached at a lower layer (mfsobj,page#) to a higher layer
(ffs_sub_object,page#).
This isn't possible with the current VFS layering. That is, the
current VFS layering will pass a KVA mapped buffer down, but it
does not expect the lower layer to physically replace the pages
associated with the buffer with its own pages. Also, while the
clean/dirty state of the page could be retained, the relationship
with the backing store of the lower layer's page would be lost when it
renames the page (backing store works differently depending on the
type of object and cannot be transported across VFS layers). The
page's clean, dirty, or TBD (to be destroyed) state would have to
eventually be passed back down to the lower layer when the upper layer
is done with it... an extremely dangerous proposition.
Implementing a vm_alias would solve half the problem - the lower layer
would no longer have to 'lose' its reference to the page, and the
upper layer can manipulate the pages in its own object space without
having to worry about odd interactions with other layers. If the
VFS/BIO system were then changed to *NOT* pass KVA buffers down but
instead work solely with bp->b_pages[] arrays, then the upper layer
could theoretically instantiate vm_aliases in the array that are
initially not associated with any real VM page and pass that down to
the lower layer. The lower layer could then simply [re]link the
vm_aliases into the proper VM page chains, allocating new physical
pages as necessary.
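
To make the b_pages[]-only hand-off concrete, here is a minimal
user-space sketch. Everything in it is hypothetical: sk_page, sk_alias,
sk_buf, sk_buf_init() and sk_lower_fill() are illustration names, not
existing FreeBSD structures or calls, and the real vm_alias (if it ever
takes this form) would live in the kernel with proper locking. The upper
layer creates aliases that point at no page, and the lower layer links
them to pages it keeps in its own store, allocating when needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these exist in FreeBSD under these names. */
struct sk_page {                 /* models a real (root) VM page */
    int refs;                    /* number of aliases linked to this page */
    int dirty;                   /* general dirty state of the page */
};

struct sk_alias {                /* models the proposed vm_alias */
    struct sk_page *page;        /* NULL until a lower layer links it */
    int dirty;                   /* this layer's own clean/dirty state */
};

#define SK_NPAGES 4

struct sk_buf {                  /* models a buf that carries only b_pages[] */
    struct sk_alias *b_pages[SK_NPAGES];
    int b_npages;
};

/* Upper layer: instantiate aliases that are not yet backed by any page. */
static int sk_buf_init(struct sk_buf *bp, int npages)
{
    int i;

    assert(bp != NULL);
    if (npages < 0 || npages > SK_NPAGES)
        return -1;
    bp->b_npages = 0;
    for (i = 0; i < SK_NPAGES; i++)
        bp->b_pages[i] = NULL;
    for (i = 0; i < npages; i++) {
        bp->b_pages[i] = calloc(1, sizeof(struct sk_alias));
        if (bp->b_pages[i] == NULL)
            return -1;
        bp->b_npages++;
    }
    return 0;
}

/*
 * Lower layer: link each unlinked alias to the page it holds for that
 * page number, allocating a new page when it has none. The lower layer
 * keeps its own pointer in store[], so it never loses its reference.
 */
static int sk_lower_fill(struct sk_buf *bp, struct sk_page *store[SK_NPAGES])
{
    int i;

    assert(bp != NULL && store != NULL);
    for (i = 0; i < bp->b_npages; i++) {
        struct sk_alias *al = bp->b_pages[i];

        if (al->page != NULL)
            continue;            /* already linked */
        if (store[i] == NULL) {
            store[i] = calloc(1, sizeof(struct sk_page));
            if (store[i] == NULL)
                return -1;
        }
        al->page = store[i];
        store[i]->refs++;
    }
    return 0;
}
```

Freeing is left out to keep the sketch short. The point is only that
no KVA mapping is passed down, and that both layers end up holding a
reference to the same page.
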
If we were to do that, then we would have about 70% of the cache
coherency problem solved too - 90% if we discount crossing a network.
If vm_alias teardown always occurs from the top down (either by
the devices or by the vm_pageout process), pages passed back and
forth in this manner would be cache coherent within any given machine.
The exceptions would be, mainly, file fragments less than a page in
size. Each alias would be able to maintain its own clean/dirty state
to optimize teardown operations (there would also be a general dirty
state in the root vm_page).
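
The top-down teardown rule and the per-alias dirty state can be
sketched as follows. This is a user-space toy under stated assumptions:
td_page, td_alias, td_push() and td_teardown() are made-up names, and
folding an alias's dirty bit into the root page's general dirty state
at teardown is one plausible reading of the above, not a settled design.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the root vm_page and its aliases. */
struct td_page;

struct td_alias {
    struct td_alias *below;      /* next alias down, toward the owning layer */
    struct td_page  *page;       /* root page this alias refers to */
    int              dirty;      /* this layer's own clean/dirty state */
};

struct td_page {
    struct td_alias *top;        /* topmost alias currently linked */
    int              dirty;      /* general dirty state of the root page */
};

/* Link a new alias on top of the existing ones (a layer higher up). */
static void td_push(struct td_page *pg, struct td_alias *al)
{
    assert(pg != NULL && al != NULL);
    al->page = pg;
    al->dirty = 0;
    al->below = pg->top;
    pg->top = al;
}

/*
 * Tear down one alias. Only the topmost alias may go, which enforces
 * top-down order. Returns 0 on success, -1 if the request is out of order.
 */
static int td_teardown(struct td_alias *al)
{
    struct td_page *pg;

    assert(al != NULL);
    pg = al->page;
    if (pg == NULL || pg->top != al)
        return -1;               /* a higher layer still holds an alias */
    if (al->dirty)
        pg->dirty = 1;           /* assumption: fold into the root state */
    pg->top = al->below;
    al->below = NULL;
    al->page = NULL;
    al->dirty = 0;
    return 0;
}
```

A clean alias tears down without touching the root page at all, which
is the optimization the per-alias state is meant to buy.
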
So to round out the solution a two-way cache coherency protocol is
required on top of the vm_aliasing. The protocol is necessary to handle
both special cases like file fragments and changes in coherency that
propagate from the bottom up (for example, if some other host modifies
the same file on the NFS server that you are messing with). If we make
this protocol slightly more complex, it could be made to work over a
network hop as well as internally.
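
As a rough illustration of what a two-way message might look like,
here is a sketch with byte-range granularity (so sub-page fragments can
be named) and fixed-width fields (so the same record could in principle
cross a network hop). The names co_msg, co_cache_page and co_apply(),
the two operations, and the policy of dropping local state on an upward
invalidate are all assumptions made for the example; none of this is an
existing protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message directions and operations. */
enum co_dir { CO_DOWN = 0, CO_UP = 1 };
enum co_op  { CO_OP_WRITEBACK = 1, CO_OP_INVALIDATE = 2 };

struct co_msg {
    uint8_t  dir;                /* CO_DOWN (top-down) or CO_UP (bottom-up) */
    uint8_t  op;                 /* one of enum co_op */
    uint64_t offset;             /* byte offset within the file */
    uint32_t length;             /* byte count; may be less than a page */
};

struct co_cache_page {           /* one layer's cached view of a page */
    uint64_t offset;             /* page-aligned file offset */
    uint32_t size;               /* page size in bytes */
    int      valid;
    int      dirty;
};

/* Apply a message to a cached page. Returns 1 if its state changed. */
static int co_apply(struct co_cache_page *pg, const struct co_msg *m)
{
    assert(pg != NULL && m != NULL);

    /* Ignore messages whose byte range does not overlap this page. */
    if (m->length == 0 ||
        m->offset >= pg->offset + pg->size ||
        m->offset + m->length <= pg->offset)
        return 0;

    if (m->dir == CO_UP && m->op == CO_OP_INVALIDATE) {
        /* Data changed underneath us, e.g. on the NFS server. */
        if (!pg->valid)
            return 0;
        pg->valid = 0;
        pg->dirty = 0;           /* simplification: local changes dropped */
        return 1;
    }
    if (m->dir == CO_DOWN && m->op == CO_OP_WRITEBACK) {
        /* A higher layer pushed modified data down to this layer. */
        pg->valid = 1;
        pg->dirty = 1;
        return 1;
    }
    return 0;
}
```

A real protocol would also need to resolve the case where an
invalidate arrives for a range that is dirty locally. The sketch
simply discards the local state.
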
This is effectively what John and I are putting forth as a solution
(though we are debating other ways of doing the equivalent function
of 'vm_alias').
-Matt
Matthew Dillon Engineering, HiWay Technologies, Inc. & BEST Internet
Communications & God knows what else.
<dillon@backplane.com> (Please include original email in any response)
