From owner-freebsd-hackers Thu Jan 7 18:20:15 1999
Date: Thu, 7 Jan 1999 18:19:34 -0800 (PST)
From: Matthew Dillon
Message-Id: <199901080219.SAA36593@apollo.backplane.com>
To: Alfred Perlstein
Cc: Terry Lambert, dyson@iquest.net, pfgiffun@bachue.usc.unal.edu.co, freebsd-hackers@FreeBSD.ORG
Subject: Re: questions/problems with vm_fault() in Stable

:...
:..
:[snip]
:
:MFS is just laziness, if you want it to grow/work right, you rewrite it
:instead of hacking FFS on top of it.
:
:or you design it in such a way that it's a device that FFS is somewhat
:aware of. this way when a block is asked to be filled what really happens
:is that the block passed in is put on the free block list and FFS is given
:a page of the MFS to use, when FFS pushes the block back to MFS the
:replaced page is put back under the vnode.
:
:MFS then becomes a device instead of a filesystem.
:
:although i think it violates some abstraction, does this make sense?
:
:-Alfred

This actually does make some sense. What you are basically saying is
that it should be possible for the MFS device to rename a VM page cached
at a lower layer (mfsobj,page#) to a higher layer (ffs_sub_object,page#).
This isn't possible with the current VFS layering.
That is, the current VFS layering will pass a KVA mapped buffer down, but
it does not expect the lower layer to physically replace the pages
associated with the buffer with its own pages. Also, while the clean/dirty
state of the page could be retained, the relationship with the lower
layer's page's backing store would be lost when it renames the page
( backing store works differently depending on the type of object and
cannot be transported across VFS layers ). The page's clean, dirty, or TBD
(to be destroyed) state would have to eventually be passed back down to
the lower layer when the upper layer is done with it... an extremely
dangerous proposition.

Implementing a vm_alias would solve half the problem - the lower layer
would no longer have to 'lose' its reference to the page, and the upper
layer could manipulate the pages in its own object space without having to
worry about odd interactions with other layers.

If the VFS/BIO system were then changed to *NOT* pass KVA buffers down but
instead work solely with bp->b_pages[] arrays, then the upper layer could
theoretically instantiate vm_aliases in the array that are initially not
associated with any real VM page and pass that down to the lower layer.
The lower layer could then simply [re]link the vm_aliases into the proper
VM page chains, allocating new physical pages as necessary.

If we were to do that, then we would have about 70% of the cache coherency
problem solved too - 90% if we discount crossing a network. If vm_alias
teardown always occurs from the top down (either by the devices or by the
vm_pageout process), pages passed back and forth in this manner would be
cache coherent within any given machine. The exceptions would be, mainly,
file fragments less than a page in size. Each alias would be able to
maintain its own clean/dirty state to optimize teardown operations ( there
would also be a general dirty state in the root vm_page ).
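To make the alias idea above a little more concrete, here is a rough
userland sketch of the bookkeeping it implies. Everything in it is an
assumption made for illustration: toy_page, toy_alias and the three
functions are hypothetical names, not existing FreeBSD kernel interfaces,
and the sketch ignores locking, object chains and backing store entirely.
It only shows an alias that starts out unlinked, a lower layer that links
it to a page (allocating one if needed), and a top-down teardown that
folds the alias's own dirty state into the root page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy model; none of these names are real FreeBSD APIs. */

struct toy_page {               /* stands in for the root vm_page */
    int alias_count;            /* aliases currently linked to this page */
    int dirty;                  /* general dirty state kept in the root */
};

struct toy_alias {              /* stands in for the proposed vm_alias */
    struct toy_page *page;      /* NULL until a lower layer links it */
    int dirty;                  /* per-alias dirty state */
};

/* Upper layer: create an alias not yet associated with any real page. */
static struct toy_alias *toy_alias_create(void)
{
    return calloc(1, sizeof(struct toy_alias));
}

/*
 * Lower layer: link the alias to the page held in *slot, allocating a
 * new page if the slot is empty. Returns 0 on success, -1 if the
 * allocation fails.
 */
static int toy_alias_link(struct toy_alias *a, struct toy_page **slot)
{
    assert(a != NULL && slot != NULL && a->page == NULL);
    if (*slot == NULL) {
        *slot = calloc(1, sizeof(struct toy_page));
        if (*slot == NULL)
            return -1;
    }
    a->page = *slot;
    (*slot)->alias_count++;
    return 0;
}

/*
 * Teardown from the top: fold the alias's dirty state into the root
 * page, drop the link, and free the alias. The root page stays with
 * the lower layer.
 */
static void toy_alias_teardown(struct toy_alias *a)
{
    if (a->page != NULL) {
        if (a->dirty)
            a->page->dirty = 1;
        a->page->alias_count--;
    }
    free(a);
}
```

In this model the lower layer never gives up its page: the upper layer
only ever holds an alias, so nothing has to be handed back down when the
upper layer is finished apart from the dirty bit.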
So to round out the solution, a two-way cache coherency protocol is
required on top of the vm_aliasing. The protocol is necessary to handle
both special cases like file fragments, and changes in coherency that
propagate from the bottom up ( for example, if some other host modifies
the same file on the NFS server that you are messing with ). If we make
this protocol slightly more complex, it could be made to work over a
network hop as well as internally.

This is effectively what John and I are putting forth as a solution
(though we are debating other ways of doing the equivalent function of
'vm_alias').

-Matt

Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet
Communications & God knows what else.

(Please include original email in any response)