Date: Wed, 6 Jan 1999 20:03:58 -0800 (PST)
From: Matthew Dillon
Message-Id: <199901070403.UAA27395@apollo.backplane.com>
To: Terry Lambert
Cc: dyson@iquest.net, pfgiffun@bachue.usc.unal.edu.co, freebsd-hackers@FreeBSD.ORG
Subject: Re: questions/problems with vm_fault() in Stable

:> The VFS layer should make no
:> assumptions whatsoever as to who attaches to it on the frontside,
:> and who it is attached to on the backside.
:
:Fine and dandy, if you can tell me the answers to the following
:questions:
:
:1) The system call layer makes VFS calls.  How can I stack a
:   VFS *on top of* the system call layer?

    The system call layer is not a VFS layer.

:2) The NFS server VFS makes RPC calls.  How can I stack a
:   VFS *under* the NFS server VFS?

    An NFS server in its current incarnation is not a VFS layer.

:The problem exists in streams as well.  Somewhere, there has to be a
:stream head.  And on the other end, somewhere there has to be a driver.

    Yes, but these are not VFS layers, and they have nothing to do with
    the question.  The VFS layers do not and should not know or care
    (A) who calls them, or (B) who they call underneath, as long as the
    API and protocol are followed.
    The point is that if you go around trying to assign special
    circumstances to the head or tail of VFS layers, you wind up in the
    same boat we are in now - where VFS layers that were never designed
    to terminate on anything other than a hard block device are now,
    magically, being terminated on a soft block.  For example, UFS/FFS
    was never designed to terminate on memory, much less swap-backed
    memory.  Then MFS came along and suddenly there were (and still
    are) all sorts of problems.

:> frontside 'provider'.  And so forth.  But don't try to 'type' a VFS
:> layer -- it doesn't work.  It was precisely that sort of thinking
:> that required something like the MFS filesystem, which blurs
:> distinctions, to be a major hack in existing kernels.
:
:I'm not trying to 'type' a VFS layer.  The problem is that some
:idiot (who was right) thought it'd be faster to implement block
:access in FS's that need block access, instead of creating a generic
:"stream tail" that implemented the buffer cache interface.
:
:If they had done that, then the VOP_GETPAGES/VOP_PUTPAGES would
:directly access the VOP_GETBLOCKRANGE/VOP_PUTBLOCKRANGE of the

    There's nothing wrong with block access - it is, after all, what
    most VFS layers expect.  But to implement block access only through
    VOP_GETPAGES/PUTPAGES is insane.  That's why the vm_object model
    needs to be extended to encompass VFS layering - so the VM system
    can use its ability to cache pages to shortcut (when possible)
    multiple VFS layers, to maintain cache coherency between VFS
    layers, and to get the efficiency that cache coherency gives you -
    much less memory waste.

    The GETPAGES/PUTPAGES model *cannot* maintain cache coherency
    across VFS layers.  It doesn't work.  It has never worked.  That's
    the fraggin problem!

:> The only way to do cache coherency through a multi-layered VFS design
:> is to extend the vm_object model.
:> You *cannot* require that a VM
:> system use VOP_GETPAGES or VOP_PUTPAGES whenever it wants to verify
:> the validity of a page it already has in the cache.  If a page is sitting
:...
:
:OK.  You are considering the case where I have two vnodes pointing
:to the same page, and I invalidate the page in the underlying vnode,
:and asking "how do I make the reference in the upper vnode go away?",
:right?
:
:The way you "make the reference in the upper vnode go away" is by
:not putting a blessed reference there in the first place.  Problem
:solved.  No coherency problem because the problem page is not
:cached in two places.

    Uh, I think you missed the point.  What you are basically saying
    is: "I don't want cache coherency"... because that is what you get.
    That is, in fact, what we have now, and it means that things like
    MFS waste a whole lot of memory double-caching pages, and that it
    is not possible to span VFS layers across a network in any
    meaningful way.

:The page's validity is known by whether or not its valid bit is
:set.  What you *do* have to do is go through the routines for
:VOP_GETPAGES/VOP_PUTPAGES if you want to change the status of a

    This doesn't work if the VFS layering traverses a network.
    Furthermore, it is *extremely* inefficient.  Hell, it doesn't
    maintain cache coherency even if you DO use
    VOP_GETPAGES/PUTPAGES.

    Now, Terry, if you are arguing that we don't need cache coherency,
    then ok... but if you are arguing that we should have cache
    coherency, you need to reexamine the problem.  John and I are
    arguing that a lack of cache coherency (covering both data pages
    and filesystem ops) is a serious stumbling block that needs to be
    addressed.  Big time.

:					Terry Lambert
:					terry@lambert.org

					Matthew Dillon
					Engineering, HiWay Technologies, Inc.
					& BEST Internet Communications
					& God knows what else.
					(Please include original email in any response)

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message