From owner-freebsd-smp  Fri Apr 27 11:24:53 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67])
	by hub.freebsd.org (Postfix) with ESMTP
	id 1833B37B423; Fri, 27 Apr 2001 11:24:49 -0700 (PDT)
	(envelope-from dillon@earth.backplane.com)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.2/8.11.2) id f3RIOlW10289;
	Fri, 27 Apr 2001 11:24:47 -0700 (PDT)
	(envelope-from dillon)
Date: Fri, 27 Apr 2001 11:24:47 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
Message-Id: <200104271824.f3RIOlW10289@earth.backplane.com>
To: Terry Lambert <tlambert@primenet.com>
Cc: bright@wintelcom.net (Alfred Perlstein), smp@FreeBSD.ORG,
	jhb@FreeBSD.ORG
Subject: Re: that vm diff now makes it into single user mode.
References:  <200104271803.LAA02107@usr07.primenet.com>
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

:FWIW:
:
:The correct approach to this is to lock the vnode, and treat
:everything lower than that as "opaque".  If you can lock the
:vnode, you can make any call you want on it.
:
:Part of the problem here is some of the earlier "clean up"
:work that damaged the VFS interface.
:
:Really, the locks need to be asserted in the VOP_* macro,
:prior to calldown, and the locks need to be reentrant for
:the time being, for the same VFS.
:
:					Terry Lambert
:					terry@lambert.org

    Eeek!  This isn't possible with our code base, Terry.  What we really
    need to do is separate out read/write/truncate operations and make
    them functions of the VM object rather than the vnode.

   direct API     direct API
      |              |
      V              V
    vnode -+--> VM object ---> filesystem (read/write/truncate)
           |
           +-----------------> filesystem (other ops)

    The problem we face now is that the VM system mostly bypasses the
    vnode: it bypasses it entirely for anything that does not require
    actual I/O.  So you have a lock reversal situation where VOP operations
    lock the vnode and then might lock the VM object, while VM operations
    (would have to) lock the VM object and then lock the vnode.  When you
    add vm_maps and the buffer cache into the fray it gets even worse.
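
    To make the reversal concrete, here is a minimal single-threaded
    sketch that just records which lock was taken while another was held
    and flags the opposite order.  The names (lock_class, record_order,
    vop_path, vm_path) are hypothetical and are not kernel identifiers;
    this only illustrates the ordering conflict, it is not how the kernel
    tracks locks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes; illustrative only, not kernel identifiers. */
enum lock_class { LK_VNODE, LK_VMOBJECT, LK_NCLASSES };

/* seen_before[a][b] is true once some path has taken b while holding a. */
static bool seen_before[LK_NCLASSES][LK_NCLASSES];

/*
 * Record that `inner` was acquired while `outer` was held.
 * Returns false if the opposite order was recorded earlier (a reversal).
 */
static bool record_order(enum lock_class outer, enum lock_class inner)
{
    assert(outer != inner);
    if (seen_before[inner][outer])
        return false;
    seen_before[outer][inner] = true;
    return true;
}

/* VOP path: lock the vnode, then possibly the VM object. */
static bool vop_path(void)
{
    return record_order(LK_VNODE, LK_VMOBJECT);
}

/* VM path that needs I/O: lock the VM object, then the vnode. */
static bool vm_path(void)
{
    return record_order(LK_VMOBJECT, LK_VNODE);
}
```

    Calling vop_path() and then vm_path() makes the second call report a
    reversal: each order is fine alone, but the two cannot coexist.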

    The solution is to funnel all VMIO operations (read, write, and
    truncate/extend) through the VM object always.

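    Now a vnode op can lock the vnode and then, if it is a read, write, or
    truncate, lock the VM object.  A direct VMIO operation can bypass the
    vnode and lock the VM object directly (i.e. it never needs to lock the
    vnode).  Every path then takes the locks in the same order, vnode
    before VM object, so the reversal goes away.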
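
    A rough userland sketch of that ordering, using pthread mutexes as
    stand-ins for the kernel locks.  Everything here (vm_object_sketch,
    vnode_sketch, vmio_write, vop_write_sketch, the 64-byte backing
    array) is made up for illustration and is not the actual VM or VFS
    interface.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a VM object that owns the file data. */
struct vm_object_sketch {
    pthread_mutex_t lock;
    char data[64];
    size_t size;
};

/* Hypothetical stand-in for a vnode that points at its VM object. */
struct vnode_sketch {
    pthread_mutex_t lock;
    struct vm_object_sketch *obj;
};

/*
 * Direct VMIO write: takes only the object lock, never the vnode lock.
 * Returns the number of bytes copied (clipped to the backing array).
 */
static size_t vmio_write(struct vm_object_sketch *obj, size_t off,
                         const char *buf, size_t len)
{
    size_t n = 0;

    assert(obj != NULL && buf != NULL);
    pthread_mutex_lock(&obj->lock);
    if (off < sizeof(obj->data)) {
        n = sizeof(obj->data) - off;
        if (n > len)
            n = len;
        memcpy(obj->data + off, buf, n);
        if (off + n > obj->size)
            obj->size = off + n;    /* a write past EOF extends the object */
    }
    pthread_mutex_unlock(&obj->lock);
    return n;
}

/*
 * VOP-style write: vnode lock first, then funnel through the VM object,
 * which takes its own lock.  The order is always vnode -> object.
 */
static size_t vop_write_sketch(struct vnode_sketch *vp, size_t off,
                               const char *buf, size_t len)
{
    size_t n;

    pthread_mutex_lock(&vp->lock);
    n = vmio_write(vp->obj, off, buf, len);
    pthread_mutex_unlock(&vp->lock);
    return n;
}
```

    Neither path ever holds the object lock while asking for the vnode
    lock, which is the whole point of the funnel.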

    This allows us to separate meta filesystem operations from reads, writes,
    and truncate/extend operations, which in turn allows us to optimize
    reads, writes, and truncate/extend operations through the VM object
    (which is well suited for such optimizations) rather than having to build
    such optimizations into the vnode (which is not well suited for them).
    This also allows the VM object to serve as the cache for all I/O
    operations rather than having to go deep into the filesystem code and
    then access the VM page cache indirectly through the buffer cache.

						-Matt

