From owner-freebsd-current  Sun Aug 29 16:44:43 1999
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2])
	by hub.freebsd.org (Postfix) with ESMTP id 22ED914C3B
	for <current@FreeBSD.ORG>; Sun, 29 Aug 1999 16:44:41 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id QAA07737;
	Sun, 29 Aug 1999 16:44:33 -0700 (PDT)
	(envelope-from dillon)
Date: Sun, 29 Aug 1999 16:44:33 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199908292344.QAA07737@apollo.backplane.com>
To: Dmitrij Tejblum <tejblum@arc.hq.cti.ru>
Cc: Doug Rabson <dfr@nlsystems.com>, current@FreeBSD.ORG
Subject: Re: NFSv3 on freebsd<-->solaris 
References:  <199908292220.CAA00778@tejblum.pp.ru>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:Note that the application can do lseek on the directory, that is, change 
:the next cookie used. It is used by seekdir(). (And, of course, the
:application may lseek to anywhere it likes, and the filesystem will have 
:to deal with the bogus cookie.)
:...
:
:>     * an NFS readdir rpc is stateless and not monotonic.  The server cannot
:>       tell the difference between a new rpc, a retry, or several different
:>       processes on the client scanning the same directory (running at different
:>       points in the directory).
:
:With local applications, VOP_READDIR cannot tell the difference 
:either. There may be several programs scanning one directory, the program 
:may do seekdir(), the only known thing is the uio_offset, that is the 
:cookie.

    First of all, the positional information returned by the various
    directory calls is only good for the life of the open descriptor.
    This descriptor is stateful.

    Under NFS, file descriptors (actually 'handles') are stateless.

:> 
:>     * An NFS readdir rpc can only approximate cache coherency, but that
:>       doesn't mean you can throw cache coherency out the window.  
:
:What cache coherency? No one ever mmap()s a directory, I hope. After the
:getdirentries() syscall has finished, someone may change the directory in 
:any way (just as after a read() call on a regular file). After the nfs 
:readdir reply is sent to the client, someone may change the directory in 
:any way. Again, I don't see any difference. 

    Nobody said anything about mmap().  The client system (a FreeBSD
    client system) has a buffer cache.  The buffer cache holds an abstraction
    for both files and directories.

    Our NFS implementation on the client caches the NFS directory via the
    buffer cache.  It translates the cookies returned by the server to
    a block number and offset as cached in the client's buffer cache.

    See nfs_readdirrpc() in sys/nfs/nfs_vnops.c

    This creates a directory-block abstraction on the client.  The 'cookies'
    the client returns to processes are based on this abstraction and do not
    match the cookies returned by the server.
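
    As a rough illustration of that translation (a toy model in Python,
    not the kernel code; DIRBLKSIZ, its value, and the function names are
    assumptions made up for this sketch), the cookie a process sees is just
    a byte offset into the cached directory blocks:

```python
DIRBLKSIZ = 4096  # assumed size of one cached directory block (illustrative)

def client_cookie(block_no, offset_in_block):
    """Cookie handed to the process: a byte offset into the cached directory."""
    return block_no * DIRBLKSIZ + offset_in_block

def locate(cookie):
    """Inverse mapping: (cached block number, offset inside that block)."""
    return divmod(cookie, DIRBLKSIZ)

# The server's opaque cookies only mark where each cached block starts.
# These values are invented; a real server picks its own.
server_cookie_for_block = {0: 0, 1: 0x5A17, 2: 0x9C40}

block_no, offset = locate(client_cookie(2, 128))
```

    Nothing in the process-visible cookie refers to a server cookie, which
    is why the two sets of cookies do not match.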

    The problem that we have is that our buffer cache abstraction essentially
    fits a variable number of directory entries returned from the server.  If
    a file is created or deleted on the server, our buffer cache abstraction
    gets thrown for a loop.
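
    One way this can go wrong, sketched as a toy model (the server,
    its cookies, and BLOCK_BYTES are all hypothetical; only name lengths
    are counted as entry size): block 0 is refetched after a delete while
    block 1 is still cached, and the two blocks now overlap.

```python
def server_readdir(entries, cookie, max_bytes):
    """Toy readdir: names after `cookie` that fit in max_bytes, plus resume cookie."""
    names, used, next_cookie = [], 0, cookie
    for entry_cookie, name in entries:
        if entry_cookie <= cookie:
            continue                      # already returned by an earlier call
        if used + len(name) > max_bytes:
            break                         # this block is full
        names.append(name)
        used += len(name)
        next_cookie = entry_cookie
    return names, next_cookie

BLOCK_BYTES = 12  # assumed capacity of one client block, tiny for the example

before = [(1, "aaaa"), (2, "bbbb"), (3, "cccc"),
          (4, "dddd"), (5, "eeee"), (6, "ffff")]
block0, resume0 = server_readdir(before, 0, BLOCK_BYTES)
block1, resume1 = server_readdir(before, resume0, BLOCK_BYTES)

# "bbbb" is deleted on the server; only block 0 is refetched.
after = [e for e in before if e[1] != "bbbb"]
new_block0, _ = server_readdir(after, 0, BLOCK_BYTES)

# A process walking the cached blocks now sees "dddd" twice.
seen = new_block0 + block1
```

    The refetched block holds one more entry than before, so the boundary
    the stale block 1 was built against no longer exists.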

    In order to maintain consistency within the set of cached pages (note:
    I'm not talking about cache coherency with the server here, just
    consistency within the buffer cache on the client), our buffer cache
    abstraction is currently dependent on the verifier key changing on the
    server.  I don't know why it was done this way -- perhaps mtime was found
    to not be sufficient, maybe because it doesn't have sufficient resolution
    under NFSv2.  Under NFSv3 it should theoretically have sufficient
    resolution, but how many servers do you know that keep the nanoseconds
    field updated?
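
    The resolution concern can be sketched like this (a toy model; the
    ToyServerDir class and the idea of a verifier that is bumped on every
    change are assumptions for illustration, not a description of any real
    server):

```python
class ToyServerDir:
    """Hypothetical server directory with whole-second mtime and a verifier."""

    def __init__(self):
        self.mtime_sec = 0
        self.verifier = 0

    def modify(self, now):
        self.mtime_sec = int(now)  # sub-second part is lost
        self.verifier += 1         # assumed: changes on every modification

server = ToyServerDir()
server.modify(1000.2)

# The client caches the directory and remembers both values.
cached_mtime, cached_verf = server.mtime_sec, server.verifier

# A second change lands within the same second.
server.modify(1000.8)

mtime_says_valid = (cached_mtime == server.mtime_sec)      # stale cache kept
verifier_says_valid = (cached_verf == server.verifier)     # change detected
```

    With only whole seconds to compare, the second change is invisible to
    an mtime check, while a verifier that changes is still noticed.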

    When applied to files, the use of mtime to determine when to flush the
    cache is nothing more than an inconvenience.  But the use of mtime to
    determine when to flush a directory cache can be fatal.

    -

    If you want to change the way our directory verifier works, you have to
    completely rewrite the directory caching code for the client.  I think
    you can argue that the verifier is not being implemented properly, but
    I'm not going to let anyone change it unless the directory caching code
    on the client is rewritten at the same time to use the server's cookies
    directly.  

    Right now the server's cookies are only used by the client to demarcate
    client-buffer-cache buffer boundaries.  The actual cookies returned to
    the *process* running on the client are translated from the client's
    buffer cache abstraction of the NFS directory.

    The change that would have to be made would be for the server's cookies
    to be passed through all the way to the process sitting on the client
    rather than translated in the buffer cache.  Cache consistency in our
    client would then not be as sensitive to the varying amounts of
    information the server sends us, and we could safely leave the verifier
    alone on the server.  This would require us to change the abstraction our
    client uses significantly -- it would no longer be able to use the
    cookies passed to it by the user process as direct offsets into the
    client's buffer cache.
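
    A minimal sketch of what such a pass-through cache might look like
    (hypothetical throughout: PassThroughDirCache, fetch(), and the toy
    server list are invented for illustration and say nothing about how
    the real rewrite would be structured):

```python
class PassThroughDirCache:
    """Hypothetical client cache keyed by the server's own cookies."""

    def __init__(self, fetch):
        self.fetch = fetch   # fetch(cookie) -> (names, next_cookie); stands in for the rpc
        self.chunks = {}     # server cookie -> (names, next_cookie)

    def readdir(self, cookie):
        if cookie not in self.chunks:
            self.chunks[cookie] = self.fetch(cookie)
        return self.chunks[cookie]

    def invalidate(self, cookie):
        self.chunks.pop(cookie, None)

entries = [(1, "aaaa"), (2, "bbbb"), (3, "cccc"),
           (4, "dddd"), (5, "eeee"), (6, "ffff")]

def fetch(cookie, limit=3):
    """Toy server: up to `limit` entries after `cookie`."""
    batch = [(c, n) for c, n in entries if c > cookie][:limit]
    return [n for _, n in batch], (batch[-1][0] if batch else cookie)

def scan(cache):
    """Walk the directory the way a process would, using server cookies."""
    out, cookie = [], 0
    while True:
        names, next_cookie = cache.readdir(cookie)
        if not names:
            return out
        out += names
        cookie = next_cookie

cache = PassThroughDirCache(fetch)
first = scan(cache)

entries.remove((2, "bbbb"))   # delete on the server
cache.invalidate(0)           # only the first chunk is refetched
second = scan(cache)
```

    Because each chunk hands back the server cookie to resume from, the
    refetched first chunk leads to a lookup at a cookie that is simply not
    cached, rather than to a stale block at a fixed offset.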

    So, that's my position.  You can 'fix' the verifier only if you fix
    the client along with it.  It would be an excellent project.  I might
    even have time to do it myself -- but not right now.  If someone wants to
    take this on I'm willing to provide technical support!

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

