From owner-freebsd-stable  Mon May 10  9:57:19 1999
Delivered-To: freebsd-stable@freebsd.org
Received: from herring.nlsystems.com (nlsys.demon.co.uk [158.152.125.33])
	by hub.freebsd.org (Postfix) with ESMTP id C692B1561A
	for <stable@freebsd.org>; Mon, 10 May 1999 09:56:03 -0700 (PDT)
	(envelope-from dfr@nlsystems.com)
Received: from localhost (dfr@localhost)
	by herring.nlsystems.com (8.9.3/8.8.8) with ESMTP id RAA52882;
	Mon, 10 May 1999 17:56:21 +0100 (BST)
	(envelope-from dfr@nlsystems.com)
Date: Mon, 10 May 1999 17:56:20 +0100 (BST)
From: Doug Rabson <dfr@nlsystems.com>
To: Mats Lofkvist <mal@algonet.se>
Cc: stable@freebsd.org
Subject: Re: NFS question..
In-Reply-To: <199905101502.RAA03718@kairos.algonet.se>
Message-ID: <Pine.BSF.4.05.9905101750340.447-100000@herring.nlsystems.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Mon, 10 May 1999, Mats Lofkvist wrote:

>   <snip>
>   Well, I understand the issues (or at least I think so). But I am
>   interested in a fast, working NFS implementation (which I know can exist
>   because Linux does it) and not in explanations (system administration is
>   not my primary job). I can trade a bit of stability for performance in
>   the case of safe/unsafe NFS write modes.
> 
> Linux NFS isn't perfect either; two Suns (Solaris 2.5.1 and 2.6
> respectively) mounting filesystems from a Linux NFS server at work
> have continuous problems with files randomly being unreadable.
> Upgrading the Linux server from RedHat something based on 2.0.36
> to Debian something based on 2.2.6 didn't seem to make any difference.

Linux is fast because it violates the spec (this really pisses me off).
The specification for NFSv2 states that the reply to a write RPC must not
be sent until the data has reached stable storage. From RFC 1094:

   All of the procedures in the NFS protocol are assumed to be
   synchronous.  When a procedure returns to the client, the client can
   assume that the operation has completed and any data associated with
   the request is now on stable storage.  For example, a client WRITE
   request may cause the server to update data blocks, filesystem
   information blocks (such as indirect blocks), and file attribute
   information (size and modify times).  When the WRITE returns to the
   client, it can assume that the write is safe, even in case of a
   server crash, and it can discard the data written.  This is a very
   important part of the statelessness of the server.  If the server
   waited to flush data from remote requests, the client would have to
   save those requests so that it could resend them in case of a server
   crash.

The Linux server appears to ack the write as soon as it has been handed
off to the kernel's buffer cache (which is certainly not stable storage).
If you want FreeBSD to do this, you can set the sysctl variable
vfs.nfs.async to nonzero. It is off by default because turning it on
risks losing data if the server crashes.
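
To make the difference concrete, here is a rough Python sketch of the two
behaviours. It is not the FreeBSD or Linux server code; handle_write() and
demo() are made-up names for illustration. With stable=True the function
does not return (and so no reply could be sent) until fsync() has pushed
the data to disk, which is what RFC 1094 asks for. With stable=False it
returns as soon as the data is in the buffer cache, which is roughly the
trade that an async mode makes.

```python
import os
import tempfile


def handle_write(fd, offset, data, stable=True):
    """Apply one WRITE request; return when it would be safe to reply.

    Hypothetical helper, not taken from any real NFS server.
    """
    os.lseek(fd, offset, os.SEEK_SET)
    written = 0
    while written < len(data):
        # os.write may write fewer bytes than requested, so loop.
        written += os.write(fd, data[written:])
    if stable:
        # Block until the kernel has flushed the file to the disk.
        os.fsync(fd)
    # With stable=False the data may still be only in the buffer cache.
    return written


def demo():
    """Write one stable and one unstable chunk, then read the file back."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        handle_write(fd, 0, b"hello", stable=True)
        handle_write(fd, 5, b" world", stable=False)
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 64)
```

Both calls leave the same bytes in the file while the machine stays up;
the difference only shows after a crash, when the second chunk may be gone
even though the caller was already told it was written.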

Alternatively, you can use NFSv3, whose more complex protocol allows the
server to delay writes safely. If the Linux clients can't do NFSv3,
perhaps you would consider replacing them with FreeBSD clients...
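
For the curious, the NFSv3 trick can be sketched as a toy model in Python.
This is only an illustration of the idea in RFC 1813 (unstable WRITE, then
COMMIT, with a write verifier that changes when the server reboots); the
Server and Client classes are invented here, and a plain boot counter
stands in for the real 8-byte verifier. The client keeps every unstable
write until a COMMIT comes back with the same verifier it saw on the
write; if the verifier differs, the server rebooted in between and the
client resends.

```python
class Server:
    """Toy NFSv3-style server (hypothetical, for illustration only)."""

    def __init__(self):
        self.stable = {}    # offset -> data; survives a crash
        self.cache = {}     # offset -> data; lost on a crash
        self.verifier = 1   # stand-in for the write verifier

    def write_unstable(self, offset, data):
        # Reply immediately; the data is only cached so far.
        self.cache[offset] = data
        return self.verifier

    def commit(self):
        # Flush everything cached to stable storage.
        self.stable.update(self.cache)
        self.cache.clear()
        return self.verifier

    def crash_and_reboot(self):
        # Cached data is lost and the verifier changes.
        self.cache.clear()
        self.verifier += 1


class Client:
    """Toy client that holds unstable writes until they are committed."""

    def __init__(self, server):
        self.server = server
        self.pending = {}   # offset -> (data, verifier seen on write)

    def write(self, offset, data):
        verf = self.server.write_unstable(offset, data)
        self.pending[offset] = (data, verf)

    def commit(self):
        while self.pending:
            verf = self.server.commit()
            # Writes acknowledged under a different verifier were lost.
            lost = {off: data
                    for off, (data, v) in self.pending.items() if v != verf}
            self.pending.clear()
            for off, data in lost.items():
                self.write(off, data)   # resend, then commit again
```

The point of the design is that the client, not the server, carries the
risk window: it may not discard its copy of the data until the COMMIT
succeeds, so the server is free to batch writes.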

--
Doug Rabson				Mail:  dfr@nlsystems.com
Nonlinear Systems Ltd.			Phone: +44 181 442 9037



