Date: Mon, 10 May 1999 17:56:20 +0100 (BST)
From: Doug Rabson
To: Mats Lofkvist
Cc: stable@freebsd.org
Subject: Re: NFS question..
In-Reply-To: <199905101502.RAA03718@kairos.algonet.se>

On Mon, 10 May 1999, Mats Lofkvist wrote:

> > Well, I understand the issues (or at least I think so). But I am
> > interested in fast, working NFS implementation (which I know could exist
> > because Linux does it) and not in explanations (system administration is
> > not my primary job). I can trade some bit of stability for performance in
> > case of safe/unsafe NFS write modes.
>
> Linux NFS isn't perfect either; two Sun's (Solaris 2.5.1 and 2.6
> respectively) mounting filesystems from a Linux NFS server at work
> have continuous problems with files randomly being unreadable.
> Upgrading the Linux server from RedHat something based on 2.0.36
> to Debian something based on 2.2.6 didn't seem to make any difference.

Linux is fast because it violates the spec (this really pisses me off).
The specification for NFSv2 states that the reply to a write RPC shouldn't
be sent until the write has been completed. From RFC 1094:

    All of the procedures in the NFS protocol are assumed to be
    synchronous. When a procedure returns to the client, the client
    can assume that the operation has completed and any data
    associated with the request is now on stable storage.
    For example, a client WRITE request may cause the server to
    update data blocks, filesystem information blocks (such as
    indirect blocks), and file attribute information (size and
    modify times). When the WRITE returns to the client, it can
    assume that the write is safe, even in case of a server crash,
    and it can discard the data written. This is a very important
    part of the statelessness of the server. If the server waited
    to flush data from remote requests, the client would have to
    save those requests so that it could resend them in case of a
    server crash.

The Linux server appears to ack the write as soon as it has been handed
off to the kernel's buffer cache (which is certainly not stable storage).
If you want FreeBSD to do this, you can set the sysctl variable
vfs.nfs.async to nonzero. The default for this is off, since turning it
on risks data loss.

Alternatively, you can use NFSv3, which uses a more complex protocol that
allows the server to delay the writes safely. If the Linux clients can't
do NFSv3, perhaps you would consider replacing them with FreeBSD
clients...

--
Doug Rabson                             Mail:  dfr@nlsystems.com
Nonlinear Systems Ltd.                  Phone: +44 181 442 9037
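
To make the difference discussed in the message concrete, here is a minimal sketch of the two acknowledgement orders. It is an illustration only, not code from any NFS server: the function names are hypothetical, a local file stands in for the exported filesystem, and the string "OK" stands in for the RPC reply.

```python
import os
import tempfile


def _write_all(fd, data):
    """Write every byte of data to fd, looping over short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def write_then_ack(path, offset, data):
    """Hypothetical handler following the RFC 1094 wording: the reply is
    produced only after the data has been forced to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        _write_all(fd, data)
        os.fsync(fd)  # block until the kernel has flushed data to disk
    finally:
        os.close(fd)
    return "OK"  # stand-in for the WRITE reply


def ack_after_buffering(path, offset, data):
    """Hypothetical handler showing the unsafe order: the reply is produced
    once the data is in the kernel's buffer cache, with no fsync."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        _write_all(fd, data)  # data may still be only in memory here
    finally:
        os.close(fd)
    return "OK"  # a crash after this point can lose the acknowledged data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "export.dat")
        print(write_then_ack(target, 0, b"hello"))
        print(ack_after_buffering(target, 5, b" world"))
```

Both functions return the same reply and leave the same bytes in the file when nothing goes wrong. The only difference is whether the fsync happens before the reply, which is exactly the window in which a server crash would lose data the client has already discarded.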