From owner-svn-src-head@freebsd.org Thu Jun 14 21:12:09 2018
Message-Id: <201806142112.w5ELC8Bu016531@repo.freebsd.org>
From: Rick Macklem
Date: Thu, 14 Jun 2018 21:12:08 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r335175 - head/usr.sbin/nfsd
X-SVN-Group: head
X-SVN-Commit-Author: rmacklem
X-SVN-Commit-Paths: head/usr.sbin/nfsd
X-SVN-Commit-Revision: 335175
X-SVN-Commit-Repository: base
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.26
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
X-List-Received-Date: Thu, 14 Jun 2018 21:12:09 -0000

Author: rmacklem
Date: Thu Jun 14 21:12:08 2018
New Revision: 335175
URL: https://svnweb.freebsd.org/changeset/base/335175

Log:
  Add a new man page that briefly describes the pNFS variant of the NFSv4.1
  protocol.

  This is a content change.

Added:
  head/usr.sbin/nfsd/pnfs.4 (contents, props changed)
Modified:
  head/usr.sbin/nfsd/Makefile

Modified: head/usr.sbin/nfsd/Makefile
==============================================================================
--- head/usr.sbin/nfsd/Makefile Thu Jun 14 20:55:33 2018 (r335174)
+++ head/usr.sbin/nfsd/Makefile Thu Jun 14 21:12:08 2018 (r335175)
@@ -2,6 +2,6 @@
 # $FreeBSD$
 
 PROG= nfsd
-MAN= nfsd.8 nfsv4.4 stablerestart.5
+MAN= nfsd.8 nfsv4.4 stablerestart.5 pnfs.4
 
 .include

Added: head/usr.sbin/nfsd/pnfs.4
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ head/usr.sbin/nfsd/pnfs.4 Thu Jun 14 21:12:08 2018 (r335175)
@@ -0,0 +1,187 @@
+.\" Copyright (c) 2017 Rick Macklem
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\" notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\" notice, this list of conditions and the following disclaimer in the
+.\" documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd March 26, 2018
+.Dt PNFS 4
+.Os
+.Sh NAME
+.Nm pNFS
+.Nd NFS Version 4.1 Parallel NFS Protocol
+.Sh DESCRIPTION
+The NFSv4.1 client and server provide support for the
+.Tn pNFS
+specification; see
+.%T "Network File System (NFS) Version 4 Minor Version 1 Protocol RFC 5661" .
+A pNFS service separates Read/Write operations from all other NFSv4.1
+operations, which are referred to as Metadata operations.
+The Read/Write operations are performed directly on the Data Server (DS)
+where the file's data resides, bypassing the NFS server.
+All other file operations are performed on the NFS server, which is referred to
+as a Metadata Server (MDS).
+NFS clients that do not support
+.Tn pNFS
+perform Read/Write operations on the MDS, which acts as a proxy for the
+appropriate DS(s).
+.Pp
+The NFSv4.1 protocol provides two pieces of information to pNFS aware
+clients that allow them to perform Read/Write operations directly on
+the DS.
+.Pp
+The first is DeviceInfo, which is static information defining the DS
+server.
+The critical piece of information in DeviceInfo for the layout types
+supported by FreeBSD is the IP address that is used to perform RPCs on the DS.
+It also indicates which version of NFS the DS supports, I/O size and other
+layout specific information.
+In the DeviceInfo, there is a DeviceID which, for the FreeBSD server
+is unique to the DS configuration
+and changes whenever the
+.Xr nfsd 8
+daemon is restarted or the server is rebooted.
+.Pp
+The second is the layout, which is per file and references the DeviceInfo
+to use via the DeviceID.
+It is for a byte range of a file and is either Read or Read/Write.
+For the FreeBSD server, a layout covers all bytes of a file.
+A layout may be recalled by the MDS using a LayoutRecall callback.
+When a client returns a layout via the LayoutReturn operation it can
+indicate that error(s) were encountered while doing I/O on the DS.
+.Pp
+The FreeBSD client and server support two layout types.
+.Pp
+The File Layout is described in RFC5661 and uses the NFSv4.1 protocol
+to perform I/O on the DS.
+It does not support client aware DS mirroring and, as such,
+the FreeBSD server only provides File Layout support for non-mirrored
+configurations.
+.Pp
+The Flexible File Layout allows the use of the NFSv3, NFSv4.0 or NFSv4.1
+protocol to perform I/O on the DS and does support client aware mirroring.
+As such, the FreeBSD server uses Flexible File Layout layouts for the
+mirrored DS configurations.
+The FreeBSD server supports the
+.Dq tightly coupled
+variant and all DSs use the
+NFSv4.1 protocol for I/O operations.
+Clients that support the Flexible File Layout will do writes and commits
+to all DS mirrors in the mirror set.
+.Pp
+A FreeBSD pNFS service consists of a single MDS server plus one or more
+DS servers, all of which are FreeBSD systems.
+For a non-mirrored configuration, the FreeBSD server will issue File Layout
+layouts by default.
+However, that default can be set to the Flexible File Layout by setting the
+.Xr sysctl 8
+sysctl ``vfs.nfsd.default_flexfile'' to one.
+Mirrored server configurations will only issue Flexible File Layouts.
+.Tn pNFS
+clients mount the MDS as they would a single NFS server.
+.Pp
+A FreeBSD
+.Tn pNFS
+client must be running the
+.Xr nfscbd 8
+daemon and use the mount options
+.Dq nfsv4,minorversion=1,pnfs .
+.Pp
+When files are created, the MDS creates a file tree identical to what a
+single NFS server creates, except that all the regular (VREG) files will
+be empty.
+As such, if you look at the exported tree directly
+on the MDS server (not via an NFS mount), the files will all be of size zero.
+Each of these files will also have two extended attributes in the system
+attribute name space:
+.Bd -literal -offset indent
+pnfsd.dsfile - This extended attribute stores the information that the
+    MDS needs to find the data file on a DS for this file.
+pnfsd.dsattr - This extended attribute stores the Size, AccessTime,
+    ModifyTime and Change attributes for the file.
+.Ed
+.Pp
+For each regular (VREG) file, the MDS creates a data file on one
+(or on N of them for the mirrored case, where N is the mirror_level)
+of the DSs where the file's data will be stored.
+The name of this file is
+the file handle of the file on the MDS in hexadecimal at time of file creation.
+The data file will have the same file ownership, mode and NFSv4 ACL
+(if ACLs are enabled for the file system) as the file on the MDS, so that
+permission checking can be done on the DS.
+This is referred to as
+.Dq tightly coupled
+for the Flexible File Layout.
+.Pp
+For
+.Tn pNFS
+aware clients, the service generates File Layout
+or Flexible File Layout
+layouts and associated DeviceInfo.
+For non-pNFS aware NFS clients, the pNFS service appears just like a normal
+NFS service.
+For the non-pNFS aware client, the MDS will perform I/O operations on the appropriate DS(s), acting as
+a proxy for the non-pNFS aware client.
+This is also true for NFSv3 and NFSv4.0 mounts, since these are always non-pNFS
+aware.
+.Pp
+See
+.Bd -literal -offset indent
+http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt
+.Ed
+.sp
+for information on how to set up a FreeBSD pNFS service.
+.Sh SEE ALSO
+.Xr nfsv4 4 ,
+.Xr exports 5 ,
+.Xr fstab 5 ,
+.Xr rc.conf 5 ,
+.Xr nfscbd 8 ,
+.Xr nfsd 8 ,
+.Xr nfsuserd 8 ,
+.Xr pnfsdscopymr 8 ,
+.Xr pnfsdsfile 8 ,
+.Xr pnfsdskill 8
+.Sh BUGS
+Linux kernel versions prior to 4.12 only support NFSv3 DSs in their client
+and will do all I/O through the MDS.
+For Linux 4.12 kernels, support for NFSv4.1 DSs was added, but I have seen
+Linux client crashes when testing this client.
+For Linux 4.17-rc2 kernels, I have not seen client crashes during testing,
+but it only supports the
+.Dq loosely coupled
+variant.
+To make it work correctly when mounting the FreeBSD server, you must either
+patch the Flexible File Layout client driver with a patch like:
+.Bd -literal -offset indent
+http://people.freebsd.org/~rmacklem/flexfile.patch
+.Ed
+.sp
+or set the sysctl
+.Dq vfs.nfsd.flexlinuxhack
+to one so that it works around
+the Linux client driver's limitations.
+.Pp
+Since the MDS cannot be mirrored, it is a single point of failure just
+as a non
+.Tn pNFS
+server is.
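
Illustrative note (not part of the commit): the layout selection rules and the
DS data file naming described in the new man page can be modelled roughly as
below. This is only a sketch of the prose, not the nfsd implementation. The
function names are hypothetical, a mirror_level of 1 is assumed to mean a
non-mirrored configuration, and the exact hexadecimal form of the file name
(lower case, no separators) is an assumption.

```python
def choose_layout(mirror_level: int, default_flexfile: bool = False) -> str:
    """Return the layout type the MDS would issue, following the man page text.

    mirror_level: number of DS copies per file (assumed: 1 means non-mirrored).
    default_flexfile: models the vfs.nfsd.default_flexfile sysctl being set to one.
    """
    if mirror_level < 1:
        raise ValueError("mirror_level must be at least 1")
    if mirror_level > 1:
        # Mirrored configurations only issue Flexible File Layouts.
        return "flexfile"
    # Non-mirrored: File Layout by default, unless the sysctl overrides it.
    return "flexfile" if default_flexfile else "file"


def ds_file_name(mds_file_handle: bytes) -> str:
    """Name of the DS data file: the MDS file handle in hexadecimal.

    The handle is the one the file had on the MDS at creation time; the
    lower-case, unseparated encoding used here is an assumption.
    """
    return mds_file_handle.hex()


if __name__ == "__main__":
    print(choose_layout(1))                         # non-mirrored, default
    print(choose_layout(1, default_flexfile=True))  # non-mirrored, sysctl set
    print(choose_layout(2))                         # mirrored
    print(ds_file_name(b"\x01\xab"))                # made-up two-byte handle
```

Clients that are not pNFS aware do not appear in this sketch, since they never
receive a layout and do all their I/O through the MDS.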