From owner-svn-src-projects@freebsd.org Sun Apr 29 05:37:47 2018 Return-Path: Delivered-To: svn-src-projects@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6B9B5FC71E8 for ; Sun, 29 Apr 2018 05:37:47 +0000 (UTC) (envelope-from rmacklem@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2045971522; Sun, 29 Apr 2018 05:37:47 +0000 (UTC) (envelope-from rmacklem@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 1B39F202FB; Sun, 29 Apr 2018 05:37:47 +0000 (UTC) (envelope-from rmacklem@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w3T5bkbG034102; Sun, 29 Apr 2018 05:37:46 GMT (envelope-from rmacklem@FreeBSD.org) Received: (from rmacklem@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w3T5bkoT034101; Sun, 29 Apr 2018 05:37:46 GMT (envelope-from rmacklem@FreeBSD.org) Message-Id: <201804290537.w3T5bkoT034101@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: rmacklem set sender to rmacklem@FreeBSD.org using -f From: Rick Macklem Date: Sun, 29 Apr 2018 05:37:46 +0000 (UTC) To: src-committers@freebsd.org, svn-src-projects@freebsd.org Subject: svn commit: r333088 - projects/pnfs-planb-server/usr.bin/pnfsdsfile X-SVN-Group: projects X-SVN-Commit-Author: rmacklem X-SVN-Commit-Paths: projects/pnfs-planb-server/usr.bin/pnfsdsfile X-SVN-Commit-Revision: 333088 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; 
charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-projects@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: "SVN commit messages for the src " projects" tree" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Apr 2018 05:37:47 -0000 Author: rmacklem Date: Sun Apr 29 05:37:46 2018 New Revision: 333088 URL: https://svnweb.freebsd.org/changeset/base/333088 Log: Get rid of newlines at the end of err() and errx() plus add the file name to one of them. (I always forget that these functions append a newline to the message.) Modified: projects/pnfs-planb-server/usr.bin/pnfsdsfile/pnfsdsfile.c Modified: projects/pnfs-planb-server/usr.bin/pnfsdsfile/pnfsdsfile.c ============================================================================== --- projects/pnfs-planb-server/usr.bin/pnfsdsfile/pnfsdsfile.c Sat Apr 28 17:55:28 2018 (r333087) +++ projects/pnfs-planb-server/usr.bin/pnfsdsfile/pnfsdsfile.c Sun Apr 29 05:37:46 2018 (r333088) @@ -87,19 +87,19 @@ main(int argc, char *argv[]) /* Replace the first DS server with the second one. */ if (zerofh != 0 || zerods != 0) errx(1, "-c, -r and -z are mutually " - "exclusive\n"); + "exclusive"); if (res != NULL) - errx(1, "-c and -s are mutually exclusive\n"); + errx(1, "-c and -s are mutually exclusive"); strlcpy(hostn, optarg, 2 * NI_MAXHOST + 2); cp = strchr(hostn, ','); if (cp == NULL) - errx(1, "Bad -c argument %s\n", hostn); + errx(1, "Bad -c argument %s", hostn); *cp = '\0'; if (getaddrinfo(hostn, NULL, NULL, &res) != 0) - errx(1, "Can't get IP# for %s\n", hostn); + errx(1, "Can't get IP# for %s", hostn); *cp++ = ','; if (getaddrinfo(cp, NULL, NULL, &newres) != 0) - errx(1, "Can't get IP# for %s\n", cp); + errx(1, "Can't get IP# for %s", cp); break; case 'q': quiet = 1; @@ -108,24 +108,24 @@ main(int argc, char *argv[]) /* Reset the DS server in a mirror with 0.0.0.0. 
*/ if (zerofh != 0 || res != NULL || newres != NULL) errx(1, "-r and -s, -z or -c are mutually " - "exclusive\n"); + "exclusive"); zerods = 1; /* Translate the server name to an IP address. */ if (getaddrinfo(optarg, NULL, NULL, &res) != 0) - errx(1, "Can't get IP# for %s\n", optarg); + errx(1, "Can't get IP# for %s", optarg); break; case 's': if (res != NULL) errx(1, "-s, -c and -r are mutually " - "exclusive\n"); + "exclusive"); /* Translate the server name to an IP address. */ if (getaddrinfo(optarg, NULL, NULL, &res) != 0) - errx(1, "Can't get IP# for %s\n", optarg); + errx(1, "Can't get IP# for %s", optarg); break; case 'z': if (newres != NULL || zerods != 0) errx(1, "-c, -r and -z are mutually " - "exclusive\n"); + "exclusive"); zerofh = 1; break; default: @@ -145,7 +145,7 @@ main(int argc, char *argv[]) "pnfsd.dsfile", dsfile, sizeof(dsfile)); mirrorcnt = xattrsize / sizeof(struct pnfsdsfile); if (mirrorcnt < 1 || xattrsize != mirrorcnt * sizeof(struct pnfsdsfile)) - err(1, "Can't get extattr pnfsd.dsfile\n"); + err(1, "Can't get extattr pnfsd.dsfile for %s", *argv); if (quiet == 0) printf("%s:\t", *argv); @@ -155,7 +155,7 @@ main(int argc, char *argv[]) /* Do the zerofh option. You must be root. */ if (zerofh != 0) { if (geteuid() != 0) - errx(1, "Must be root/su to zerofh\n"); + errx(1, "Must be root/su to zerofh"); /* * Do it for the server specified by -s/--ds or all @@ -188,7 +188,7 @@ main(int argc, char *argv[]) /* Do the zerods option. You must be root. */ if (zerods != 0 && mirrorcnt > 1) { if (geteuid() != 0) - errx(1, "Must be root/su to zerods\n"); + errx(1, "Must be root/su to zerods"); /* * Do it for the server specified. @@ -223,7 +223,7 @@ main(int argc, char *argv[]) if (newres != NULL) { if (geteuid() != 0) errx(1, "Must be root/su to replace the host" - " addr\n"); + " addr"); /* * Check that the old host address matches. 
@@ -261,7 +261,7 @@ main(int argc, char *argv[]) newres = newres->ai_next; if (newres == NULL) errx(1, "Hostname %s has no" - " IP#\n", cp); + " IP#", cp); } if (newres->ai_addr->sa_family == AF_INET) { memcpy(sin, newres->ai_addr, @@ -282,7 +282,7 @@ main(int argc, char *argv[]) if (getnameinfo((struct sockaddr *)&dsfile[i].dsf_sin, dsfile[i].dsf_sin.sin_len, hostn, sizeof(hostn), NULL, 0, 0) < 0) - err(1, "Can't get hostname\n"); + err(1, "Can't get hostname"); printf("%s\tds%d/%s", hostn, dsfile[i].dsf_dir, dsfile[i].dsf_filename); } @@ -291,7 +291,7 @@ main(int argc, char *argv[]) printf("\n"); if (dosetxattr != 0 && extattr_set_file(*argv, EXTATTR_NAMESPACE_SYSTEM, "pnfsd.dsfile", dsfile, xattrsize) != xattrsize) - err(1, "Can't set pnfsd.dsfile\n"); + err(1, "Can't set pnfsd.dsfile"); } static void
From owner-svn-src-projects@freebsd.org Sun Apr 29 11:46:21 2018 From: Rick Macklem Date: Sun, 29 Apr 2018 11:46:21 +0000 (UTC) To: src-committers@freebsd.org, svn-src-projects@freebsd.org Subject: svn commit: r333090 - projects/pnfs-planb-server/usr.sbin/nfsd Author: rmacklem Date: Sun Apr 29 11:46:20 2018 New Revision: 333090 URL: https://svnweb.freebsd.org/changeset/base/333090 Log: Update the nfsd.8 man page for mirrored DSs. Modified: projects/pnfs-planb-server/usr.sbin/nfsd/nfsd.8 Modified: projects/pnfs-planb-server/usr.sbin/nfsd/nfsd.8 ============================================================================== --- projects/pnfs-planb-server/usr.sbin/nfsd/nfsd.8 Sun Apr 29 10:45:09 2018 (r333089) +++ projects/pnfs-planb-server/usr.sbin/nfsd/nfsd.8 Sun Apr 29 11:46:20 2018 (r333090) @@ -114,9 +114,8 @@ as a Data Server (DS) for it to use. .Pp The .Ar pnfs_setup -string is a set of ',' separated fields: +string is a set of fields separated by ',' and '#' characters: .Bl -tag -width Ds -.It Each of these fields specifies one Data Server. 
It consists of a server hostname, followed by a ':' and the directory path where the DS's data storage file system is mounted on @@ -126,12 +125,30 @@ The DS storage file systems must be mounted on this sy is started with this option specified. For example: .sp -nfsv4-ds0:/DS0,nfsv4-ds1:/DS1 +nfsv4-data0:/data0,nfsv4-data1:/data1 .sp -Would specify two DS servers called nfsv4-ds0 and nfsv4-ds1 that comprise the -data storage component of the pNFS service. -The directories "/DS0" and "/DS1" are where the DS storage servers exported +Would specify two DS servers called nfsv4-data0 and nfsv4-data1 that comprise +the data storage component of the pNFS service. +The directories "/data0" and "/data1" are where the data storage servers exported storage directories are mounted on this system (which will act as the MDS). +.Pp +If the fields are separated by the '#' character, it indicates that these +DSs are to be configured as a mirrored set. +For example: +.sp +nfsv4-data0:/data0#nfsv4-data1:/data1,nfsv4-data2:/data2#nfsv4-data3:/data3 +.sp +Would specify two mirrored pairs of DSs, with nfsv4-data0 plus nfsv4-data1 +comprising one mirrored pair and nfsv4-data2 plus nfsv4-data3 comprising +the other mirrored pair. +.Pp +If there are mirrored sets of DSs, the server must use the Flexible File +layout. +If there are no mirrored sets of DSs, the server will use the File layout +by default, but this default can be changed to the Flexible File layout if the +.Xr sysctl 1 +vfs.nfsd.default_flexfile +is set non-zero. .El .It Fl t Serve @@ -155,7 +172,7 @@ transports using six daemons. .Pp A server should run enough daemons to handle the maximum level of concurrency from its clients, -typically four to six. +typically four or more per CPU core. 
.Pp The .Nm @@ -165,9 +182,11 @@ server specification; see .%T "Network File System Protocol Specification" , RFC1094, .%T "NFS: Network File System Version 3 Protocol Specification" , -RFC1813 and +RFC1813, .%T "Network File System (NFS) Version 4 Protocol" , -RFC3530. +RFC3530 and +.%T "Network File System (NFS) Version 4 Minor Version 1 Protocol" , +RFC5661. .Pp If .Nm @@ -243,6 +262,7 @@ just do a .Xr kldload 2 , .Xr nfssvc 2 , .Xr nfsv4 4 , +.Xr pnfs 4 , .Xr exports 5 , .Xr stablerestart 5 , .Xr gssd 8 ,
From owner-svn-src-projects@freebsd.org Sun Apr 29 20:24:04 2018 From: Rick Macklem Date: Sun, 29 Apr 2018 20:24:03 +0000 (UTC) To: src-committers@freebsd.org, svn-src-projects@freebsd.org Subject: svn commit: r333093 - projects/pnfs-planb-server/usr.sbin/nfsd Author: rmacklem Date: Sun Apr 29 20:24:03 2018 New Revision: 333093 URL: https://svnweb.freebsd.org/changeset/base/333093 Log: Bring the pnfs.4 man page up to date. Modified: projects/pnfs-planb-server/usr.sbin/nfsd/pnfs.4 Modified: projects/pnfs-planb-server/usr.sbin/nfsd/pnfs.4 ============================================================================== --- projects/pnfs-planb-server/usr.sbin/nfsd/pnfs.4 Sun Apr 29 17:46:08 2018 (r333092) +++ projects/pnfs-planb-server/usr.sbin/nfsd/pnfs.4 Sun Apr 29 20:24:03 2018 (r333093) @@ -24,7 +24,7 @@ .\" .\" $FreeBSD$ .\" -.Dd July 2, 2017 +.Dd March 26, 2018 .Dt PNFS 4 .Os .Sh NAME @@ -35,69 +35,75 @@ The NFSv4.1 client and server provides support for the .Tn pNFS specification; see .%T "Network File System (NFS) Version 4 Minor Version 1 Protocol RFC 5661" . -A pNFS service separates the Read/Write operations from all other NFSv4.1 +A pNFS service separates Read/Write operations from all other NFSv4.1 operations, which are referred to as Metadata operations. The Read/Write operations are performed directly on the Data Server (DS) where the file's data resides, bypassing the NFS server. 
-All other operations are performed on the NFS server, which is referred to +All other file operations are performed on the NFS server, which is referred to as a Metadata Server (MDS). NFS clients that do not support .Tn pNFS perform Read/Write operations on the MDS, which acts as a proxy for the -appropriate DS. +appropriate DS(s). .Pp +The NFSv4.1 protocol provides two pieces of information to pNFS aware +clients that allow them to perform Read/Write operations directly on +the DS. +.Pp +The first is DeviceInfo, which is static information defining the DS +server. +The critical piece of information in DeviceInfo for the layout types +supported by FreeBSD is the IP address that is used to perform RPCs on the DS. +It also indicates which version of NFS the DS supports, I/O size and other +layout specific information. +In the DeviceInfo, there is a DeviceID which, for the FreeBSD server +is unique to the DS configuration +and changes whenever the +.Xr nfsd +daemon is restarted or the server is rebooted. +.Pp +The second is the layout, which is per file and references the DeviceInfo +to use via the DeviceID. +It is for a byte range of a file and is either Read or Read/Write. +For the FreeBSD server, a layout covers all bytes of a file. +A layout may be recalled by the MDS using a LayoutRecall callback. +When a client returns a layout via the LayoutReturn operation it can +indicate that error(s) were encountered while doing I/O on the DS. +.Pp +The FreeBSD client and server supports two layout types. +.Pp +The File Layout is described in RFC5661 and uses the NFSv4.1 protocol +to perform I/O on the DS. +It does not support client aware DS mirroring and, as such, +the FreeBSD server only provides File Layout support for non-mirrored +configurations. +.Pp +The Flexible File Layout allows the use of the NFSv3, NFSv4.0 or NFSv4.1 +protocol to perform I/O on the DS and does support client aware mirroring. 
+As such, the FreeBSD server uses Flexible File Layout layouts for the +mirrored DS configurations. +The FreeBSD server supports the ``tightly coupled'' variant and all DSs use the +NFSv4.1 protocol for I/O operations. +Clients that support the Flexible File Layout will do writes and commits +to all DS mirrors in the mirror set. +.Pp A FreeBSD pNFS service consists of a single MDS server plus one or more DS servers, all of which are FreeBSD systems. +For a non-mirrored configuration, the FreeBSD server will issue File Layout +layouts by default. +However that default can be set to the Flexible File Layout by setting the +.Xr sysctl 1 +sysctl ``vfs.nfsd.default_flexfile'' to one. +Mirrored server configurations will only issue Flexible File Layouts. .Tn pNFS clients mount the MDS as they would a single NFS server. .Pp -A +A FreeBSD .Tn pNFS client must be running the .Xr nfscbd 8 daemon and use the mount options ''nfsv4,minorversion=1,pnfs''. .Pp -A pNFS DS server must be configured as a NFSv4.1 server, where there is an -exported directory with subdirectories named ds0, ds1, ..., ds created -in it. -For example, if the exported directory is /ds and the number of subdirectories is 20, the subdirectories are named -/ds/ds0, /ds/ds1, ..., /ds/ds19. -This exported directory is the one that the MDS will mount via NFSv4.1 to use as -a DS. -The subdirectories are created simply to reduce the size of the directories -by spreading the data storage files across them. -If the -.Tn pNFS -service will be storing a large number of files, the service should be -configured with a large number of subdirectories. -There really is no disadvantage in having a large number of subdirectories, -so sysadmins should err on the side of creating many of them. -Each of these subdirectories must be owned by the that the -maproot -.Xr exports 5 -option maps to, since the MDS accesses these directories as . 
-These directories should have file mode 0700, so that only the mapped -for has access to them. -See -.Xr exports 5 -for more information on this. -These subdirectories must be created by the sysadmin on all DS servers before -the -.Tn pNFS -service is started. -.Pp -The sysctl -.sp -.Bd -literal -offset indent -vfs.nfsd.dsdirsize -.Ed -.Pp -defines the number of subdirectories named ds0, ds1, ... , ds, where N is -vfs.nfsd.dsdirsize - 1, with the default set to 20. -The number of subdirectories can be increased after the server has been -running, but only when the -.Xr nfsd 8 -daemon is not running. -.Pp When files are created, the MDS creates a file tree identical to what a single NFS server creates, except that all the regular (VREG) files will be empty. @@ -105,7 +111,6 @@ As such, if you look at the exported tree on the MDS d on the MDS server (not via an NFS mount), the files will all be of size zero. Each of these files will also have two extended attributes in the system attribute name space: -.sp .Bd -literal -offset indent pnfsd.dsfile - This extended attrbute stores the information that the MDS needs to find the data storage file on a DS for this file. @@ -114,43 +119,37 @@ pnfsd.dsattr - This extended attribute stores the Size .Ed .Pp For each regular (VREG) file, the MDS creates a data storage file on one -of the DSs, in one of the ds subdirectories of the exported DS directory. +(or on each mirror for the mirrored DS case) +of the DSs which will store the file's data. The name of this file is -the file handle of the file on the MDS in hexadecimal at time of creation. +the file handle of the file on the MDS in hexadecimal at time of file creation. +The data storage file will have the same file ownership, mode and NFSv4 ACL +(if ACLs are enabled for the file system) as the file on the MDS, so that +permission checking can be done on the DS. +This is referred to as ``tightly coupled'' for the Flexible File Layout. 
.Pp For .Tn pNFS -clients, the service generates File Layout layouts and associated DeviceInfo. -For NFS clients that do not support NFSv4.1 pNFS, there will be a performance -hit, since the I/O RPCs will be proxied by the MDS for the DS server the -data storage file resides on. +aware clients, the service generates File Layout +or Flexible File Layout +layouts and associated DeviceInfo. +For non-pNFS aware NFS clients, the pNFS service appears just like a normal +NFS service. +For the non-pNFS aware client, the MDS will perform I/O operations on the appropriate DS(s), acting as +a proxy for the non-pNFS aware client. +This is also true for NFSv3 and NFSv4.0 mounts, since these are always non-pNFS +aware. .Pp -Configuration of a DS is done exactly as any other NFS server is configured, -with the data storage directory exported to the MDS. -.Pp -The MDS is configured to mount the data storage directories of the DSs. -For example, if there are 2 DSs named nfsv4-ds0 and nfsv4-ds1 and both of -these have a /ds directory exported to the MDS, the -.Xr fstab 5 -entries might be: -.sp +See .Bd -literal -offset indent -nfsv4-ds0:/ds /ds0 nfs rw,nfsv4,minorversion=1 0 0 -nfsv4-ds1:/ds /ds1 nfs rw,nfsv4,minorversion=1 0 0 +http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt .Ed -.Pp -The MDS will require a "-p" flag option specifying the DSs. For the above -mounts, the nfs_server_flags entry in -.Xr rc.conf 5 -might be: .sp -.Bd -literal -offset indent -nfs_server_flags="-u -t -n 256 -p nfsv4-ds0:/ds0,nfsv4-ds1:/ds1" -.Ed -.Pp -Note that the "-p" flag requires the "mounted-on" directory path on the MDS. +for information on how to set up a FreeBSD pNFS service. .Sh SEE ALSO +.Xr pnfsdscopymr 1 , .Xr pnfsdsfile 1 , +.Xr pnfsdskill 1 , .Xr nfsv4 4 , .Xr exports 5 , .Xr fstab 5 , @@ -159,7 +158,22 @@ Note that the "-p" flag requires the "mounted-on" dire .Xr nfsd 8 , .Xr nfsuserd 8 .Sh BUGS -At this time, there is no support for DS mirroring. 
-As such, the MDS plus all DS servers are single points of failure for the +Linux kernel versions prior to 4.12 only supports NFSv3 DSs in its client +and will do all I/O through the MDS. +For Linux 4.12 kernels, support for NFSv4.1 DSs was added, but I have seen +Linux client crashes when testing this client. +For Linux 4.17-rc2 kernels, I have not seen client crashes during testing, +but it only supports the ``loosely coupled'' variant. +To make it work correctly when mounting the FreeBSD server, you must either +patch the Flexible File Layout client driver with a patch like: +.Bd -literal -offset indent +http://people.freebsd.org/~rmacklem/flexfile.patch +.Ed +.sp +or set the sysctl ``vfs.nfsd.flexlinuxhack'' to one so that it works around +the Linux client driver's limitations. +.Pp +Since the MDS cannot be mirrored, it is a single point of failure just +as a non .Tn pNFS -service. +server is.
From owner-svn-src-projects@freebsd.org Wed May 2 20:10:28 2018 From: Rick Macklem Date: Wed, 2 May 2018 20:10:27 +0000 (UTC) To: src-committers@freebsd.org, svn-src-projects@freebsd.org Subject: svn commit: r333179 - projects/pnfs-planb-server/sys/fs/nfsserver Author: rmacklem Date: Wed May 2 20:10:27 2018 New Revision: 333179 URL: https://svnweb.freebsd.org/changeset/base/333179 Log: Add some diagnostics and fix up layout recall in a few ways: - Increment the layout's stateid.seqid as required by the RFC. - Fix the locking, moving one lock up and using the correct lock in another place. 
Modified: projects/pnfs-planb-server/sys/fs/nfsserver/nfs_nfsdstate.c Modified: projects/pnfs-planb-server/sys/fs/nfsserver/nfs_nfsdstate.c ============================================================================== --- projects/pnfs-planb-server/sys/fs/nfsserver/nfs_nfsdstate.c Wed May 2 20:04:31 2018 (r333178) +++ projects/pnfs-planb-server/sys/fs/nfsserver/nfs_nfsdstate.c Wed May 2 20:10:27 2018 (r333179) @@ -4392,9 +4392,11 @@ errout: if (clp->lc_flags & LCL_CBDOWN) clp->lc_flags &= ~(LCL_CBDOWN | LCL_CALLBACKSON); NFSUNLOCKSTATE(); - if (nd->nd_repstat) + if (nd->nd_repstat) { error = nd->nd_repstat; - else if (error == 0 && procnum == NFSV4OP_CBGETATTR) + NFSD_DEBUG(1, "nfsrv_docallback op=%d err=%d\n", + procnum, error); + } else if (error == 0 && procnum == NFSV4OP_CBGETATTR) error = nfsv4_loadattr(nd, NULL, nap, NULL, NULL, 0, NULL, NULL, NULL, NULL, NULL, 0, NULL, NULL, NULL, p, NULL); @@ -6592,6 +6594,12 @@ nfsrv_flexmirrordel(char *devid, NFSPROC_T *p) /* Now, try to do a Layout recall for each one found. */ LIST_FOREACH_SAFE(lyp, &loclyp, lay_list, nlyp) { NFSD_DEBUG(4, "do layout recall\n"); + /* + * The layout stateid.seqid needs to be incremented + * before doing a LAYOUT_RECALL callback. 
+ */ + if (++lyp->lay_stateid.seqid == 0) + lyp->lay_stateid.seqid = 1; nfsrv_recalllayout(lyp, p); nfsrv_freelayout(lyp); } @@ -6612,10 +6620,18 @@ nfsrv_recalllayout(struct nfslayout *lyp, NFSPROC_T *p NFSD_DEBUG(4, "aft nfsrv_getclient=%d\n", error); if (error != 0) return; - if ((clp->lc_flags & LCL_NFSV41) != 0) - nfsrv_docallback(clp, NFSV4OP_CBLAYOUTRECALL, NULL, 0, NULL, - NULL, NULL, lyp, p); - else + if ((clp->lc_flags & LCL_NFSV41) != 0) { + error = nfsrv_docallback(clp, NFSV4OP_CBLAYOUTRECALL, NULL, 0, + NULL, NULL, NULL, lyp, p); + if (error == NFSERR_NOMATCHLAYOUT) { + NFSDRECALLLOCK(); + if ((lyp->lay_flags & NFSLAY_RECALL) != 0) { + lyp->lay_flags |= NFSLAY_RETURNED; + wakeup(lyp); + } + NFSDRECALLUNLOCK(); + } + } else printf("nfsrv_recalllayout: clp not NFSv4.1\n"); } @@ -6645,10 +6661,14 @@ nfsrv_layoutreturn(struct nfsrv_descript *nd, vnode_t } if (error == 0) { lhyp = NFSLAYOUTHASH(&fh); + NFSDRECALLLOCK(); NFSLOCKLAYOUT(lhyp); error = nfsrv_findlayout(nd, &fh, layouttype, p, &lyp); NFSD_DEBUG(4, "layoutret findlay=%d\n", error); - if (error == 0) { + if (error == 0 && + stateidp->other[0] == lyp->lay_stateid.other[0] && + stateidp->other[1] == lyp->lay_stateid.other[1] && + stateidp->other[2] == lyp->lay_stateid.other[2]) { NFSD_DEBUG(4, "nfsrv_layoutreturn: stateid %d" " %x %x %x laystateid %d %x %x %x" " off=%ju len=%ju flgs=0x%x\n", @@ -6660,15 +6680,6 @@ nfsrv_layoutreturn(struct nfsrv_descript *nd, vnode_t lyp->lay_stateid.other[2], (uintmax_t)offset, (uintmax_t)len, lyp->lay_flags); - if (stateidp->other[0] != - lyp->lay_stateid.other[0] || - stateidp->other[1] != - lyp->lay_stateid.other[1] || - stateidp->other[2] != - lyp->lay_stateid.other[2]) - error = NFSERR_BADSTATEID; - } - if (error == 0) { if (++lyp->lay_stateid.seqid == 0) lyp->lay_stateid.seqid = 1; stateidp->seqid = lyp->lay_stateid.seqid; @@ -6685,23 +6696,18 @@ nfsrv_layoutreturn(struct nfsrv_descript *nd, vnode_t } } NFSUNLOCKLAYOUT(lhyp); - if (error != 0) { - /* Search 
the nfsrv_recalllist for a match. */ - NFSDDSLOCK(); - LIST_FOREACH(lyp, &nfsrv_recalllisthead, - lay_list) { - if (NFSBCMP(&lyp->lay_fh, &fh, - sizeof(fh)) == 0 && - lyp->lay_clientid.qval == - nd->nd_clientid.qval) { - lyp->lay_flags |= - NFSLAY_RETURNED; - wakeup(lyp); - error = 0; - } + /* Search the nfsrv_recalllist for a match. */ + LIST_FOREACH(lyp, &nfsrv_recalllisthead, lay_list) { + if (NFSBCMP(&lyp->lay_fh, &fh, + sizeof(fh)) == 0 && + lyp->lay_clientid.qval == + nd->nd_clientid.qval) { + lyp->lay_flags |= NFSLAY_RETURNED; + wakeup(lyp); + error = 0; } - NFSDDSUNLOCK(); } + NFSDRECALLUNLOCK(); } if (layouttype == NFSLAYOUT_FLEXFILE) nfsrv_flexlayouterr(nd, layp, maxcnt, p); @@ -7602,7 +7608,7 @@ nfsrv_copymr(vnode_t vp, vnode_t fvp, vnode_t dvp, str off_t rdpos, wrpos; ssize_t aresid; char *dat; - int ret, retacl, xfer; + int didprintf, ret, retacl, xfer; ASSERT_VOP_LOCKED(fvp, "nfsrv_copymr fvp"); ASSERT_VOP_LOCKED(vp, "nfsrv_copymr vp"); @@ -7639,7 +7645,7 @@ nfsrv_copymr(vnode_t vp, vnode_t fvp, vnode_t dvp, str /* * Search for all RW layouts for this file. Move them to the - * recall list, so they can be recalled and their return noted. + * recall list, so they can be recalled and their return noted. */ lhyp = NFSLAYOUTHASH(&fh); NFSDRECALLLOCK(); @@ -7655,6 +7661,7 @@ nfsrv_copymr(vnode_t vp, vnode_t fvp, vnode_t dvp, str NFSDRECALLUNLOCK(); ret = 0; + didprintf = 0; LIST_INIT(&thl); /* Unlock the MDS vp, so that a LayoutReturn can be done on it. */ NFSVOPUNLOCK(vp, 0); @@ -7665,6 +7672,12 @@ tryagain: if (NFSBCMP(&lyp->lay_fh, &fh, sizeof(fh)) == 0 && (lyp->lay_flags & NFSLAY_RECALL) == 0) { lyp->lay_flags |= NFSLAY_RECALL; + /* + * The layout stateid.seqid needs to be incremented + * before doing a LAYOUT_RECALL callback. 
+ */ + if (++lyp->lay_stateid.seqid == 0) + lyp->lay_stateid.seqid = 1; NFSDRECALLUNLOCK(); nfsrv_recalllayout(lyp, p); NFSD_DEBUG(4, "nfsrv_copymr: recalled layout\n"); @@ -7683,11 +7696,17 @@ tryagain2: "nfsrv_copymr: layout returned\n"); } else { ret = mtx_sleep(lyp, NFSDRECALLMUTEXPTR, - PVFS | PCATCH, "nfsmrl", 0); + PVFS | PCATCH, "nfsmrl", hz); NFSD_DEBUG(4, "nfsrv_copymr: aft sleep=%d\n", ret); if (ret == EINTR || ret == ERESTART) break; + if ((lyp->lay_flags & NFSLAY_RETURNED) == 0 && + didprintf == 0) { + printf("nfsrv_copymr: layout not " + "returned\n"); + didprintf = 1; + } } goto tryagain2; }
From owner-svn-src-projects@freebsd.org Thu May 3 23:51:10 2018 From: Rick Macklem Date: Thu, 3 May 2018 23:51:09 +0000 (UTC) To: src-committers@freebsd.org, svn-src-projects@freebsd.org Subject: svn commit: r333231 - projects/pnfs-planb-server/sys/fs/nfsserver Author: rmacklem Date: Thu May 3 23:51:09 2018 New Revision: 333231 URL: https://svnweb.freebsd.org/changeset/base/333231 Log: Add a check for same layout stateid to the layout return operation. Modified: projects/pnfs-planb-server/sys/fs/nfsserver/nfs_nfsdstate.c Modified: projects/pnfs-planb-server/sys/fs/nfsserver/nfs_nfsdstate.c ============================================================================== --- projects/pnfs-planb-server/sys/fs/nfsserver/nfs_nfsdstate.c Thu May 3 22:51:44 2018 (r333230) +++ projects/pnfs-planb-server/sys/fs/nfsserver/nfs_nfsdstate.c Thu May 3 23:51:09 2018 (r333231) @@ -6701,7 +6701,13 @@ nfsrv_layoutreturn(struct nfsrv_descript *nd, vnode_t if (NFSBCMP(&lyp->lay_fh, &fh, sizeof(fh)) == 0 && lyp->lay_clientid.qval == - nd->nd_clientid.qval && + stateidp->other[0] == + lyp->lay_stateid.other[0] && + stateidp->other[1] == + lyp->lay_stateid.other[1] && + stateidp->other[2] == + lyp->lay_stateid.other[2]) { lyp->lay_flags |= NFSLAY_RETURNED; wakeup(lyp); error = 0;
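The match condition added by r333231 can be read as a three-part test: same file handle, same clientid, and same stateid "other" words, with the seqid left out of the comparison. A minimal user-space sketch of that test follows. struct sketch_layout, SKETCH_FHSIZE and sketch_recall_match() are hypothetical names invented for illustration; the real code works on struct nfslayout and compares the three "other" words one by one rather than with memcmp().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_FHSIZE 8	/* assumed size, for the sketch only */

/* Hypothetical stand-in for an entry on the recall list. */
struct sketch_layout {
	unsigned char fh[SKETCH_FHSIZE];	/* file handle bytes */
	uint64_t clientid;
	uint32_t seqid;				/* deliberately not compared */
	uint32_t other[3];
};

/*
 * Return true only when the file handle, the clientid and all three
 * "other" words of the stateid agree with the recall list entry.
 */
static bool
sketch_recall_match(const struct sketch_layout *lyp, const unsigned char *fh,
    uint64_t clientid, const uint32_t other[3])
{
	assert(lyp != NULL && fh != NULL && other != NULL);
	return (memcmp(lyp->fh, fh, SKETCH_FHSIZE) == 0 &&
	    lyp->clientid == clientid &&
	    memcmp(lyp->other, other, sizeof(lyp->other)) == 0);
}
```

With the extra comparison, a LayoutReturn for the same file and client that carries a different stateid no longer matches the entry, so it would not mark that entry as returned.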