From: Rick Macklem
Date: Sat, 7 Jul 2018 19:27:49 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r336075 - head/sys/fs/nfsserver
Message-Id: <201807071927.w67JRnqB015627@repo.freebsd.org>

Author: rmacklem
Date: Sat Jul 7 19:27:49 2018
New Revision: 336075
URL: https://svnweb.freebsd.org/changeset/base/336075

Log:
  Fix handling of the hybrid DS case for a pNFS server.

  After the addition of the "#mds_path" suffix for a DS specification on the
  "-p" nfsd option, it is possible to have a mix of DSs assigned to a
  specific MDS file system and DSs that store files for all MDS file systems.
  This is what I referred to as "hybrid" above.
  At first, I didn't think this hybrid case would be useful, but I now believe
  that some system administrators may find it useful.
  This patch modifies the file storage assignment algorithm so that the
  "#mds_path" DSs take priority and the "all file systems" DSs are now used
  only for MDS file systems that have no "#mds_path" DSs.
  This only affects the pNFS server for this "hybrid" case.

Modified:
  head/sys/fs/nfsserver/nfs_nfsdport.c

Modified: head/sys/fs/nfsserver/nfs_nfsdport.c
==============================================================================
--- head/sys/fs/nfsserver/nfs_nfsdport.c	Sat Jul 7 19:11:43 2018	(r336074)
+++ head/sys/fs/nfsserver/nfs_nfsdport.c	Sat Jul 7 19:27:49 2018	(r336075)
@@ -3848,7 +3848,7 @@ nfsrv_pnfscreate(struct vnode *vp, struct vattr *vap,
     NFSPROC_T *p)
 {
 	struct nfsrvdscreate *dsc, *tdsc;
-	struct nfsdevice *ds, *mds;
+	struct nfsdevice *ds, *tds, *fds;
 	struct mount *mp;
 	struct pnfsdsfile *pf, *tpf;
 	struct pnfsdsattr dsattr;
@@ -3866,12 +3866,25 @@ nfsrv_pnfscreate(struct vnode *vp, struct vattr *vap,
 	/* Get a DS server directory in a round-robin order. */
 	mirrorcnt = 1;
 	mp = vp->v_mount;
+	ds = fds = NULL;
 	NFSDDSLOCK();
-	TAILQ_FOREACH(ds, &nfsrv_devidhead, nfsdev_list) {
-		if (ds->nfsdev_nmp != NULL && (ds->nfsdev_mdsisset == 0 ||
-		    (mp->mnt_stat.f_fsid.val[0] == ds->nfsdev_mdsfsid.val[0] &&
-		    mp->mnt_stat.f_fsid.val[1] == ds->nfsdev_mdsfsid.val[1])))
-			break;
+	/*
+	 * Search for the first entry that handles this MDS fs, but use the
+	 * first entry for all MDS fs's otherwise.
+	 */
+	TAILQ_FOREACH(tds, &nfsrv_devidhead, nfsdev_list) {
+		if (tds->nfsdev_nmp != NULL) {
+			if (tds->nfsdev_mdsisset == 0 && ds == NULL)
+				ds = tds;
+			else if (tds->nfsdev_mdsisset != 0 &&
+			    mp->mnt_stat.f_fsid.val[0] ==
+			    tds->nfsdev_mdsfsid.val[0] &&
+			    mp->mnt_stat.f_fsid.val[1] ==
+			    tds->nfsdev_mdsfsid.val[1]) {
+				ds = fds = tds;
+				break;
+			}
+		}
 	}
 	if (ds == NULL) {
 		NFSDDSUNLOCK();
@@ -3881,17 +3894,18 @@ nfsrv_pnfscreate(struct vnode *vp, struct vattr *vap,
 	i = dsdir[0] = ds->nfsdev_nextdir;
 	ds->nfsdev_nextdir = (ds->nfsdev_nextdir + 1) % nfsrv_dsdirsize;
 	dvp[0] = ds->nfsdev_dsdir[i];
-	mds = TAILQ_NEXT(ds, nfsdev_list);
-	if (nfsrv_maxpnfsmirror > 1 && mds != NULL) {
-		TAILQ_FOREACH_FROM(mds, &nfsrv_devidhead, nfsdev_list) {
-			if (mds->nfsdev_nmp != NULL &&
-			    (mds->nfsdev_mdsisset == 0 ||
-			    (mp->mnt_stat.f_fsid.val[0] ==
-			    mds->nfsdev_mdsfsid.val[0] &&
+	tds = TAILQ_NEXT(ds, nfsdev_list);
+	if (nfsrv_maxpnfsmirror > 1 && tds != NULL) {
+		TAILQ_FOREACH_FROM(tds, &nfsrv_devidhead, nfsdev_list) {
+			if (tds->nfsdev_nmp != NULL &&
+			    ((tds->nfsdev_mdsisset == 0 && fds == NULL) ||
+			    (tds->nfsdev_mdsisset != 0 && fds != NULL &&
+			    mp->mnt_stat.f_fsid.val[0] ==
+			    tds->nfsdev_mdsfsid.val[0] &&
 			    mp->mnt_stat.f_fsid.val[1] ==
-			    mds->nfsdev_mdsfsid.val[1]))) {
+			    tds->nfsdev_mdsfsid.val[1]))) {
 				dsdir[mirrorcnt] = i;
-				dvp[mirrorcnt] = mds->nfsdev_dsdir[i];
+				dvp[mirrorcnt] = tds->nfsdev_dsdir[i];
 				mirrorcnt++;
 				if (mirrorcnt >= nfsrv_maxpnfsmirror)
 					break;
@@ -4495,7 +4509,7 @@ nfsrv_dsgetsockmnt(struct vnode *vp, int lktype, char
 	struct nfsmount *nmp, *newnmp;
 	struct sockaddr *sad;
 	struct sockaddr_in *sin;
-	struct nfsdevice *ds, *fndds;
+	struct nfsdevice *ds, *tds, *fndds;
 	struct pnfsdsfile *pf;
 	uint32_t dsdir;
 	int error, fhiszero, fnd, gotone, i, mirrorcnt;
@@ -4563,6 +4577,7 @@ nfsrv_dsgetsockmnt(struct vnode *vp, int lktype, char
 			/* Use the socket address to find the mount point. */
 			fndds = NULL;
 			NFSDDSLOCK();
+			/* Find a match for the IP address. */
 			TAILQ_FOREACH(ds, &nfsrv_devidhead, nfsdev_list) {
 				if (ds->nfsdev_nmp != NULL) {
 					dvp = ds->nfsdev_dvp;
@@ -4570,25 +4585,41 @@ nfsrv_dsgetsockmnt(struct vnode *vp, int lktype, char
 					if (nmp != ds->nfsdev_nmp)
 						printf("different2 nmp %p %p\n",
 						    nmp, ds->nfsdev_nmp);
-					if (nfsaddr2_match(sad, nmp->nm_nam))
+					if (nfsaddr2_match(sad, nmp->nm_nam)) {
 						fndds = ds;
-					else if (newnmpp != NULL &&
-					    newnmp == NULL &&
-					    (*newnmpp == NULL ||
-					    fndds == NULL) &&
-					    (ds->nfsdev_mdsisset == 0 ||
-					    (ds->nfsdev_mdsfsid.val[0] ==
+						break;
+					}
+				}
+			}
+			if (fndds != NULL && newnmpp != NULL &&
+			    newnmp == NULL) {
+				/* Search for a place to make a mirror copy. */
+				TAILQ_FOREACH(tds, &nfsrv_devidhead,
+				    nfsdev_list) {
+					if (tds->nfsdev_nmp != NULL &&
+					    fndds != tds &&
+					    ((tds->nfsdev_mdsisset == 0 &&
+					    fndds->nfsdev_mdsisset == 0) ||
+					    (tds->nfsdev_mdsisset != 0 &&
+					    fndds->nfsdev_mdsisset != 0 &&
+					    tds->nfsdev_mdsfsid.val[0] ==
 					    mp->mnt_stat.f_fsid.val[0] &&
-					    ds->nfsdev_mdsfsid.val[1] ==
-					    mp->mnt_stat.f_fsid.val[1])))
-						/*
-						 * Return a destination for the
-						 * copy in newnmpp. Choose the
-						 * last valid one before the
-						 * source mirror, so it isn't
-						 * always the first one.
-						 */
-						*newnmpp = nmp;
+					    tds->nfsdev_mdsfsid.val[1] ==
+					    mp->mnt_stat.f_fsid.val[1]))) {
+						*newnmpp = tds->nfsdev_nmp;
+						break;
+					}
+				}
+				if (tds != NULL) {
+					/*
+					 * Move this entry to the end of the
+					 * list, so it won't be selected as
+					 * easily the next time.
+					 */
+					TAILQ_REMOVE(&nfsrv_devidhead, tds,
+					    nfsdev_list);
+					TAILQ_INSERT_TAIL(&nfsrv_devidhead, tds,
+					    nfsdev_list);
 				}
 			}
 			NFSDDSUNLOCK();
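
For readers who want the new priority rule without the kernel context, here is a minimal user-space sketch of the selection logic used in nfsrv_pnfscreate() after this change. It is an illustration under stated assumptions, not the committed code: struct ds_entry, pick_primary_ds() and mirror_ok() are hypothetical names, an array replaces the TAILQ, the "mounted" flag stands in for the nfsdev_nmp != NULL check, and locking and round-robin directory selection are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct nfsdevice. */
struct ds_entry {
	int mounted;	/* stands in for nfsdev_nmp != NULL */
	int mdsisset;	/* nonzero if the DS was given a "#mds_path" suffix */
	int mdsfsid[2];	/* fsid of the MDS file system it is assigned to */
};

/*
 * Return the index of the DS to store a new file on, or -1 if none.
 * A DS assigned to this MDS file system wins; the first mounted
 * "all file systems" DS is only a fallback.  *dedicated is set to 1
 * when an assigned DS was chosen (the role "fds != NULL" plays in
 * the patch) and to 0 otherwise.
 */
static int
pick_primary_ds(const struct ds_entry *devs, size_t n, const int fsid[2],
    int *dedicated)
{
	int fallback = -1;
	size_t i;

	assert(devs != NULL || n == 0);
	*dedicated = 0;
	for (i = 0; i < n; i++) {
		if (!devs[i].mounted)
			continue;
		if (devs[i].mdsisset == 0) {
			/* Remember only the first "all file systems" DS. */
			if (fallback < 0)
				fallback = (int)i;
		} else if (devs[i].mdsfsid[0] == fsid[0] &&
		    devs[i].mdsfsid[1] == fsid[1]) {
			*dedicated = 1;
			return ((int)i);
		}
	}
	return (fallback);
}

/*
 * Mirror candidates must be of the same kind as the primary:
 * "all file systems" DSs mirror only onto each other, and assigned
 * DSs mirror only onto DSs assigned to the same MDS file system.
 */
static int
mirror_ok(const struct ds_entry *d, const int fsid[2], int dedicated)
{
	if (!d->mounted)
		return (0);
	if (d->mdsisset == 0)
		return (!dedicated);
	return (dedicated && d->mdsfsid[0] == fsid[0] &&
	    d->mdsfsid[1] == fsid[1]);
}
```

With one "all file systems" DS listed ahead of a DS assigned to a given MDS file system, pick_primary_ds() still returns the assigned DS for files on that file system and falls back to the first entry for any other file system. In the patch itself, only entries that follow the chosen DS in the list are examined as mirror candidates; the sketch leaves that iteration to the caller.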