From owner-svn-src-head@freebsd.org Wed Jul 29 22:58:09 2020 Return-Path: Delivered-To: svn-src-head@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 67D8E36E7D0; Wed, 29 Jul 2020 22:58:09 +0000 (UTC) (envelope-from rmacklem@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4BH88d1vvpz4Md9; Wed, 29 Jul 2020 22:58:09 +0000 (UTC) (envelope-from rmacklem@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 0A7A7274C5; Wed, 29 Jul 2020 22:58:09 +0000 (UTC) (envelope-from rmacklem@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id 06TMw8BW070178; Wed, 29 Jul 2020 22:58:08 GMT (envelope-from rmacklem@FreeBSD.org) Received: (from rmacklem@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id 06TMw806070177; Wed, 29 Jul 2020 22:58:08 GMT (envelope-from rmacklem@FreeBSD.org) Message-Id: <202007292258.06TMw806070177@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: rmacklem set sender to rmacklem@FreeBSD.org using -f From: Rick Macklem Date: Wed, 29 Jul 2020 22:58:08 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r363677 - head/sys/fs/nfsserver X-SVN-Group: head X-SVN-Commit-Author: rmacklem X-SVN-Commit-Paths: head/sys/fs/nfsserver X-SVN-Commit-Revision: 363677 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jul 2020 22:58:09 -0000 Author: rmacklem Date: Wed Jul 29 22:58:08 2020 New Revision: 363677 URL: https://svnweb.freebsd.org/changeset/base/363677 Log: Add support for ext_pgs mbufs to nfsrvd_readdir() and nfsrvd_readdirplus(). This patch code that optionally (based on ND_TLS, never set yet) generates readdir replies in ext_pgs mbufs. To trim the list back, a new function that is ext_pgs aware called nfsm_trimtrailing() replaces newnfs_trimtrailing(). newnfs_trimtrailing() is no longer used, but will be removed in a future commit, since its removal does modify the internal kpi between the NFS modules. This is another in the series of commits that add support to the NFS client and server for building RPC messages in ext_pgs mbufs with anonymous pages. This is useful so that the entire mbuf list does not need to be copied before calling sosend() when NFS over TLS is enabled. Use of ext_pgs mbufs will not be enabled until the kernel RPC is updated to handle TLS. Modified: head/sys/fs/nfsserver/nfs_nfsdport.c Modified: head/sys/fs/nfsserver/nfs_nfsdport.c ============================================================================== --- head/sys/fs/nfsserver/nfs_nfsdport.c Wed Jul 29 22:10:25 2020 (r363676) +++ head/sys/fs/nfsserver/nfs_nfsdport.c Wed Jul 29 22:58:08 2020 (r363677) @@ -144,6 +144,8 @@ static int nfsrv_dsremove(struct vnode *, char *, stru static int nfsrv_dssetacl(struct vnode *, struct acl *, struct ucred *, NFSPROC_T *); static int nfsrv_pnfsstatfs(struct statfs *, struct mount *); +static void nfsm_trimtrailing(struct nfsrv_descript *, struct mbuf *, + char *, int, int); int nfs_pnfsio(task_fn_t *, void *); @@ -2043,6 +2045,17 @@ again: vput(vp); /* + * If cnt > MCLBYTES and the reply will not be saved, use + * ext_pgs mbufs for TLS. + * For NFSv4.0, we do not know for sure if the reply will + * be saved, so do not use ext_pgs mbufs for NFSv4.0. + */ + if (cnt > MCLBYTES && siz > MCLBYTES && + (nd->nd_flag & (ND_TLS | ND_EXTPG | ND_SAVEREPLY)) == ND_TLS && + (nd->nd_flag & (ND_NFSV4 | ND_NFSV41)) != ND_NFSV4) + nd->nd_flag |= ND_EXTPG; + + /* * dirlen is the size of the reply, including all XDR and must * not exceed cnt. For NFSv2, RFC1094 didn't clearly indicate * if the XDR should be included in "count", but to be safe, we do. @@ -2146,6 +2159,7 @@ nfsrvd_readdirplus(struct nfsrv_descript *nd, int isdg struct mount *mp, *new_mp; uint64_t mounted_on_fileno; struct thread *p = curthread; + int bextpg0, bextpg1, bextpgsiz0, bextpgsiz1; if (nd->nd_repstat) { nfsrv_postopattr(nd, getret, &at); @@ -2359,11 +2373,27 @@ again: } /* + * If the reply is likely to exceed MCLBYTES and the reply will + * not be saved, use ext_pgs mbufs for TLS. + * It is difficult to predict how large each entry will be and + * how many entries have been read, so just assume the directory + * entries grow by a factor of 4 when attributes are included. + * For NFSv4.0, we do not know for sure if the reply will + * be saved, so do not use ext_pgs mbufs for NFSv4.0. + */ + if (cnt > MCLBYTES && siz > MCLBYTES / 4 && + (nd->nd_flag & (ND_TLS | ND_EXTPG | ND_SAVEREPLY)) == ND_TLS && + (nd->nd_flag & (ND_NFSV4 | ND_NFSV41)) != ND_NFSV4) + nd->nd_flag |= ND_EXTPG; + + /* * Save this position, in case there is an error before one entry * is created. */ mb0 = nd->nd_mb; bpos0 = nd->nd_bpos; + bextpg0 = nd->nd_bextpg; + bextpgsiz0 = nd->nd_bextpgsiz; /* * Fill in the first part of the reply. @@ -2385,6 +2415,8 @@ again: */ mb1 = nd->nd_mb; bpos1 = nd->nd_bpos; + bextpg1 = nd->nd_bextpg; + bextpgsiz1 = nd->nd_bextpgsiz; /* Loop through the records and build reply */ entrycnt = 0; @@ -2401,6 +2433,8 @@ again: */ mb1 = nd->nd_mb; bpos1 = nd->nd_bpos; + bextpg1 = nd->nd_bextpg; + bextpgsiz1 = nd->nd_bextpgsiz; /* * For readdir_and_lookup get the vnode using @@ -2626,11 +2660,11 @@ invalid: if (!nd->nd_repstat && entrycnt == 0) nd->nd_repstat = NFSERR_TOOSMALL; if (nd->nd_repstat) { - newnfs_trimtrailing(nd, mb0, bpos0); + nfsm_trimtrailing(nd, mb0, bpos0, bextpg0, bextpgsiz0); if (nd->nd_flag & ND_NFSV3) nfsrv_postopattr(nd, getret, &at); } else - newnfs_trimtrailing(nd, mb1, bpos1); + nfsm_trimtrailing(nd, mb1, bpos1, bextpg1, bextpgsiz1); eofflag = 0; } else if (cpos < cend) eofflag = 0; @@ -6416,6 +6450,44 @@ out: } NFSEXITCODE(error); return (error); +} + +/* + * Trim trailing data off the mbuf list being built. + */ +static void +nfsm_trimtrailing(struct nfsrv_descript *nd, struct mbuf *mb, char *bpos, + int bextpg, int bextpgsiz) +{ + vm_page_t pg; + int fullpgsiz, i; + + if (mb->m_next != NULL) { + m_freem(mb->m_next); + mb->m_next = NULL; + } + if ((mb->m_flags & M_EXTPG) != 0) { + /* First, get rid of any pages after this position. */ + for (i = mb->m_epg_npgs - 1; i > bextpg; i--) { + pg = PHYS_TO_VM_PAGE(mb->m_epg_pa[i]); + vm_page_unwire_noq(pg); + vm_page_free(pg); + } + mb->m_epg_npgs = bextpg + 1; + if (bextpg == 0) + fullpgsiz = PAGE_SIZE - mb->m_epg_1st_off; + else + fullpgsiz = PAGE_SIZE; + mb->m_epg_last_len = fullpgsiz - bextpgsiz; + mb->m_len = m_epg_pagelen(mb, 0, mb->m_epg_1st_off); + for (i = 1; i < mb->m_epg_npgs; i++) + mb->m_len += m_epg_pagelen(mb, i, 0); + nd->nd_bextpgsiz = bextpgsiz; + nd->nd_bextpg = bextpg; + } else + mb->m_len = bpos - mtod(mb, char *); + nd->nd_mb = mb; + nd->nd_bpos = bpos; } extern int (*nfsd_call_nfsd)(struct thread *, struct nfssvc_args *);