From: Alexander Motin
Date: Mon, 31 Jul 2017 15:23:19 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r321794 - in head/sys: fs/nfsserver nfs
Message-Id: <201707311523.v6VFNJIV096222@repo.freebsd.org>

Author: mav
Date: Mon Jul 31 15:23:19 2017
New Revision: 321794
URL:
https://svnweb.freebsd.org/changeset/base/321794

Log:
  Improve FHA locality control for NFS read/write requests.

  This change adds two new tunables that allow serialization of read and
  write NFS requests to be controlled separately.  It does not change the
  default behavior, since there are too many factors to consider, but it
  gives additional room for further experiments and tuning.

  The main motivation for this change is very low write speed in the case
  of ZFS with sync=always, or when NFS clients request synchronous
  operation: every separate request has to be written/flushed to the ZIL,
  and requests are processed one at a time.  Setting vfs.nfsd.fha.write=0
  in that case can increase ZIL throughput several times over by coalescing
  writes and cache flushes.  There is a worry that doing so may increase
  data fragmentation on disks, but I suppose it should not happen for a
  pool with a SLOG.

  MFC after:	1 week
  Sponsored by:	iXsystems, Inc.

Modified:
  head/sys/fs/nfsserver/nfs_fha_new.c
  head/sys/nfs/nfs_fha.c
  head/sys/nfs/nfs_fha.h

Modified: head/sys/fs/nfsserver/nfs_fha_new.c
==============================================================================
--- head/sys/fs/nfsserver/nfs_fha_new.c	Mon Jul 31 15:21:26 2017	(r321793)
+++ head/sys/fs/nfsserver/nfs_fha_new.c	Mon Jul 31 15:23:19 2017	(r321794)
@@ -93,7 +93,7 @@ fhanew_init(void *foo)
 	sysctl_ctx_init(&softc->sysctl_ctx);
 	softc->sysctl_tree = SYSCTL_ADD_NODE(&softc->sysctl_ctx,
 	    SYSCTL_STATIC_CHILDREN(_vfs_nfsd), OID_AUTO, "fha", CTLFLAG_RD,
-	    0, "fha node");
+	    0, "NFS File Handle Affinity (FHA)");
 	if (softc->sysctl_tree == NULL) {
 		printf("%s: unable to allocate sysctl tree\n", __func__);
 		return;

Modified: head/sys/nfs/nfs_fha.c
==============================================================================
--- head/sys/nfs/nfs_fha.c	Mon Jul 31 15:21:26 2017	(r321793)
+++ head/sys/nfs/nfs_fha.c	Mon Jul 31 15:23:19 2017	(r321794)
@@ -51,7 +51,6 @@ static MALLOC_DEFINE(M_NFS_FHA, "NFS FHA", "NFS FHA");
 void
 fha_init(struct fha_params *softc)
 {
-	char tmpstr[128];
 	int i;
 
 	for (i = 0; i < FHA_HASH_SIZE; i++)
@@ -61,47 +60,38 @@ fha_init(struct fha_params *softc)
 	 * Set the default tuning parameters.
 	 */
 	softc->ctls.enable = FHA_DEF_ENABLE;
+	softc->ctls.read = FHA_DEF_READ;
+	softc->ctls.write = FHA_DEF_WRITE;
 	softc->ctls.bin_shift = FHA_DEF_BIN_SHIFT;
 	softc->ctls.max_nfsds_per_fh = FHA_DEF_MAX_NFSDS_PER_FH;
 	softc->ctls.max_reqs_per_nfsd = FHA_DEF_MAX_REQS_PER_NFSD;
 
 	/*
-	 * Allow the user to override the defaults at boot time with
-	 * tunables.
+	 * Add sysctls so the user can change the tuning parameters.
 	 */
-	snprintf(tmpstr, sizeof(tmpstr), "vfs.%s.fha.enable",
-	    softc->server_name);
-	TUNABLE_INT_FETCH(tmpstr, &softc->ctls.enable);
-	snprintf(tmpstr, sizeof(tmpstr), "vfs.%s.fha.bin_shift",
-	    softc->server_name);
-	TUNABLE_INT_FETCH(tmpstr, &softc->ctls.bin_shift);
-	snprintf(tmpstr, sizeof(tmpstr), "vfs.%s.fha.max_nfsds_per_fh",
-	    softc->server_name);
-	TUNABLE_INT_FETCH(tmpstr, &softc->ctls.max_nfsds_per_fh);
-	snprintf(tmpstr, sizeof(tmpstr), "vfs.%s.fha.max_reqs_per_nfsd",
-	    softc->server_name);
-	TUNABLE_INT_FETCH(tmpstr, &softc->ctls.max_reqs_per_nfsd);
-
-	/*
-	 * Add sysctls so the user can change the tuning parameters at
-	 * runtime.
-	 */
 	SYSCTL_ADD_UINT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree),
-	    OID_AUTO, "enable", CTLFLAG_RW,
+	    OID_AUTO, "enable", CTLFLAG_RWTUN,
 	    &softc->ctls.enable, 0, "Enable NFS File Handle Affinity (FHA)");
 
 	SYSCTL_ADD_UINT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree),
-	    OID_AUTO, "bin_shift", CTLFLAG_RW,
-	    &softc->ctls.bin_shift, 0, "For FHA reads, no two requests will "
-	    "contend if they're 2^(bin_shift) bytes apart");
+	    OID_AUTO, "read", CTLFLAG_RWTUN,
+	    &softc->ctls.read, 0, "Enable NFS FHA read locality");
 
 	SYSCTL_ADD_UINT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree),
-	    OID_AUTO, "max_nfsds_per_fh", CTLFLAG_RW,
+	    OID_AUTO, "write", CTLFLAG_RWTUN,
+	    &softc->ctls.write, 0, "Enable NFS FHA write locality");
+
+	SYSCTL_ADD_UINT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree),
+	    OID_AUTO, "bin_shift", CTLFLAG_RWTUN,
+	    &softc->ctls.bin_shift, 0, "Maximum locality distance 2^(bin_shift) bytes");
+
+	SYSCTL_ADD_UINT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree),
+	    OID_AUTO, "max_nfsds_per_fh", CTLFLAG_RWTUN,
 	    &softc->ctls.max_nfsds_per_fh, 0, "Maximum nfsd threads that "
 	    "should be working on requests for the same file handle");
 
 	SYSCTL_ADD_UINT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree),
-	    OID_AUTO, "max_reqs_per_nfsd", CTLFLAG_RW,
+	    OID_AUTO, "max_reqs_per_nfsd", CTLFLAG_RWTUN,
 	    &softc->ctls.max_reqs_per_nfsd, 0, "Maximum requests that "
 	    "single nfsd thread should be working on at any time");
 
@@ -144,6 +134,7 @@ fha_extract_info(struct svc_req *req, struct fha_info
 	i->fh = ++random_fh;
 	i->offset = 0;
 	i->locktype = LK_EXCLUSIVE;
+	i->read = i->write = 0;
 
 	/*
 	 * Extract the procnum and convert to v3 form if necessary,
@@ -169,6 +160,9 @@ fha_extract_info(struct svc_req *req, struct fha_info
 	if (cb->no_offset(procnum))
 		goto out;
 
+	i->read = cb->is_read(procnum);
+	i->write = cb->is_write(procnum);
+
 	error = cb->realign(&req->rq_args, M_NOWAIT);
 	if (error)
 		goto out;
@@ -181,7 +175,7 @@ fha_extract_info(struct svc_req *req, struct fha_info
 		goto out;
 
 	/* Content ourselves with zero offset for all but reads. */
-	if (cb->is_read(procnum) || cb->is_write(procnum))
+	if (i->read || i->write)
 		cb->get_offset(&md, &dpos, v3, i);
 
 out:
@@ -311,8 +305,13 @@ fha_hash_entry_choose_thread(struct fha_params *softc,
 			return (thread);
 		}
 
+		/* Check whether we should consider locality. */
+		if ((i->read && !softc->ctls.read) ||
+		    (i->write && !softc->ctls.write))
+			goto noloc;
+
 		/*
-		 * Check for read locality, making sure that we won't
+		 * Check for locality, making sure that we won't
 		 * exceed our per-thread load limit in the process.
 		 */
 		offset1 = i->offset;
@@ -332,6 +331,7 @@ fha_hash_entry_choose_thread(struct fha_params *softc,
 			}
 		}
 
+noloc:
 		/*
 		 * We don't have a locality match, so skip this thread,
 		 * but keep track of the most attractive thread in case

Modified: head/sys/nfs/nfs_fha.h
==============================================================================
--- head/sys/nfs/nfs_fha.h	Mon Jul 31 15:21:26 2017	(r321793)
+++ head/sys/nfs/nfs_fha.h	Mon Jul 31 15:23:19 2017	(r321794)
@@ -31,6 +31,8 @@
 
 /* Sysctl defaults. */
 #define FHA_DEF_ENABLE			1
+#define FHA_DEF_READ			1
+#define FHA_DEF_WRITE			1
 #define FHA_DEF_BIN_SHIFT		22 /* 4MB */
 #define FHA_DEF_MAX_NFSDS_PER_FH	8
 #define FHA_DEF_MAX_REQS_PER_NFSD	0  /* Unlimited */
@@ -39,6 +41,8 @@
 
 struct fha_ctls {
 	int	 enable;
+	int	 read;
+	int	 write;
 	uint32_t bin_shift;
 	uint32_t max_nfsds_per_fh;
 	uint32_t max_reqs_per_nfsd;
@@ -79,6 +83,8 @@ struct fha_info {
 	u_int64_t fh;
 	off_t offset;
 	int locktype;
+	int read;
+	int write;
 };
 
 struct fha_callbacks {
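
For readers who want to see the new gate in fha_hash_entry_choose_thread()
in isolation, here is a minimal userland sketch.  It is not code from the
tree: the two structs are trimmed to the fields the check reads, and
should_consider_locality() is a hypothetical helper name introduced only
for illustration (the kernel code uses a "goto noloc" instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Trimmed stand-in for struct fha_ctls; only the two new knobs. */
struct fha_ctls {
	int read;	/* vfs.nfsd.fha.read:  nonzero = use read locality */
	int write;	/* vfs.nfsd.fha.write: nonzero = use write locality */
};

/* Trimmed stand-in for struct fha_info; only the two new flags. */
struct fha_info {
	int read;	/* request is a read */
	int write;	/* request is a write */
};

/*
 * Hypothetical helper mirroring the condition added in r321794:
 * return false where the kernel code would "goto noloc", that is,
 * when the request is a read or a write and locality for that
 * request type has been switched off.
 */
static bool
should_consider_locality(const struct fha_info *i, const struct fha_ctls *ctls)
{
	assert(i != NULL && ctls != NULL);

	if ((i->read && !ctls->read) || (i->write && !ctls->write))
		return (false);
	return (true);
}
```

With both knobs at their default of 1 the function always returns true,
which matches the log's statement that default behavior is unchanged.
With write set to 0, only write requests skip the locality match; reads
and requests that are neither read nor write are still checked.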