From owner-svn-src-all@freebsd.org Mon Jan 25 09:40:27 2016 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9D7ECA31D0C; Mon, 25 Jan 2016 09:40:27 +0000 (UTC) (envelope-from smh@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7167C770; Mon, 25 Jan 2016 09:40:27 +0000 (UTC) (envelope-from smh@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id u0P9eQUj029828; Mon, 25 Jan 2016 09:40:26 GMT (envelope-from smh@FreeBSD.org) Received: (from smh@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id u0P9eQTj029824; Mon, 25 Jan 2016 09:40:26 GMT (envelope-from smh@FreeBSD.org) Message-Id: <201601250940.u0P9eQTj029824@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: smh set sender to smh@FreeBSD.org using -f From: Steven Hartland Date: Mon, 25 Jan 2016 09:40:26 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org Subject: svn commit: r294711 - in stable/10/sys/dev: nvd nvme X-SVN-Group: stable-10 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Jan 2016 09:40:27 -0000 Author: smh Date: Mon Jan 25 09:40:25 2016 New Revision: 294711 URL: https://svnweb.freebsd.org/changeset/base/294711 Log: MFC r292074: Limit stripesize reported from nvd(4) to 4K Sponsored by: Multiplay Modified: stable/10/sys/dev/nvd/nvd.c stable/10/sys/dev/nvme/nvme.h stable/10/sys/dev/nvme/nvme_ns.c stable/10/sys/dev/nvme/nvme_sysctl.c Directory Properties: stable/10/ (props changed) Modified: stable/10/sys/dev/nvd/nvd.c ============================================================================== --- stable/10/sys/dev/nvd/nvd.c Mon Jan 25 09:31:32 2016 (r294710) +++ stable/10/sys/dev/nvd/nvd.c Mon Jan 25 09:40:25 2016 (r294711) @@ -295,7 +295,7 @@ nvd_new_disk(struct nvme_namespace *ns, disk->d_sectorsize = nvme_ns_get_sector_size(ns); disk->d_mediasize = (off_t)nvme_ns_get_size(ns); disk->d_delmaxsize = (off_t)nvme_ns_get_size(ns); - disk->d_stripesize = nvme_ns_get_stripesize(ns); + disk->d_stripesize = nvme_ns_get_optimal_sector_size(ns); if (TAILQ_EMPTY(&disk_head)) disk->d_unit = 0; Modified: stable/10/sys/dev/nvme/nvme.h ============================================================================== --- stable/10/sys/dev/nvme/nvme.h Mon Jan 25 09:31:32 2016 (r294710) +++ stable/10/sys/dev/nvme/nvme.h Mon Jan 25 09:40:25 2016 (r294711) @@ -870,6 +870,7 @@ const char * nvme_ns_get_serial_number(s const char * nvme_ns_get_model_number(struct nvme_namespace *ns); const struct nvme_namespace_data * nvme_ns_get_data(struct nvme_namespace *ns); +uint32_t nvme_ns_get_optimal_sector_size(struct nvme_namespace *ns); uint32_t nvme_ns_get_stripesize(struct nvme_namespace *ns); int nvme_ns_bio_process(struct nvme_namespace *ns, struct bio *bp, Modified: stable/10/sys/dev/nvme/nvme_ns.c ============================================================================== --- stable/10/sys/dev/nvme/nvme_ns.c Mon Jan 25 09:31:32 2016 (r294710) +++ stable/10/sys/dev/nvme/nvme_ns.c Mon Jan 25 09:40:25 2016 (r294711) @@ -45,6 +45,8 @@ __FBSDID("$FreeBSD$"); #include "nvme_private.h" +extern int nvme_max_optimal_sectorsize; + static void nvme_bio_child_inbed(struct bio *parent, int bio_error); static void nvme_bio_child_done(void *arg, const struct nvme_completion *cpl); @@ -217,6 +219,22 @@ nvme_ns_get_stripesize(struct nvme_names return (ns->stripesize); } +uint32_t +nvme_ns_get_optimal_sector_size(struct nvme_namespace *ns) +{ + uint32_t stripesize; + + stripesize = nvme_ns_get_stripesize(ns); + + if (stripesize == 0) + return nvme_ns_get_sector_size(ns); + + if (nvme_max_optimal_sectorsize == 0) + return (stripesize); + + return (MIN(stripesize, nvme_max_optimal_sectorsize)); +} + static void nvme_ns_bio_done(void *arg, const struct nvme_completion *status) { Modified: stable/10/sys/dev/nvme/nvme_sysctl.c ============================================================================== --- stable/10/sys/dev/nvme/nvme_sysctl.c Mon Jan 25 09:31:32 2016 (r294710) +++ stable/10/sys/dev/nvme/nvme_sysctl.c Mon Jan 25 09:40:25 2016 (r294711) @@ -33,6 +33,22 @@ __FBSDID("$FreeBSD$"); #include "nvme_private.h" +SYSCTL_NODE(_kern, OID_AUTO, nvme, CTLFLAG_RD, 0, "NVM Express"); +/* + * Intel NVMe controllers have a slow path for I/Os that span a 128KB + * stripe boundary but ZFS limits ashift, which is derived from + * d_stripesize, to 13 (8KB) so we limit the stripesize reported to + * geom(8) to 4KB by default. + * + * This may result in a small number of additional I/Os to require + * splitting in nvme(4), however the NVMe I/O path is very efficient + * so these additional I/Os will cause very minimal (if any) difference + * in performance or CPU utilisation. + */ +int nvme_max_optimal_sectorsize = 1<<12; +SYSCTL_INT(_kern_nvme, OID_AUTO, max_optimal_sectorsize, CTLFLAG_RWTUN, + &nvme_max_optimal_sectorsize, 0, "The maximum optimal sectorsize reported"); + /* * CTLTYPE_S64 and sysctl_handle_64 were added in r217616. Define these * explicitly here for older kernels that don't include the r217616