From: Alan Somers
To: Warner Losh
Cc: Steven Hartland, "src-committers@freebsd.org",
    "svn-src-all@freebsd.org", "svn-src-head@freebsd.org"
Date: Thu, 10 Mar 2016 21:58:45 -0700
Subject: Re: svn commit: r292074 - in head/sys/dev: nvd nvme
References: <201512110206.tBB264Ad039486@repo.freebsd.org>

Do they behave badly for writes that cross a 128KB boundary, but are
nonetheless aligned to 128KB boundaries? Then I don't understand how
this change (or mav's replacement) is supposed to help. The stripesize
is supposed to be the minimum write that the device can accept without
requiring a read-modify-write. ZFS guarantees that it will never issue
a write smaller than the stripesize, nor will it ever issue a write
that is not aligned to a stripesize boundary. But even if ZFS worked
with 128KB stripesizes, it would still happily issue writes that are a
multiple of 128KB in size, and these would cross those boundaries. Am I
not understanding something here?

-Alan

On Thu, Mar 10, 2016 at 9:34 PM, Warner Losh wrote:

> Some Intel NVMe drives behave badly when the LBA range crosses a 128k
> boundary. Their performance is worse for those transactions than for
> ones that don't cross the 128k boundary.
>
> Warner
>
> On Thu, Mar 10, 2016 at 11:01 AM, Alan Somers wrote:
>
>> Are you saying that Intel NVMe controllers perform poorly for all I/Os
>> that are less than 128KB, or just for I/Os of any size that cross a 128KB
>> boundary?
>>
>> On Thu, Dec 10, 2015 at 7:06 PM, Steven Hartland wrote:
>>
>>> Author: smh
>>> Date: Fri Dec 11 02:06:03 2015
>>> New Revision: 292074
>>> URL: https://svnweb.freebsd.org/changeset/base/292074
>>>
>>> Log:
>>>   Limit stripesize reported from nvd(4) to 4K
>>>
>>>   Intel NVMe controllers have a slow path for I/Os that span a 128KB
>>>   stripe boundary but ZFS limits ashift, which is derived from d_stripesize,
>>>   to 13 (8KB) so we limit the stripesize reported to geom(8) to 4KB.
>>>
>>>   This may result in a small number of additional I/Os to require
>>>   splitting in nvme(4), however the NVMe I/O path is very efficient so these
>>>   additional I/Os will cause very minimal (if any) difference in performance
>>>   or CPU utilisation.
>>>
>>>   This can be controlled by the new sysctl
>>>   kern.nvme.max_optimal_sectorsize.
>>>
>>>   MFC after: 1 week
>>>   Sponsored by: Multiplay
>>>   Differential Revision: https://reviews.freebsd.org/D4446
>>>
>>> Modified:
>>>   head/sys/dev/nvd/nvd.c
>>>   head/sys/dev/nvme/nvme.h
>>>   head/sys/dev/nvme/nvme_ns.c
>>>   head/sys/dev/nvme/nvme_sysctl.c
>>>
>
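
The boundary question raised in this thread can be made concrete with a
small sketch. The Python below is illustrative only and is not taken from
nvd(4), nvme(4), or ZFS; the names BOUNDARY, crosses_boundary, and
split_at_boundaries are hypothetical. It assumes byte offsets and a fixed
128KB boundary, and shows that a write which is aligned to 128KB and is a
multiple of 128KB in size can still span more than one 128KB window.

```python
KB = 1024
BOUNDARY = 128 * KB  # the 128KB boundary discussed in the thread (assumed fixed)


def crosses_boundary(offset, length, boundary=BOUNDARY):
    """Return True if the byte range [offset, offset + length) touches
    more than one boundary-sized window."""
    if length <= 0:
        return False
    first_window = offset // boundary
    last_window = (offset + length - 1) // boundary
    return first_window != last_window


def split_at_boundaries(offset, length, boundary=BOUNDARY):
    """Split an I/O into (offset, length) pieces, none of which crosses
    a boundary. This only illustrates the idea of splitting; it is not
    how any particular driver implements it."""
    pieces = []
    end = offset + length
    while offset < end:
        next_boundary = (offset // boundary + 1) * boundary
        chunk_end = min(end, next_boundary)
        pieces.append((offset, chunk_end - offset))
        offset = chunk_end
    return pieces


# An aligned 128KB write stays inside one window.
aligned_single = crosses_boundary(0, 128 * KB)

# An aligned 256KB write (a multiple of 128KB) spans two windows.
aligned_double = crosses_boundary(0, 256 * KB)

# A small 8KB write that straddles the 128KB mark also spans two windows.
small_straddle = crosses_boundary(124 * KB, 8 * KB)
```

Under these assumptions, a 128KB stripesize would keep small writes from
straddling a boundary, but any larger aligned write would still need to be
split into one piece per 128KB window.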