Date:      Tue, 8 Dec 2015 22:01:26 +0000
From:      Steven Hartland <steven@multiplay.co.uk>
To:        Jim Harris <jim.harris@gmail.com>
Cc:        svn-src-head@freebsd.org
Subject:   Re: svn commit: r290199 - in head/sys/dev: nvd nvme
Message-ID:  <56675336.8080104@freebsd.org>
In-Reply-To: <CAJP=Hc9e0eLGZLyu30bnTqw=uZgE5tc3jzrvt4oU580Dr1xu%2BA@mail.gmail.com>
References:  <201510301635.t9UGZI0F085365@repo.freebsd.org> <5667422C.1050806@freebsd.org> <CAJP=Hc9e0eLGZLyu30bnTqw=uZgE5tc3jzrvt4oU580Dr1xu%2BA@mail.gmail.com>

On 08/12/2015 21:17, Jim Harris wrote:
>
>
> On Tue, Dec 8, 2015 at 1:48 PM, Steven Hartland
> <steven@multiplay.co.uk> wrote:
>
>     Hi Jim, could you let me know the use case for exposing the
>     controller stripe size as the disk stripe size, as done by this
>     commit?
>
>     I ask because it actually causes problems for ZFS, which has
>     checks to ensure zpools perform optimally by configuring ashift
>     to match the stripesize, if one is reported.
>
>     This is usually fine, as stripe size typically reports the
>     physical block size of the device, where sectorsize is the
>     logical block size. Unfortunately ashift is currently limited to
>     13 (8KB), so when nvme reports 128KB it is limited to 8KB, and
>     hence every subsequent zpool status reports a warning about
>     optimal performance.
>
>     Before I look to fix one or the other, I wanted to fully
>     understand the reasoning behind how nvme behaves here.
>
>
> Some Intel NVMe controllers have a slow path for I/Os that span a 
> 128KB stripe boundary.  The FreeBSD NVMe driver checks for this 
> condition, and will split the I/O inside of the NVMe driver in these 
> cases, to ensure we do not hit this slow path.
>
> The idea behind reporting the stripe size up through GEOM was to 
> provide a hint to upper layers, especially for file system layout - in 
> hopes of reducing the number of I/Os that need to be split.
>
> Based on your findings, limiting the stripe size reported up through 
> GEOM to 4KB would be OK.  This may result in a small number of
> additional I/Os requiring splitting, but the NVMe I/O path is very
> efficient, so these additional I/Os would cause very minimal (if any)
> difference in performance or CPU utilization.
>
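For reference, the arithmetic discussed above can be sketched roughly as
below. This is only an illustration, not code from ZFS or the nvme
driver: the function names and the ASHIFT_MAX constant are hypothetical,
with the cap of 13 taken from the 8KB limit mentioned in the quoted
message, and the stripe size is assumed to be non-zero.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the ashift cap of 13 (8KB) described in the thread. */
#define ASHIFT_MAX 13

/* Smallest shift such that (1 << shift) >= size. */
static unsigned
ashift_for(uint64_t size)
{
	unsigned shift = 0;

	while (shift < 63 && ((uint64_t)1 << shift) < size)
		shift++;
	return (shift);
}

/* Clamp a requested ashift to the assumed maximum. */
static unsigned
clamp_ashift(unsigned ashift)
{
	return (ashift > ASHIFT_MAX ? ASHIFT_MAX : ashift);
}

/*
 * Return 1 if the I/O covering [offset, offset + len) spans a stripe
 * boundary, i.e. its first and last bytes fall in different stripes.
 */
static int
crosses_stripe(uint64_t offset, uint64_t len, uint64_t stripe)
{
	assert(stripe != 0);
	if (len == 0)
		return (0);
	return ((offset / stripe) != ((offset + len - 1) / stripe));
}
```

With a 128KB stripe, ashift_for(128 * 1024) gives 17, which
clamp_ashift() reduces to 13; that mismatch is what shows up as the
zpool status warning. An 8KB I/O at offset 124KB crosses the 128KB
boundary and would be a candidate for splitting, while a 4KB I/O at the
same offset would not.
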
Thanks for the fast reply Jim, most appreciated. I've created a review
for the change here: https://reviews.freebsd.org/D4446

If you're happy I'll get that committed.

     Regards
     Steve
