Date: Tue, 8 Dec 2015 22:01:26 +0000
From: Steven Hartland <steven@multiplay.co.uk>
To: Jim Harris <jim.harris@gmail.com>
Cc: svn-src-head@freebsd.org
Subject: Re: svn commit: r290199 - in head/sys/dev: nvd nvme
Message-ID: <56675336.8080104@freebsd.org>
In-Reply-To: <CAJP=Hc9e0eLGZLyu30bnTqw=uZgE5tc3jzrvt4oU580Dr1xu%2BA@mail.gmail.com>
References: <201510301635.t9UGZI0F085365@repo.freebsd.org> <5667422C.1050806@freebsd.org> <CAJP=Hc9e0eLGZLyu30bnTqw=uZgE5tc3jzrvt4oU580Dr1xu%2BA@mail.gmail.com>

On 08/12/2015 21:17, Jim Harris wrote:
>
> On Tue, Dec 8, 2015 at 1:48 PM, Steven Hartland
> <steven@multiplay.co.uk> wrote:
>
>     Hi Jim, could you let me know the use case for exposing the
>     controller stripe size as the disk stripe size, as done by this
>     commit?
>
>     I ask as it actually causes problems for ZFS, which has checks to
>     ensure zpools perform optimally by correctly configuring ashift to
>     match the stripesize if reported.
>
>     This is usually fine, as stripe size typically reports the physical
>     block size of the device, where sectorsize is the logical block
>     size. Unfortunately ashift is currently limited to 13 (8KB), so
>     when nvme reports 128KB ZFS limits it to 8KB, and hence every
>     subsequent zpool status reports a warning about optimal performance.
>
>     Before I look to fix one or the other, I wanted to fully
>     understand the reasoning behind how nvme behaves here.
>
>
> Some Intel NVMe controllers have a slow path for I/Os that span a
> 128KB stripe boundary. The FreeBSD NVMe driver checks for this
> condition, and will split the I/O inside of the NVMe driver in these
> cases, to ensure we do not hit this slow path.
>
> The idea behind reporting the stripe size up through GEOM was to
> provide a hint to upper layers, especially for file system layout - in
> hopes of reducing the number of I/Os that need to be split.
>
> Based on your findings, limiting the stripe size reported up through
> GEOM to 4KB would be OK. This may result in some small number of
> additional I/Os requiring splitting, but the NVMe I/O path is very
> efficient, so these additional I/Os would cause very minimal (if any)
> difference in performance or CPU utilization.
>

Thanks for the fast reply Jim, most appreciated.

I've created a review for the change here:
https://reviews.freebsd.org/D4446

If you're happy I'll get that committed.

    Regards
    Steve
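
For readers following the thread, here is a rough sketch of the two checks discussed above: deriving an ashift from the reported stripe size with a cap of 13 (8KB), and testing whether an I/O spans a 128KB stripe boundary. This is not the actual ZFS or nvme(4) code; the function names and the ASHIFT_MAX_SKETCH constant are made up for illustration, and the real logic in either subsystem may well differ.

```c
#include <stdint.h>

/* Hypothetical cap, taken from the "ashift of 13 (8KB)" limit in the thread. */
#define ASHIFT_MAX_SKETCH 13

/* Smallest shift with (1 << shift) >= size; returns 0 for size 0 or 1. */
unsigned
size_to_shift(uint64_t size)
{
	unsigned shift = 0;

	while (shift < 63 && ((uint64_t)1 << shift) < size)
		shift++;
	return (shift);
}

/*
 * Pick an ashift from the logical sector size and the reported stripe
 * size (0 means "not reported").  The stripe-size hint is clamped to
 * the cap, but the result never drops below the logical sector size.
 */
unsigned
pick_ashift(uint64_t sectorsize, uint64_t stripesize)
{
	unsigned logical = size_to_shift(sectorsize);
	unsigned physical = stripesize != 0 ? size_to_shift(stripesize) : logical;

	if (physical > ASHIFT_MAX_SKETCH)
		physical = ASHIFT_MAX_SKETCH;
	return (physical > logical ? physical : logical);
}

/* Nonzero if [offset, offset + length) touches more than one stripe. */
int
crosses_stripe(uint64_t offset, uint64_t length, uint64_t stripe)
{
	if (stripe == 0 || length == 0)
		return (0);
	return ((offset / stripe) != ((offset + length - 1) / stripe));
}
```

Under these assumptions, a 512-byte-sector device reporting a 128KB stripe size would ask for ashift 17 and be clamped to 13, which corresponds to the mismatch behind the zpool status warning. Reporting 4KB instead gives ashift 12. An 8KB I/O starting at offset 124KB crosses the 128KB boundary and would be a candidate for splitting in the driver.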