Date: Fri, 11 Mar 2016 11:33:14 -0700
From: Jim Harris <jim.harris@gmail.com>
To: Warner Losh <imp@bsdimp.com>
Cc: Alan Somers <asomers@freebsd.org>, Alexander Motin <mav@freebsd.org>, Steven Hartland <smh@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>
Subject: Re: svn commit: r292074 - in head/sys/dev: nvd nvme
Message-ID: <CAJP=Hc-kzyYtDk5ofXkXinxBaKhUS7uDesUYApjeCV24V9P4wg@mail.gmail.com>
In-Reply-To: <CANCZdfqWM8u5z_=5NGPzeN5oY_Fvgx4yLSdw_beU5P9TvmKiGQ@mail.gmail.com>
References: <201512110206.tBB264Ad039486@repo.freebsd.org> <CAOtMX2gAmt_--_vs6M=be9nShkCpKbwzK-K_N4t1MahMijyoog@mail.gmail.com> <CANCZdfp3aq4Ysb%2Bwbew-KjUvg7yqbzoqLSS82hKQQut=QRJQbQ@mail.gmail.com> <CAOtMX2h16eb=W9VC-hMtuHLknv8pEzA6OxP9=5uFtrftYsBTvw@mail.gmail.com> <56E28ABD.3060803@FreeBSD.org> <CAOtMX2gYj_GA%2BoYKh7Lav2N_kzipECAda9wEKOW09O_=7XBf-w@mail.gmail.com> <CANCZdfrQMKY0BAvafbdWC5ENvmGCj4VqwWjpw_p1cn0xXwXYvw@mail.gmail.com> <CANCZdfqWM8u5z_=5NGPzeN5oY_Fvgx4yLSdw_beU5P9TvmKiGQ@mail.gmail.com>

On Fri, Mar 11, 2016 at 9:31 AM, Warner Losh <imp@bsdimp.com> wrote:

> And keep in mind the original description was this:
>
> Quote:
>
> Intel NVMe controllers have a slow path for I/Os that span
> a 128KB stripe boundary but ZFS limits ashift, which is derived
> from d_stripesize, to 13 (8KB) so we limit the stripesize
> reported to geom(8) to 4KB.
>
> This may result in a small number of additional I/Os
> to require splitting in nvme(4), however the NVMe I/O
> path is very efficient so these additional I/Os will cause
> very minimal (if any) difference in performance or
> CPU utilisation.
>
> unquote
>
> so the issue seems to be getting blown up a bit. It's better if you
> don't generate these I/Os, but the driver copes by splitting them
> on the affected drives, causing a small inefficiency because you're
> increasing the number of I/Os needed to do the I/O, cutting into the
> IOPS budget.
>
> Warner

Warner is correct. This is specific to some of the Intel NVMe
controllers. The core nvme(4) driver detects Intel controllers that
benefit from splitting I/O crossing 128KB stripe boundaries, and does
the splitting internally. Reporting this stripe size further up the
stack serves only to reduce the number of I/Os that require this
splitting.

In practice, there is no noticeable impact on performance or latency
when splitting I/O on 128KB boundaries. Larger I/Os are more likely to
require splitting, but with larger I/Os you will hit overall bandwidth
limits before getting close to IOPS limits.

-Jim
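
For readers who want to see what splitting on a 128KB stripe boundary means in terms of offsets and lengths, here is a minimal sketch in Python. It is not the nvme(4) code (the driver is written in C and works on bio requests). The function names and the byte-offset interface are made up for illustration, and the only value taken from the message above is the 128KB stripe size.

```python
# Illustrative only: names and interface are hypothetical, not from nvme(4).
STRIPE_SIZE = 128 * 1024  # bytes; the 128KB boundary discussed in the message


def crosses_boundary(offset, length, stripe_size=STRIPE_SIZE):
    """Return True if the byte range [offset, offset + length) spans a stripe boundary."""
    if length <= 0:
        return False
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    return first_stripe != last_stripe


def split_on_stripe(offset, length, stripe_size=STRIPE_SIZE):
    """Split one I/O into (offset, length) pieces that each stay within one stripe."""
    if offset < 0 or length < 0 or stripe_size <= 0:
        raise ValueError("offset and length must be >= 0, stripe_size must be > 0")
    pieces = []
    end = offset + length
    while offset < end:
        # First stripe boundary strictly after the current offset.
        next_boundary = (offset // stripe_size + 1) * stripe_size
        piece_end = min(end, next_boundary)
        pieces.append((offset, piece_end - offset))
        offset = piece_end
    return pieces
```

With this sketch, an 8KB I/O starting at 126KB becomes two pieces (2KB up to the boundary, then 6KB after it), while an aligned 128KB I/O starting at offset 0 stays whole. That is the sense in which each boundary crossing costs one extra I/O.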