Date: Wed, 2 Dec 2020 21:21:29 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Warner Losh <imp@bsdimp.com>
Cc: Michal Meloun <mmel@freebsd.org>, src-committers <src-committers@freebsd.org>, svn-src-all <svn-src-all@freebsd.org>, svn-src-head <svn-src-head@freebsd.org>
Subject: Re: svn commit: r368279 - head/sys/dev/nvme
Message-ID: <X8fpOeryAr0Z%2BRjl@kib.kiev.ua>
In-Reply-To: <CANCZdfpGBcN8cEUO4CuLeVcRjgpwusrA7xfoqB7t8bT4GgziSg@mail.gmail.com>
References: <202012021654.0B2GsOP8000763@repo.freebsd.org> <X8fTZLNuBAt3vzUp@kib.kiev.ua> <CANCZdfpGBcN8cEUO4CuLeVcRjgpwusrA7xfoqB7t8bT4GgziSg@mail.gmail.com>

On Wed, Dec 02, 2020 at 11:17:01AM -0700, Warner Losh wrote:
> On Wed, Dec 2, 2020 at 10:48 AM Konstantin Belousov <kostikbel@gmail.com>
> wrote:
>
> > On Wed, Dec 02, 2020 at 04:54:24PM +0000, Michal Meloun wrote:
> > > Author: mmel
> > > Date: Wed Dec 2 16:54:24 2020
> > > New Revision: 368279
> > > URL: https://svnweb.freebsd.org/changeset/base/368279
> > >
> > > Log:
> > >   NVME: Multiple busdma related fixes.
> > ...
> >
> > >   - in nvme_qpair_submit_tracker(), don't do explicit wmb() also for arm
> > >     and arm64. Bus_dmamap_sync() on these architectures is sufficient to ensure
> > >     that all CPU stores are visible to external (including DMA) observers.
> > > @@ -982,7 +982,7 @@ nvme_qpair_submit_tracker(struct nvme_qpair *qpair, st
> > >
> > >       bus_dmamap_sync(qpair->dma_tag, qpair->queuemem_map,
> > >           BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
> > > -#ifndef __powerpc__
> > > +#if !defined( __powerpc__) && !defined( __aarch64__) && !defined( __arm__)
> > >       /*
> > >        * powerpc's bus_dmamap_sync() already includes a heavyweight sync, but
> > >        * no other archs do.
> > Does anybody have any evidence that the wmb() below is useful ?
> > For instance, on x86, does nvme driver use any write-combining mappings ?
> >
>
> It translates to a sfence on x86. It is done just before the write
> that moves the tail pointer. sfence ensures that all writes are done before
> that write is done. I believe that the nvme spec requires that the entire
> submission queue entry be fully filled out before the write to the tailq
> pointer moving it.[*]

Right, and SFENCE is mostly a slow NOP, except in situations where we deal
with write-combining memory or some specific instructions (CLFLUSHOPT etc).
Moreover, IN/OUT and uncached memory accesses, like BAR I/O, provide the
strongest serialization.

>
> I'm not enough of an expert on the exact details here to know if it's
> absolutely required or not, but given the ordering requirements in the
> spec, the intent appears to be related to enforcing that ordering. I don't
> know enough to know if it accomplishes this goal or not.

My point is that, unless there are some additional reasons beyond the need
to order writes, SFENCE is not needed and probably slows down the driver
somewhat. It might even be measurable for NVMe devices that handle several
million IOPS.

>
> Warner
>
> [*] I know I said in the review it may be due to getting lower latency to
> the device, but I now believe that to be mistaken after studying it further.
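
For readers following the ordering argument, here is a minimal user-space sketch of the pattern under discussion: fill in the whole submission queue entry, order those stores, then publish the new tail. It is not the nvme(4) driver code. Every name in it (sim_queue, sq_entry, sim_submit, SQ_ENTRIES) is hypothetical, the "doorbell" is an ordinary atomic variable rather than a BAR register, and a C11 release fence stands in for the kernel's bus_dmamap_sync()/wmb() pair. A C11 fence only orders accesses as seen by other CPU threads; it makes no promise about device or DMA visibility.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SQ_ENTRIES 8u	/* hypothetical ring size */

/* Hypothetical stand-in for a submission queue entry. */
struct sq_entry {
	uint32_t opcode;
	uint32_t cid;
};

struct sim_queue {
	struct sq_entry ring[SQ_ENTRIES];	/* stands in for queue memory */
	uint32_t tail;				/* producer's private tail index */
	_Atomic uint32_t doorbell;		/* stands in for the tail doorbell */
};

/* Fill the next slot completely, then publish the new tail index. */
static void
sim_submit(struct sim_queue *q, uint32_t opcode, uint32_t cid)
{
	assert(q->tail < SQ_ENTRIES);

	/* 1. Plain stores that fill in the entire entry. */
	q->ring[q->tail].opcode = opcode;
	q->ring[q->tail].cid = cid;
	q->tail = (q->tail + 1) % SQ_ENTRIES;

	/*
	 * 2. Order the entry stores before the doorbell store.  On x86
	 * with ordinary write-back memory this emits no fence
	 * instruction; it only constrains the compiler.
	 */
	atomic_thread_fence(memory_order_release);

	/* 3. Publish the new tail; an observer may now read the entry. */
	atomic_store_explicit(&q->doorbell, q->tail, memory_order_relaxed);
}
```

Calling sim_submit(&q, 0x02, 7) on a zero-initialized queue leaves the entry in slot 0 and the doorbell at 1. That step 2 costs nothing on x86 for write-back memory is the same observation made above about SFENCE: store ordering alone does not require it there, so an explicit fence would need some other justification, such as write-combining mappings.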
