From owner-svn-src-all@freebsd.org Sat Aug 4 15:11:19 2018 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 77B43106E3EA for ; Sat, 4 Aug 2018 15:11:19 +0000 (UTC) (envelope-from ian@freebsd.org) Received: from pmta2.delivery6.ore.mailhop.org (pmta2.delivery6.ore.mailhop.org [54.200.129.228]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 019BF7B9E7 for ; Sat, 4 Aug 2018 15:11:18 +0000 (UTC) (envelope-from ian@freebsd.org) X-MHO-RoutePath: aGlwcGll X-MHO-User: a1eda9cb-97f8-11e8-904b-1d2e466b3c59 X-Report-Abuse-To: https://support.duocircle.com/support/solutions/articles/5000540958-duocircle-standard-smtp-abuse-information X-Originating-IP: 67.177.211.60 X-Mail-Handler: DuoCircle Outbound SMTP Received: from ilsoft.org (unknown [67.177.211.60]) by outbound2.ore.mailhop.org (Halon) with ESMTPSA id a1eda9cb-97f8-11e8-904b-1d2e466b3c59; Sat, 04 Aug 2018 15:11:16 +0000 (UTC) Received: from rev (rev [172.22.42.240]) by ilsoft.org (8.15.2/8.15.2) with ESMTP id w74FBFoD046128; Sat, 4 Aug 2018 09:11:15 -0600 (MDT) (envelope-from ian@freebsd.org) Message-ID: <1533395475.9860.9.camel@freebsd.org> Subject: Re: svn commit: r337273 - head/sys/dev/nvme From: Ian Lepore To: Justin Hibbits , Konstantin Belousov Cc: John Baldwin , src-committers , svn-src-all@freebsd.org, svn-src-head@freebsd.org Date: Sat, 04 Aug 2018 09:11:15 -0600 In-Reply-To: References: <201808032004.w73K46XJ053249@repo.freebsd.org> <20180804080840.GI6049@kib.kiev.ua> <20180804130315.GK6049@kib.kiev.ua> Content-Type: text/plain; charset="ISO-8859-1" X-Mailer: Evolution 3.18.5.1 FreeBSD GNOME Team Port Mime-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Aug 2018 15:11:19 -0000 On Sat, 2018-08-04 at 08:29 -0500, Justin Hibbits wrote: > On Sat, Aug 4, 2018, 08:03 Konstantin Belousov wrote: > > > > > On Sat, Aug 04, 2018 at 05:14:31AM -0700, John Baldwin wrote: > > > > > > On 8/4/18 1:08 AM, Konstantin Belousov wrote: > > > > > > > > On Fri, Aug 03, 2018 at 08:04:06PM +0000, Justin Hibbits wrote: > > > > > > > > > > Author: jhibbits > > > > > Date: Fri Aug  3 20:04:06 2018 > > > > > New Revision: 337273 > > > > > URL: https://svnweb.freebsd.org/changeset/base/337273 > > > > > > > > > > Log: > > > > >   nvme(4): Add bus_dmamap_sync() at the end of the request path > > > > > > > > > >   Summary: > > > > >   Some architectures, in this case powerpc64, need explicit > > synchronization > > > > > > > > > > > > > > > > >   barriers vs device accesses. > > > > > > > > > >   Prior to this change, when running 'make buildworld -j72' on a > > 18-core > > > > > > > > > > > > > > > > >   (72-thread) POWER9, I would see controller resets often.  With this > > change, I > > > > > > > > > > > > > > > > >   don't see these resets messages, though another tester still does, > > for yet to be > > > > > > > > > > > > > > > > >   determined reasons, so this may not be a complete fix. > > Additionally, I see a > > > > > > > > > > > > > > > > >   ~5-10% speed up in buildworld times, likely due to not needing to > > reset the > > > > > > > > > > > > > > > > >   controller. > > > > > > > > > >   Reviewed By: jimharris > > > > >   Differential Revision: https://reviews.freebsd.org/D16570 > > > > > > > > > > Modified: > > > > >   head/sys/dev/nvme/nvme_qpair.c > > > > > > > > > > Modified: head/sys/dev/nvme/nvme_qpair.c > > > > > > > ============================================================================== > > > > > > > > > > > > > > > > > --- head/sys/dev/nvme/nvme_qpair.c Fri Aug  3 19:24:04 2018 > > (r337272) > > > > > > > > > > > > > > > > > +++ head/sys/dev/nvme/nvme_qpair.c Fri Aug  3 20:04:06 2018 > > (r337273) > > > > > > > > > > > > > > > > > @@ -401,9 +401,13 @@ nvme_qpair_complete_tracker(struct nvme_qpair > > *qpair, > > > > > > > > > > > > > > > > >            req->retries++; > > > > >            nvme_qpair_submit_tracker(qpair, tr); > > > > >    } else { > > > > > -          if (req->type != NVME_REQUEST_NULL) > > > > > +          if (req->type != NVME_REQUEST_NULL) { > > > > > +                  bus_dmamap_sync(qpair->dma_tag_payload, > > > > > +                      tr->payload_dma_map, > > > > > +                      BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE); > > > > >                    bus_dmamap_unload(qpair->dma_tag_payload, > > > > >                        tr->payload_dma_map); > > > > > +          } > > > > > > > > > >            nvme_free_request(req); > > > > >            tr->req = NULL; > > > > > @@ -487,6 +491,8 @@ nvme_qpair_process_completions(struct nvme_qpair > > *qpai > > > > > > > > > > > > > > > > >             */ > > > > >            return (false); > > > > > > > > > > +  bus_dmamap_sync(qpair->dma_tag, qpair->queuemem_map, > > > > > +      BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE); > > > > >    while (1) { > > > > >            cpl = qpair->cpl[qpair->cq_head]; > > > > > > > > > > @@ -828,7 +834,16 @@ nvme_qpair_submit_tracker(struct nvme_qpair > > *qpair, st > > > > > > > > > > > > > > > > >    if (++qpair->sq_tail == qpair->num_entries) > > > > >            qpair->sq_tail = 0; > > > > > > > > > > +  bus_dmamap_sync(qpair->dma_tag, qpair->queuemem_map, > > > > > +      BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); > > > > > +#ifndef __powerpc__ > > > > > +  /* > > > > > +   * powerpc's bus_dmamap_sync() already includes a heavyweight > > sync, but > > > > > > > > > > > > > > > > > +   * no other archs do. > > > > > +   */ > > > > >    wmb(); > > > > > +#endif > > > > What is the purpose of this call ?  It is useless without paired read > > > > barrier.  So where is the reciprocal rmb() ? > > > For DMA, the rmb is in the device controller.  However, architectures > > > that need this kind of ordering should do it in their bus_dmmap_sync op, > > > and this explicit one needs to be removed.  (Alpha had a wmb in its > > > bus_dmamap_sync op for this reason.) > > Yes, if something special is needed, it should happen in platform-specific > > busdma code. > > > > Also, if wmb() is needed, then it is not a supposed semantic or > > wmb(), but a specific side-effects of one of the instruction in the > > implementation of wmb(). > > > > As I noted, on x86 it is not needed and detrimental to the performance. > > > According to Jim Harris it is needed for x86. I tried to remove it thinking > it was unnecessary with the sync in place now.  The wmb() was added way > back in r240616 by him, when only x86 was supported by the driver. > bus_dmamap_sync() for all archs except powerpc lack a barrier, from what I > can see.  See the review for more discussion that took place. If it isn't > needed I will gladly remove it. > The arm busdma sync routines definitely invoke the necessary barriers, which on arm are spelled 'dsb' and 'isb' and are in the low-level routines called from the busdma code, such as dcache_wb_pou(), dcache_inv_poc(), etc. -- Ian