Date: Thu, 27 Dec 2012 07:26:19 -0800
From: Attilio Rao <attilio@freebsd.org>
To: Gleb Smirnoff <glebius@freebsd.org>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject: Re: svn commit: r244732 - head/sys/sys
Message-ID: <CAJ-FndAzqHCNtLGFH=6Cm3rshMestSZz_naF1=saMEKuX9cyog@mail.gmail.com>
In-Reply-To: <CAJ-FndC6Xq4EWcU203E4ucgr=jzOAutBBBkn+O1Qs0nL0i_Q3A@mail.gmail.com>
References: <201212271236.qBRCawuU078203@svn.freebsd.org>
 <20121227124657.GX80310@FreeBSD.org>
 <CAJ-FndD9aDfPprwBYC+3T1WsfE1b4aZJENRAjo+hFEL1NLBKmw@mail.gmail.com>
 <20121227132507.GY80310@FreeBSD.org>
 <CAJ-FndC6Xq4EWcU203E4ucgr=jzOAutBBBkn+O1Qs0nL0i_Q3A@mail.gmail.com>
On Thu, Dec 27, 2012 at 5:53 AM, Attilio Rao <attilio@freebsd.org> wrote:
> On Thu, Dec 27, 2012 at 5:25 AM, Gleb Smirnoff <glebius@freebsd.org> wrote:
>> On Thu, Dec 27, 2012 at 04:55:22AM -0800, Attilio Rao wrote:
>> A> On Thu, Dec 27, 2012 at 4:46 AM, Gleb Smirnoff <glebius@freebsd.org> wrote:
>> A> > Attilio,
>> A> >
>> A> >   On Thu, Dec 27, 2012 at 12:36:58PM +0000, Attilio Rao wrote:
>> A> > A> Author: attilio
>> A> > A> Date: Thu Dec 27 12:36:58 2012
>> A> > A> New Revision: 244732
>> A> > A> URL: http://svnweb.freebsd.org/changeset/base/244732
>> A> > A>
>> A> > A> Log:
>> A> > A>   The br_prod_tail and br_cons_tail members are used as barriers to
>> A> > A>   signal buf_ring ownership. However, instructions can be reordered
>> A> > A>   around these members' writes, leading to stale values for, e.g.,
>> A> > A>   br_prod_bufs.
>> A> > A>
>> A> > A>   Use correct memory barriers to ensure proper ordering of the
>> A> > A>   ownership token updates.
>> A> > A>
>> A> > A>   Sponsored by:	EMC / Isilon storage division
>> A> > A>   MFC after:	2 weeks
>> A> >
>> A> > Have you profiled this?
>> A> >
>> A> > After this change the buf_ring actually gains its own hand-rolled
>> A> > mutex:
>> A> >
>> A> >   while (atomic_load_acq_32(&br->br_prod_tail) != prod_head)
>> A> >           cpu_spinwait();
>> A> >
>> A> > The only difference from mutex(9) is that this one isn't monitored
>> A> > by WITNESS.
>> A>
>> A> I think you are not correct. It doesn't disable interrupts (as
>> A> spinlocks do) and it doesn't sleep.
>> A> So your analogy is completely off.
>> A>
>> A> Also, on x86 atomic_store_rel_*() is a simple write. The only thing
>> A> that really changes is the atomic_load_acq_*(), which introduces a
>> A> locked instruction.
>>
>> That one thing, the locked instruction, affects performance a lot. I
>> suspect a strong forwarding degradation after your change. Can you please
>> profile that?
>
> Yes, but it is a matter of incorrect code vs. a slower instruction.
> Also, you are not considering that there are much heavier-weight
> instructions already (wmb(), rmb(), which I'm going to change soon
> into actual barriers, btw). I highly doubt you can measure the latency
> introduced by atomic_load_acq_*() when mfence and the like are in place.
> The pessimization should only account for a small fraction of the
> overall performance.
>
>> A> > The idea behind buf_ring was lockless storing and lockless fetching
>> A> > from a ring, and now this has vanished.
>> A> >
>> A> > What does your change actually fix, except precise accounting of
>> A> > br_prod_bufs, which is actually unused and should better be garbage
>> A> > collected rather than fixed?
>> A>
>> A> The write of br_prod_tail must happen as the very last thing, i.e.
>> A> after the whole buf setup. The only way you can enforce this is with a
>> A> memory barrier. I can double-check whether we can garbage collect
>> A> br_prod_bufs, but that alone would not be enough.
>>
>> Do you have a core file that illustrates that a ring can get into an
>> inconsistent state?
>
> I don't, I got it by code inspection. The br_prod_tail update must
> happen as the very last thing, because it means the buf is "ready-to-go"
> and it will be owned.
>
> However, the prior wmb() may be helpful in this case, at least for one
> case. I will do a follow-up soon. For a longer discussion, I plan to
> move this into a real atomic_store_rel_*() soon.
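To make the ordering concrete, here is a rough sketch of the producer path
as it looks after r244732 (a sketch only, not a verbatim copy of
sys/buf_ring.h; the function name is made up and the member set is
abbreviated):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <machine/atomic.h>
#include <machine/cpu.h>	/* cpu_spinwait() */
#include <sys/buf_ring.h>	/* struct buf_ring */

/* Hypothetical name, for illustration only. */
static __inline int
buf_ring_enqueue_sketch(struct buf_ring *br, void *buf)
{
	uint32_t prod_head, prod_next, cons_tail;

	critical_enter();
	do {
		prod_head = br->br_prod_head;
		prod_next = (prod_head + 1) & br->br_prod_mask;
		cons_tail = br->br_cons_tail;
		if (prod_next == cons_tail) {
			/* The ring is full. */
			critical_exit();
			return (ENOBUFS);
		}
	} while (!atomic_cmpset_int(&br->br_prod_head, prod_head,
	    prod_next));

	/* The slot is reserved for us; fill it before publishing it. */
	br->br_ring[prod_head] = buf;

	/*
	 * Wait for earlier enqueuers to publish their slots, then publish
	 * ours. The release semantic on the br_prod_tail store keeps the
	 * br_ring[] write above visible before a consumer can observe the
	 * new tail and claim the slot.
	 */
	while (atomic_load_acq_32(&br->br_prod_tail) != prod_head)
		cpu_spinwait();
	atomic_store_rel_32(&br->br_prod_tail, prod_next);
	critical_exit();
	return (0);
}

Without the release on the br_prod_tail store, the compiler or the CPU is
free to let the br_ring[] write (or the stats updates) drift past the tail
update, and a consumer could then dequeue a slot whose contents are not
visible yet.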
Speaking of which, since you are here: I just found out that r241037
breaks the alignment of the structure. In fact, the padding member is not
updated accordingly.

We don't have a param describing the L2 cache line size, but I think we
can safely align these members to the L1 cache line. Also, note that this
padding is completely broken from an MI standpoint: it blindly assumes
128-byte L2 cache lines, which is not always true, even on i386.
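Something along these lines is what I have in mind for the padding, just
to show the idea (a sketch, not a patch; the field list is abbreviated and
the in-tree struct has more members):

#include <sys/param.h>	/* CACHE_LINE_SIZE, __aligned() */

struct buf_ring_sketch {
	/* Producer-side fields, kept on their own cache line(s). */
	volatile uint32_t	br_prod_head;
	volatile uint32_t	br_prod_tail;
	int			br_prod_size;
	int			br_prod_mask;
	uint64_t		br_prod_bufs;
	/*
	 * A full line of padding keeps the consumer fields off the
	 * producer's line for any MD line size, provided the structure
	 * itself starts on a line boundary.
	 */
	char			_pad0[CACHE_LINE_SIZE];

	/* Consumer-side fields. */
	volatile uint32_t	br_cons_head;
	volatile uint32_t	br_cons_tail;
	int			br_cons_size;
	int			br_cons_mask;
	char			_pad1[CACHE_LINE_SIZE];

	void			*br_ring[0];
} __aligned(CACHE_LINE_SIZE);

Padding by a whole CACHE_LINE_SIZE is a bit more wasteful than padding
exactly to the next boundary, but it stays correct for whatever line size
the MD code defines, which is the point.

Attilio

--
Peace can only be achieved by understanding - A. Einstein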