Date: Thu, 27 Dec 2012 05:53:22 -0800 From: Attilio Rao <attilio@freebsd.org> To: Gleb Smirnoff <glebius@freebsd.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r244732 - head/sys/sys Message-ID: <CAJ-FndC6Xq4EWcU203E4ucgr=jzOAutBBBkn%2BO1Qs0nL0i_Q3A@mail.gmail.com> In-Reply-To: <20121227132507.GY80310@FreeBSD.org> References: <201212271236.qBRCawuU078203@svn.freebsd.org> <20121227124657.GX80310@FreeBSD.org> <CAJ-FndD9aDfPprwBYC%2B3T1WsfE1b4aZJENRAjo%2BhFEL1NLBKmw@mail.gmail.com> <20121227132507.GY80310@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, Dec 27, 2012 at 5:25 AM, Gleb Smirnoff <glebius@freebsd.org> wrote: > On Thu, Dec 27, 2012 at 04:55:22AM -0800, Attilio Rao wrote: > A> On Thu, Dec 27, 2012 at 4:46 AM, Gleb Smirnoff <glebius@freebsd.org> wrote: > A> > Attilio, > A> > > A> > On Thu, Dec 27, 2012 at 12:36:58PM +0000, Attilio Rao wrote: > A> > A> Author: attilio > A> > A> Date: Thu Dec 27 12:36:58 2012 > A> > A> New Revision: 244732 > A> > A> URL: http://svnweb.freebsd.org/changeset/base/244732 > A> > A> > A> > A> Log: > A> > A> br_prod_tail and br_cons_tail members are used as barrier to > A> > A> signal bug_ring ownership. However, instructions can be reordered > A> > A> around members write leading to stale values for ie. br_prod_bufs. > A> > A> > A> > A> Use correct memory barriers to ensure proper ordering of the > A> > A> ownership tokens updates. > A> > A> > A> > A> Sponsored by: EMC / Isilon storage division > A> > A> MFC after: 2 weeks > A> > > A> > Have you profiled this? > A> > > A> > After this change the buf_ring actually gains its own hand-rolled > A> > mutex: > A> > > A> > while (atomic_load_acq_32(&br->br_prod_tail) != prod_head) > A> > cpu_spinwait(); > A> > > A> > The only difference with mutex(9) is that this one isn't monitored > A> > by WITNESS. > A> > A> I think you are not correct. It doesn't disable interrupts (as > A> spinlock do) and it doesn't sleep. > A> So your analogy is completely off. > A> > A> Also, on x86 atomic_store_rel_*() is a simple write. The only thing > A> that really changes is the atomic_load_acq_*() that introduces a > A> locked instruction. > > This only thing, the locked instruction, affects performance a lot. I > suspect strong forwarding degradation after your change. Can you please > profile that? Yes but it is a matter of incorrect code vs. slower instruction. Also, you are not considering that there are much heavier-weight instructions already (wmb(), rmb(), which I'm going to change soon into actual barriers btw). I highly doubt you can measure the latency introduced by atomic_load_acq_*() when mfence and stuff is in place. The pessimization should only account for a small fraction of the overall performance. > A> > The idea behind buf_ring was lockless storing and lockless fetching > A> > from a ring and now this vanished. > A> > > A> > What does your change actually fixes except precise accounting of > A> > br_prod_bufs that are actually unused and should be better garbage > A> > collected rather than fixed? > A> > A> The write of br_prod_tail must happens as very last thing, also after > A> the whole buf setup. The only way you can enforce this is with a > A> memory barrier. I can double-check if we can garbage collect > A> br_prod_bufs but this should not be enough yet. > > Do you have a core file that illustrates that a ring can get into > inconsistent state? I don't I got it by code inspection. The br_prod_tail update must happen as very last thing because it means the buf is "ready-to-go" and it will be owned. However, the prior wmb() may be helpful in this case, at least for one case. I will do a follow up soon. For a longer discussion, I plan to move this into a real atomic_store_rel_*() soon. Attilio -- Peace can only be achieved by understanding - A. Einstein
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-FndC6Xq4EWcU203E4ucgr=jzOAutBBBkn%2BO1Qs0nL0i_Q3A>