FreeBSD Mail Archives

Date:      Thu, 27 Dec 2012 08:37:24 -0800
From:      Attilio Rao <attilio@freebsd.org>
To:        Gleb Smirnoff <glebius@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r244732 - head/sys/sys
Message-ID:  <CAJ-FndDeE15ygFKm1=euyvE8PW=HyUiK7WUcfgpX04Nj4GuExA@mail.gmail.com>
In-Reply-To: <CAJ-FndAzqHCNtLGFH=6Cm3rshMestSZz_naF1=saMEKuX9cyog@mail.gmail.com>
References:  <201212271236.qBRCawuU078203@svn.freebsd.org> <20121227124657.GX80310@FreeBSD.org> <CAJ-FndD9aDfPprwBYC%2B3T1WsfE1b4aZJENRAjo%2BhFEL1NLBKmw@mail.gmail.com> <20121227132507.GY80310@FreeBSD.org> <CAJ-FndC6Xq4EWcU203E4ucgr=jzOAutBBBkn%2BO1Qs0nL0i_Q3A@mail.gmail.com> <CAJ-FndAzqHCNtLGFH=6Cm3rshMestSZz_naF1=saMEKuX9cyog@mail.gmail.com>

On Thu, Dec 27, 2012 at 7:26 AM, Attilio Rao <attilio@freebsd.org> wrote:
> On Thu, Dec 27, 2012 at 5:53 AM, Attilio Rao <attilio@freebsd.org> wrote:
>> On Thu, Dec 27, 2012 at 5:25 AM, Gleb Smirnoff <glebius@freebsd.org> wrote:
>>> On Thu, Dec 27, 2012 at 04:55:22AM -0800, Attilio Rao wrote:
>>> A> On Thu, Dec 27, 2012 at 4:46 AM, Gleb Smirnoff <glebius@freebsd.org> wrote:
>>> A> >   Attilio,
>>> A> >
>>> A> > On Thu, Dec 27, 2012 at 12:36:58PM +0000, Attilio Rao wrote:
>>> A> > A> Author: attilio
>>> A> > A> Date: Thu Dec 27 12:36:58 2012
>>> A> > A> New Revision: 244732
>>> A> > A> URL: http://svnweb.freebsd.org/changeset/base/244732
>>> A> > A>
>>> A> > A> Log:
>>> A> > A>   br_prod_tail and br_cons_tail members are used as barrier to
>>> A> > A>   signal bug_ring ownership. However, instructions can be reordered
>>> A> > A>   around members write leading to stale values for ie. br_prod_bufs.
>>> A> > A>
>>> A> > A>   Use correct memory barriers to ensure proper ordering of the
>>> A> > A>   ownership tokens updates.
>>> A> > A>
>>> A> > A>   Sponsored by:      EMC / Isilon storage division
>>> A> > A>   MFC after: 2 weeks
>>> A> >
>>> A> > Have you profiled this?
>>> A> >
>>> A> > After this change the buf_ring actually gains its own hand-rolled
>>> A> > mutex:
>>> A> >
>>> A> >         while (atomic_load_acq_32(&br->br_prod_tail) != prod_head)
>>> A> >                 cpu_spinwait();
>>> A> >
>>> A> > The only difference with mutex(9) is that this one isn't monitored
>>> A> > by WITNESS.
>>> A>
>>> A> I think you are not correct. It doesn't disable interrupts (as
>>> A> spinlock do) and it doesn't sleep.
>>> A> So your analogy is completely off.
>>> A>
>>> A> Also, on x86 atomic_store_rel_*() is a simple write. The only thing
>>> A> that really changes is the atomic_load_acq_*() that introduces a
>>> A> locked instruction.
>>>
>>> This only thing, the locked instruction, affects performance a lot. I
>>> suspect strong forwarding degradation after your change. Can you please
>>> profile that?
>>
>> Yes but it is a matter of incorrect code vs. slower instruction.
>> Also, you are not considering that there are much heavier-weight
>> instructions already (wmb(), rmb(), which I'm going to change soon
>> into actual barriers btw). I highly doubt you can measure the latency
>> introduced by atomic_load_acq_*() when mfence and stuff is in place.
>> The pessimization should only account for a small fraction of the
>> overall performance.
>>
>>> A> > The idea behind buf_ring was lockless storing and lockless fetching
>>> A> > from a ring and now this vanished.
>>> A> >
>>> A> > What does your change actually fixes except precise accounting of
>>> A> > br_prod_bufs that are actually unused and should be better garbage
>>> A> > collected rather than fixed?
>>> A>
>>> A> The write of br_prod_tail must happens as very last thing, also after
>>> A> the whole buf setup. The only way you can enforce this is with a
>>> A> memory barrier. I can double-check if we can garbage collect
>>> A> br_prod_bufs but this should not be enough yet.
>>>
>>> Do you have a core file that illustrates that a ring can get into
>>> inconsistent state?
>>
>> I don't I got it by code inspection. The br_prod_tail update must
>> happen as very last thing because it means the buf is "ready-to-go"
>> and it will be owned.
>>
>> However, the prior wmb() may be helpful in this case, at least for one
>> case. I will do a follow up soon. For a longer discussion, I plan to
>> move this into a real atomic_store_rel_*() soon.
>
> Speaking of which, as you are here, I just found out that r241037
> breaks the alignment of the structure.
> Infact the padding member is not updated accordingly.
>
> We don't have a param to control L2 caches, but I think that we can
> safely align them to the L1 cacheline for sure.
> Also, note that this padding is completely broken for MI requirements
> (it just assumes blindly 128 bytes L2 cachelines, which not always
> true even on i386).

More specifically this patch:
http://www.freebsd.org/~attilio/bufring_pad.patch

Of course I don't think the optimization is important in the
DEBUG_BUFRING on case, so the patch should be fine.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-FndDeE15ygFKm1=euyvE8PW=HyUiK7WUcfgpX04Nj4GuExA>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation