Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 9 Nov 2012 04:26:03 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>, Jeff Roberson <jeff@freebsd.org>, Florian Smeets <flo@freebsd.org>, Bruce Evans <bde@freebsd.org>, Bruce Evans <brde@optusnet.com.au>, svn-src-projects@freebsd.org
Subject:   Re: svn commit: r238907 - projects/calloutng/sys/kern
Message-ID:  <20121109034942.C5338@besplex.bde.org>
In-Reply-To: <CAJ-FndDpB2vQ7nBgYS4mNHb3_G4sXi3SfU7Y7kdVHMQZYKapig@mail.gmail.com>
References:  <201207301350.q6UDobCI099069@svn.freebsd.org> <CAJ-FndBj8tpC_BJXs_RH8sG2TBG8yA=Lxu3-GTVT9Ap_zOCuVQ@mail.gmail.com> <CAJ-FndDnO7wjnWPV0tTu%2BUGHjsxa3YDarMxmyei3ZmjLAFvRkQ@mail.gmail.com> <201207301732.33474.jhb@freebsd.org> <CAJ-FndD5EO12xsWOAe6u0EvX00q33wxO4OivnGjzj0=T2Oe8uA@mail.gmail.com> <CAJ-FndCRg0UCThFkatp=tw7rUWWCvhsApLE=iztLpxpGBC1F9w@mail.gmail.com> <CAJ-FndBqV2uD8Th9ePtxyJwhMAPzY3AXA5cQ7HszLp=%2BfSpHTA@mail.gmail.com> <CAJ-FndDPLmkpAJeGVN2wgbhdgHYezyUV-PPvH9e-CA7Go7HG3A@mail.gmail.com> <20121029155136.O943@besplex.bde.org> <CAJ-FndDpB2vQ7nBgYS4mNHb3_G4sXi3SfU7Y7kdVHMQZYKapig@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 2 Nov 2012, Attilio Rao wrote:

> On 10/29/12, Bruce Evans <brde@optusnet.com.au> wrote:
>> On Mon, 29 Oct 2012, Attilio Rao wrote:
>>
>>> Now that sched_pin()/sched_unpin() are fixed I would like to introduce
>>> this new patch, making critical_enter()/critical_exit() inline:
>>> http://www.freebsd.org/~attilio/inline_critical.patch
>> ...
>> My version goes the other way and uninlines mtx_lock_spin() and
>> mtx_unlock_spin().  Then it inlines (open codes) critical_enter() and
>> ...
>
> So, I thought more about this and I think that inlining
> critical_exit() is not really going to bring any benefit here but
> bloat.
> This is because nested critical sections are rare rather not, which

Rather rare !not? :-)

> means you will end up in doing the function call most of the time and
> plus we have the possible pessimization introduced by the memory
> clobber (__compiler_membar()) and as you note possible deficiency
> caming from the branch misprediction*.

This seems best.

I see a point about the rareness of the branch in critical_exit().
Not sure if it is the one you are making: since the nested case is
rare, then the branch will normally be correctly predicted.  If the
function remains uninlined, then the branch still has a good chance
of being correctly predicted.  This depends on the nested case being
so rare across all callers, that the non-nested case doesn't mess
up the prediction by happening often.  The branch predictor only
has to maintain history for 1 branch for this.  However, if the
call is inlined and there are many callers, there might be too many
to maintain history for them all.

> I feel very unsure style-wise about my add to sys/systm.h because it
> is already very messy that I cannot give a logical sense to that
> chunk.

It is less bad than most places.  Some of the inlines in libkern.h
are misplaced.  However, the bitcount inlines in systm.h belong
closer to libkern (or /dev/null).  The split between systm.h and
kernel.h is bogus.  But systemy things like critical* don't
belong near libkern.

> * In theory we could use __predict_false() or similar but I don't
> think this will really help as x86 doesn't have explicit instructions
> to control branch prediction

It can help as follows on x86:
- modern x86 has a default for when the branch predictor is cold.  The
   compiler should arrange the code so that the default matches the
   usual case.  Compilers can either follow the code order or guess
   which branch is usual (-fpredict-branch-probability IIRC).  The
   branches in critical_exit() and x86 spinlock_enter() seem to have
   a good order for the first, but might be mispredicted by the compiler.
   The branch in spinlock_exit() seems to be in a bad order for the first.
   This doesn't matter if the CPU's branch predictor stays warm.
- compilers can help the CPU's branch predictor and instruction prefetcher
   by moving the unusual case far away.  This helps mainly for large code.
   The code in critical_exit() is not large, except possibly if it is
   inlined,

But __predict_false() never did much for me, perhaps because I test mainly
non-large code in loops where branch predictors stay warm.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20121109034942.C5338>