Date: Fri, 9 Nov 2012 04:26:03 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Attilio Rao <attilio@freebsd.org> Cc: src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>, Jeff Roberson <jeff@freebsd.org>, Florian Smeets <flo@freebsd.org>, Bruce Evans <bde@freebsd.org>, Bruce Evans <brde@optusnet.com.au>, svn-src-projects@freebsd.org Subject: Re: svn commit: r238907 - projects/calloutng/sys/kern Message-ID: <20121109034942.C5338@besplex.bde.org> In-Reply-To: <CAJ-FndDpB2vQ7nBgYS4mNHb3_G4sXi3SfU7Y7kdVHMQZYKapig@mail.gmail.com> References: <201207301350.q6UDobCI099069@svn.freebsd.org> <CAJ-FndBj8tpC_BJXs_RH8sG2TBG8yA=Lxu3-GTVT9Ap_zOCuVQ@mail.gmail.com> <CAJ-FndDnO7wjnWPV0tTu%2BUGHjsxa3YDarMxmyei3ZmjLAFvRkQ@mail.gmail.com> <201207301732.33474.jhb@freebsd.org> <CAJ-FndD5EO12xsWOAe6u0EvX00q33wxO4OivnGjzj0=T2Oe8uA@mail.gmail.com> <CAJ-FndCRg0UCThFkatp=tw7rUWWCvhsApLE=iztLpxpGBC1F9w@mail.gmail.com> <CAJ-FndBqV2uD8Th9ePtxyJwhMAPzY3AXA5cQ7HszLp=%2BfSpHTA@mail.gmail.com> <CAJ-FndDPLmkpAJeGVN2wgbhdgHYezyUV-PPvH9e-CA7Go7HG3A@mail.gmail.com> <20121029155136.O943@besplex.bde.org> <CAJ-FndDpB2vQ7nBgYS4mNHb3_G4sXi3SfU7Y7kdVHMQZYKapig@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 2 Nov 2012, Attilio Rao wrote: > On 10/29/12, Bruce Evans <brde@optusnet.com.au> wrote: >> On Mon, 29 Oct 2012, Attilio Rao wrote: >> >>> Now that sched_pin()/sched_unpin() are fixed I would like to introduce >>> this new patch, making critical_enter()/critical_exit() inline: >>> http://www.freebsd.org/~attilio/inline_critical.patch >> ... >> My version goes the other way and uninlines mtx_lock_spin() and >> mtx_unlock_spin(). Then it inlines (open codes) critical_enter() and >> ... > > So, I thought more about this and I think that inlining > critical_exit() is not really going to bring any benefit here but > bloat. > This is because nested critical sections are rare rather not, which Rather rare !not? :-) > means you will end up in doing the function call most of the time and > plus we have the possible pessimization introduced by the memory > clobber (__compiler_membar()) and as you note possible deficiency > caming from the branch misprediction*. This seems best. I see a point about the rareness of the branch in critical_exit(). Not sure if it is the one you are making: since the nested case is rare, then the branch will normally be correctly predicted. If the function remains uninlined, then the branch still has a good chance of being correctly predicted. This depends on the nested case being so rare across all callers, that the non-nested case doesn't mess up the prediction by happening often. The branch predictor only has to maintain history for 1 branch for this. However, if the call is inlined and there are many callers, there might be too many to maintain history for them all. > I feel very unsure style-wise about my add to sys/systm.h because it > is already very messy that I cannot give a logical sense to that > chunk. It is less bad than most places. Some of the inlines in libkern.h are misplaced. However, the bitcount inlines in systm.h belong closer to libkern (or /dev/null). The split between systm.h and kernel.h is bogus. But systemy things like critical* don't belong near libkern. > * In theory we could use __predict_false() or similar but I don't > think this will really help as x86 doesn't have explicit instructions > to control branch prediction It can help as follows on x86: - modern x86 has a default for when the branch predictor is cold. The compiler should arrange the code so that the default matches the usual case. Compilers can either follow the code order or guess which branch is usual (-fpredict-branch-probability IIRC). The branches in critical_exit() and x86 spinlock_enter() seem to have a good order for the first, but might be mispredicted by the compiler. The branch in spinlock_exit() seems to be in a bad order for the first. This doesn't matter if the CPU's branch predictor stays warm. - compilers can help the CPU's branch predictor and instruction prefetcher by moving the unusual case far away. This helps mainly for large code. The code in critical_exit() is not large, except possibly if it is inlined, But __predict_false() never did much for me, perhaps because I test mainly non-large code in loops where branch predictors stay warm. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20121109034942.C5338>