Date: Tue, 3 Apr 2001 19:07:29 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
To: Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
Cc: Alfred Perlstein <alfred@FreeBSD.ORG>, cvs-committers@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject: Re: cvs commit: src/sys/sys mbuf.h src/sys/kern uipc_mbuf.c
Message-ID: <200104040207.f3427Tw80262@earth.backplane.com>
References: <200104030315.f333FCX69312@freefall.freebsd.org> <20010403140457.B2952@electricjellyfish.net> <200104031813.f33ID4b58965@earth.backplane.com> <20010403194004.A15434@technokratis.com> <200104040020.f340Kgi74269@earth.backplane.com> <20010403173529.O12164@fw.wintelcom.net> <200104040106.VAA26103@khavrinen.lcs.mit.edu>
:<<On Tue, 3 Apr 2001 17:35:29 -0700, Alfred Perlstein <alfred@FreeBSD.org> said:
:
:> While this is a good idea, it doesn't give us a consistent view of
:> the stats without additional atomic ops or critical regions.
:
:Atomic operations are likely to be cheaper on modern platforms than
:locking.  In any case, you can simply keep per-CPU stats and then
:summarize when they are requested, which is even cheaper, since the
:stats are updated FAR more frequently than they are inspected.
:
:-GAWollman

A per-cpu variable that is only manipulated by that cpu does not even
need to be bus-locked (i.e. not even a 'lock' prefix is required for
i386).  For a counter, a simple 'incl' or 'addl' type of instruction
is sufficient.

In order of expense, for an i386:

VERY FAST        normal (per-cpu) read-modify-write instruction,
                 no cache contention.  (incl, addl, etc...)

SLOW             bus-locked instruction, no cache contention.
                 (lock; addl ...)

EXTREMELY SLOW   bus-locked instruction, cache contention.
                 (lock; addl ...)

In any case, people should keep in mind that the whole point of using
a per-cpu variable in this case is to avoid *ALL* locking
requirements... avoid the mutexes, AND avoid any bus locking.  The
moment you do either you might as well throw in the towel and not
bother.

But if you do it right, then whatever 'slow' cases remain (e.g. no
mbufs on the per-cpu free list) can be implemented with simple global
mutexes and no special optimizations.  In fact, any serious effort
towards optimizing the slow case when you have a fast case fronting it
is nothing but a waste of time.  Don't add complex optimizations where
they aren't needed.

                                        -Matt
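The scheme described above can be sketched roughly as follows. This is a minimal userspace illustration in C, not the actual mbuf allocator code: the names (`struct pcpu`, `buf_alloc`, `buf_free`, `global_put`, `total_allocs`), the `NCPU` and `CACHE_LINE` values, and the explicit `cpu` argument are all hypothetical. It assumes the caller is pinned to the CPU whose index it passes (a kernel would read the current CPU and keep the thread from migrating), and it uses a pthread mutex as a stand-in for a simple global mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NCPU       4    /* hypothetical CPU count for this sketch */
#define CACHE_LINE 64   /* assumed cache-line size */

struct buf {
    struct buf *next;
};

/*
 * One slot per CPU.  The C11 alignment specifier pads each slot to a
 * full cache line so that two CPUs never contend for the same line.
 * Every field is written only by the owning CPU.
 */
struct pcpu {
    _Alignas(CACHE_LINE) struct buf *freelist;
    uint64_t allocs;        /* all successful allocations */
    uint64_t slow_allocs;   /* allocations that needed the mutex */
};

static struct pcpu pcpu[NCPU];

/* Shared pool: the slow path, guarded by one plain global mutex. */
static pthread_mutex_t global_mtx = PTHREAD_MUTEX_INITIALIZER;
static struct buf *global_free;

/* Hand a buffer to the shared pool (slow path, takes the mutex). */
static void global_put(struct buf *b)
{
    pthread_mutex_lock(&global_mtx);
    b->next = global_free;
    global_free = b;
    pthread_mutex_unlock(&global_mtx);
}

/* Return a buffer to the calling CPU's private list: no lock at all. */
static void buf_free(int cpu, struct buf *b)
{
    assert(cpu >= 0 && cpu < NCPU);
    b->next = pcpu[cpu].freelist;
    pcpu[cpu].freelist = b;
}

/* Allocate a buffer, preferring the calling CPU's private list. */
static struct buf *buf_alloc(int cpu)
{
    assert(cpu >= 0 && cpu < NCPU);
    struct pcpu *pc = &pcpu[cpu];
    struct buf *b = pc->freelist;

    if (b != NULL) {
        /* Fast path: plain loads and stores, no mutex, no bus lock. */
        pc->freelist = b->next;
    } else {
        /* Slow path: per-CPU list is empty, fall back to the pool. */
        pthread_mutex_lock(&global_mtx);
        b = global_free;
        if (b != NULL)
            global_free = b->next;
        pthread_mutex_unlock(&global_mtx);
        if (b == NULL)
            return NULL;
        pc->slow_allocs++;
    }
    pc->allocs++;   /* plain increment; the compiler may emit incl/addl */
    return b;
}

/*
 * Summarize on request.  With other CPUs still running this is only an
 * approximate total, not a consistent snapshot of all counters.
 */
static uint64_t total_allocs(void)
{
    uint64_t sum = 0;
    for (int i = 0; i < NCPU; i++)
        sum += pcpu[i].allocs;
    return sum;
}
```

In this sketch the counters are bumped on every allocation and only summed when `total_allocs()` is called, which matches the observation that stats are updated far more often than they are inspected. The slow path is deliberately left unoptimized: it is only reached when the per-CPU list is empty.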