Date: Tue, 13 Jul 1999 13:37:15 +1000
From: Peter Jeremy <jeremyp@gsmx07.alcatel.com.au>
To: mike@smith.net.au
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")
Message-ID: <99Jul13.134051est.40360@border.alcanet.com.au>
In-Reply-To: <199907130209.TAA03301@dingo.cdrom.com>

Mike Smith <mike@smith.net.au> wrote:
>> Although function calls are more expensive than inline code,
>> they aren't necessarily a lot more so, and function calls to
>> non-locked RMW operations are certainly much cheaper than
>> inline locked RMW operations.
>
>This is a fairly key statement in context, and an opinion here would
>count for a lot; are function calls likely to become more or less
>expensive in time?

Based on general computer architecture principles, I'd say that a lock
prefix is likely to become more expensive[1], whilst a function call
will become cheaper[2] over time.

I'm not sure that this is an important issue here.  The sole advantage
of moving to indirect function calls would be that the same object
code could be used on both UP and SMP configurations, without
incurring the overhead of the lock prefix in the UP configuration.
(At the expense of an additional function call in all configurations).
We can't avoid the lock prefix overhead in the SMP case.

Based on the timings I did this morning, function calls are
(unacceptably, IMHO) expensive on all the CPU's I have to hand (i386,
Pentium and P-II) - the latter two presumably comprising the bulk of
current FreeBSD use.

Currently the UP/SMP decision is made at compile time (and has
significant and widespread impact) - therefore there seems little (if
any) benefit in using function calls within the main kernel.  I
believe that Matt's patched i386/include/atomic.h, with the addition
of code to only include the lock prefix when SMP is defined, is
currently the optimal approach for the kernel - and I can't see any
way a future IA-32 implementation could change that.

The only benefit could be for kernel modules - a module could possibly
be compiled so the same LKM would run on either UP or SMP.
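
As an illustration of "only include the lock prefix when SMP is
defined", here is a minimal sketch rather than the actual patch.  The
macro name MPLOCKED, the function name atomic_add_int and the
constraint choices are assumptions made for this example; on
non-x86 compilers it falls back to a GCC/Clang builtin so that it
still builds.

```c
/*
 * Sketch only: select the lock prefix at compile time.
 * MPLOCKED and atomic_add_int are illustrative names, not taken
 * from the patched i386/include/atomic.h discussed in this thread.
 */
#ifdef SMP
#define MPLOCKED "lock ; "	/* SMP build: bus-locked RMW */
#else
#define MPLOCKED ""		/* UP build: plain RMW, no lock overhead */
#endif

#if defined(__i386__) || defined(__x86_64__)
static inline void
atomic_add_int(volatile unsigned int *p, unsigned int v)
{
	/* A single add-to-memory instruction, locked only under SMP. */
	__asm__ __volatile__(MPLOCKED "addl %1,%0"
	    : "+m" (*p)
	    : "ir" (v)
	    : "cc");
}
#else
/* Assumption: non-x86 fallback so the sketch compiles elsewhere. */
static inline void
atomic_add_int(volatile unsigned int *p, unsigned int v)
{
	__atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}
#endif
```

Compiling with -DSMP emits the locked form; without it the same
source emits the cheaper non-locked instruction, and no function call
is involved in either case.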

Note that function calls for atomic operations may not be sufficient
(by themselves) to achieve this: One of the SMP gurus may be able to
confirm whether anything else prevents an SMP-compiled LKM running
with a UP kernel.

If the lock prefix overhead becomes an issue for LKMs, then we could
define a variant of i386/include/atomic.h (eg by using a #define which
is only true for compiling LKMs) which does use indirect function
calls (and add the appropriate initialisation code).  This is a
trivial exercise (which I'll demonstrate on request).

[1] A locked instruction implies a synchronous RMW cycle.  In order
    to meet write-ordering guarantees (without which, a locked RMW
    cycle would be useless as a semaphore primitive), it implies a
    complete write serialization, and probably some level of
    instruction serialisation.  Since write-back pipelines will get
    longer and parallel execution units more numerous, the cost of
    a serialisation operation will get relatively higher.  Also,
    lock instructions are relatively infrequent, therefore there is
    little incentive to expend valuable silicon on trying to make
    them more efficient (at least as seen by the executing CPU).

[2] Function calls _are_ fairly common, therefore it probably is
    worthwhile expending some effort in optimising them - and the
    stack updates associated with a leaf subroutine are fairly
    easy to totally hide in an on-chip write pipeline/cache.

Peter

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
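
The indirect-call variant mentioned above for LKMs might look roughly
like the following.  This is a hedged sketch, not code from the
thread: every name here (lkm_atomic_add_int, atomic_add_int_p,
atomic_ops_init) is hypothetical, the UP routine uses a plain C add as
a stand-in for the non-locked instruction, and the SMP routine uses a
GCC/Clang builtin in place of the locked inline assembler.

```c
#include <stdbool.h>

typedef void (*atomic_add_fn)(volatile unsigned int *, unsigned int);

/* UP stand-in: in a real kernel this would be one non-locked RMW insn. */
static void
atomic_add_up(volatile unsigned int *p, unsigned int v)
{
	*p += v;
}

/* SMP stand-in: the builtin takes the place of a locked RMW insn. */
static void
atomic_add_smp(volatile unsigned int *p, unsigned int v)
{
	__atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}

/* Default to the locked version: always safe, just slower on UP. */
static atomic_add_fn atomic_add_int_p = atomic_add_smp;

/* Hypothetical initialisation hook, run once before any module code. */
static void
atomic_ops_init(bool smp)
{
	atomic_add_int_p = smp ? atomic_add_smp : atomic_add_up;
}

/* What a module would see: every atomic add costs one indirect call. */
#define lkm_atomic_add_int(p, v)	(atomic_add_int_p((p), (v)))
```

A module built against such a header would call
lkm_atomic_add_int(&counter, 1) and pick up whichever routine the
running kernel installed, at the price of the extra function call
discussed in the message.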