Date:      Mon, 19 Oct 2009 09:47:58 -0600 (MDT)
From:      "M. Warner Losh" <imp@bsdimp.com>
To:        raj@semihalf.com
Cc:        gballet@gmail.com, tinguely@casselton.net, freebsd-arm@freebsd.org, stas@deglitch.com
Subject:   Re: Adding members to struct cpu_functions
Message-ID:  <20091019.094758.-2092524312.imp@bsdimp.com>
In-Reply-To: <05B19969-B238-4E3A-8326-624067F0362B@semihalf.com>
References:  <4AD39C78.5050309@freebsd.org> <4ADB38FA.2080604@freebsd.org> <05B19969-B238-4E3A-8326-624067F0362B@semihalf.com>

In message: <05B19969-B238-4E3A-8326-624067F0362B@semihalf.com>
            Rafal Jaworowski <raj@semihalf.com> writes:
: On 2009-10-18, at 17:49, Nathan Whitehorn wrote:
[[ trimmed ]]
: > I just did the measurements on a 1.8 GHz PowerPC G5. There were four  
: > tests, each repeated 1 million times. "Load and store" involves  
: > incrementing a volatile int from 0 to 1e6 inline. "Direct calls"  
: > involves a branch to a function that returns 0 and does nothing  
: > else. "Function ptr" calls the same function via a pointer stored in  
: > a register, and "KOBJ calls" calls it via KOBJ. Here are the results  
: > (errors are +/- 0.5 ns for the function call measurements due to  
: > compiler optimization jitter, and 0 for load and store, since that  
: > takes a deterministic number of clock cycles):
: >
: > 32-bit kernel:
: > Load and store:  26.1 ns
: > Direct calls:     7.2 ns
: > Function ptr:     8.4 ns
: > KOBJ calls:      17.8 ns
: >
: > 64-bit kernel:
: > Load and store:   9.2 ns
: > Direct calls:     6.1 ns
: > Function ptr:     8.3 ns
: > KOBJ calls:      40.5 ns
: >
: > ABI changes make a large difference, as you can see. The cost of  
: > calling via KOBJ is non-negligible, but small, especially compared  
: > to the cost of doing anything involving memory. I don't know how  
: > this changes with ARM calling conventions.
: 
: Very interesting, thanks! Could you elaborate on the testing details  
: and share the diagnostic code so we could repeat this with other CPU  
: variations like Book-E PowerPC, or ARM?

I'd love to see this on MIPS too...

KOBJ is a big win for device configuration, where a single memory I/O
can take 60 times as long as any of these calls...
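
For anyone wanting to repeat the comparison on another CPU, here is a
rough userland sketch of the "direct calls" versus "function ptr" cases
described above. It is not Nathan's test code, which has not been posted
in this thread, and it does not cover KOBJ dispatch, which needs a
kernel build. The names nop_func, bench_direct, bench_fnptr and
BENCH_MAIN are made up for illustration, and the noinline attribute and
empty asm statement assume GCC or Clang.

```c
/* Userland sketch: direct call vs. call through a function pointer.
 * Assumption: this is NOT the kernel test from the thread; all names
 * are hypothetical and KOBJ dispatch is not measured. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define NCALLS 1000000L		/* the thread used 1 million repetitions */

/* Callee returns 0 and does nothing else. noinline plus the empty asm
 * (GCC/Clang extensions) keep the compiler from removing the call. */
__attribute__((noinline)) int
nop_func(void)
{
	__asm__ volatile("");
	return (0);
}

/* Elapsed nanoseconds from *a to *b. */
static double
elapsed_ns(const struct timespec *a, const struct timespec *b)
{
	return ((b->tv_sec - a->tv_sec) * 1e9 + (b->tv_nsec - a->tv_nsec));
}

/* Average ns per direct call; loop overhead is included. */
double
bench_direct(long n)
{
	struct timespec t0, t1;
	long i;

	assert(n > 0);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < n; i++)
		(void)nop_func();
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (elapsed_ns(&t0, &t1) / n);
}

/* Average ns per indirect call. The pointer is volatile so the compiler
 * cannot turn this into a direct call; that forces a reload from memory
 * each iteration, unlike the register-held pointer in the thread. */
double
bench_fnptr(long n)
{
	int (*volatile fp)(void) = nop_func;
	struct timespec t0, t1;
	long i;

	assert(n > 0);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < n; i++)
		(void)fp();
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (elapsed_ns(&t0, &t1) / n);
}

#ifdef BENCH_MAIN
int
main(void)
{
	printf("Direct calls: %.1f ns\n", bench_direct(NCALLS));
	printf("Function ptr: %.1f ns\n", bench_fnptr(NCALLS));
	return (0);
}
#endif
```

Building with something like "cc -O2 -DBENCH_MAIN bench.c" should give a
standalone program. Userland timings include loop and timer overhead, so
treat the output as a relative comparison rather than something directly
comparable to the kernel figures quoted above.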

Warner


