Date:      Fri, 8 Jun 2012 19:16:07 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Dag-Erling Smørgrav <des@des.no>, freebsd-arch@freebsd.org
Subject:   Re: Fast vs slow syscalls (Re: Fwd: [RFC] Kernel shared variables)
Message-ID:  <20120608185204.T1708@besplex.bde.org>
In-Reply-To: <20120607100401.GW85127@deviant.kiev.zoral.com.ua>
References:  <CACfq090r1tWhuDkxdSZ24fwafbVKU0yduu1yV2%2BoYo%2BwwT4ipA@mail.gmail.com> <201206051008.29568.jhb@freebsd.org> <86haupvk4a.fsf@ds4.des.no> <201206051222.12627.jhb@freebsd.org> <20120605171446.GA28387@onelab2.iet.unipi.it> <20120606040931.F1050@besplex.bde.org> <864nqovoek.fsf@ds4.des.no> <20120607064951.C1106@besplex.bde.org> <86sje7sf31.fsf@ds4.des.no> <20120607100401.GW85127@deviant.kiev.zoral.com.ua>

On Thu, 7 Jun 2012, Konstantin Belousov wrote:

> On Thu, Jun 07, 2012 at 10:26:10AM +0200, Dag-Erling Smørgrav wrote:
>> Bruce Evans <brde@optusnet.com.au> writes:
>>> Now 2.44 nsec/call makes sense, but you really should add some volatiles
>>> here to ensure that getpid() is not optimized away.
>>
>> As you can see from the disassembly I provided, it isn't.
>>
>>> SO it loops OK, but we can't see what getpid() does.  It must not be
>>> doing much.
>>
>> Umm, yes, that's the whole point of this conversation.  Linux's getpid()
>> is not a syscall, but a library function that returns a constant from a
>> page shared by the kernel.

Of course, but we're down to nearly single-cycle times, so the difference
between the library function using 1 or 2 instructions to load the value
may be significant.

>>> 5.4104 nsec/call for gettimeofday() is impossible if there is any
>>> rdtsc() hardware call or much layering.
>>
>> It's gettimeofday(0, 0), actually, so it doesn't need to read the clock.
>> If I pass a struct timeval as the first argument - so it *does* need to
>> read the clock - it's a little bit slower but still faster than an
>> actual system call.  Here's another run that demonstrates this - a
>> little bit slower than previous runs because I have other processes
>> running:
>>
>> getpid(): 10,000,000 iterations in 30,377 us
>> gettimeofday(0, 0): 10,000,000 iterations in 55,571 us
>> gettimeofday(&tv, 0): 10,000,000 iterations in 302,634 us
> So this timing seems to be approximately same by the order of magnitude
> as the times I get for the patch, around 25 vs. 30ns/per gettimeofday()
> call.

Not great.  I get 6.97 nsec for a slightly reduced version of FreeBSD's
1998 version of microtime(), which was written in i386 asm.  (This depends
on rdtsc taking only 6.5 cycles = 3.25 nsec on the test CPU (Athlon64)).
From rev.1.40 of microtime.s:

% #include <machine/asm.h>
% 
% ENTRY(microtime)
% 	movl	tsc_freq, %ecx
% 	testl	%ecx, %ecx
% 	je	i8254_microtime

This branch is predicted perfectly but costs 0.24 nsec (0.5 cycles).

% 	rdtsc
% 	subl	tsc_bias, %eax
% 	mull	tsc_multiplier
% 	movl	%edx, %eax
% 	addl	timeoff+4, %eax	/* usec += time.tv_usec */
% 	movl	timeoff, %edx	/* sec = time.tv_sec */

Similar to binuptime().  To convert from the old microtime.s, I just
converted the variable names from aout to elf (and supplied dummy
variables), and removed the locking instructions (which were
pushfl/cli/popfl).
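
In C, the subl/mull/movl sequence above amounts to a 32x32->64 bit multiply
that keeps only the high word.  A rough sketch follows.  The function names
are invented here, and the formula used to build the multiplier
(2^32 * 1000000 / tsc_freq) is an assumption about how such a scale factor
would be set up, not something taken from microtime.s.

```c
#include <stdint.h>

/*
 * Assumed setup: a 0.32 fixed-point factor that maps TSC ticks to
 * microseconds.  Only meaningful for tsc_freq_hz above 1 MHz, otherwise
 * the result does not fit in 32 bits.
 */
static uint32_t
make_tsc_multiplier(uint64_t tsc_freq_hz)
{
    return ((uint32_t)((UINT64_C(1000000) << 32) / tsc_freq_hz));
}

/*
 * Mirror of the asm: subtract the bias from the low 32 bits of the TSC,
 * multiply by the scale factor, and keep the high 32 bits of the 64-bit
 * product (what mull leaves in %edx).
 */
static uint32_t
tsc_to_usec(uint32_t tsc_low, uint32_t tsc_bias, uint32_t tsc_multiplier)
{
    uint32_t delta = tsc_low - tsc_bias;    /* wraps modulo 2^32 */

    return ((uint32_t)(((uint64_t)delta * tsc_multiplier) >> 32));
}
```

Because both the multiplier and the product are truncated, the result can be
one microsecond short of the exact value.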

% 
% 	cmpl	$1000000, %eax	/* usec valid? */
% 	jb	1f
% 	subl	$1000000, %eax	/* adjust usec */
% 	incl	%edx		/* bump sec */

Probably faster with bintimes (can be branch-free then (?)), but by
converting directly to the final format we avoid a scaling step.  The
branch in it is predicted too perfectly by my dummy variables.
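
The cmpl/jb/subl/incl fixup, written out in C as a sketch.  The struct and
function names are made up for illustration.  Like the asm, it carries at
most once, so it is only correct when the incoming usec is below 2000000;
that precondition is an inference from the code, not something stated in
the source file.

```c
#include <stdint.h>

/* Hypothetical 32-bit timeval, matching the two stores in the asm. */
struct tv32 {
    uint32_t sec;
    uint32_t usec;
};

/* Carry one second out of usec if it has reached 1000000. */
static struct tv32
normalize_once(uint32_t sec, uint32_t usec)
{
    struct tv32 r;

    if (usec >= 1000000) {      /* usec valid? */
        usec -= 1000000;        /* adjust usec */
        sec++;                  /* bump sec */
    }
    r.sec = sec;
    r.usec = usec;
    return (r);
}
```

With dummy inputs that never cross the 1000000 boundary, the if is never
taken, which is the "predicted too perfectly" effect mentioned above.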

% 1:
% 	movl	4(%esp), %ecx	/* load timeval pointer arg */
% 	movl	%edx, (%ecx)	/* tvp->tv_sec = sec */
% 	movl	%eax, 4(%ecx)	/* tvp->tv_usec = usec */
% 
% 	ret
% 
% i8254_microtime:
% 	ret			/* XXX garbage */

>
> Linux seems slower probably due to slower CPU ? Mine is 3.4Ghz, while
> des used 3.1Ghz for Linux box.

If it is a different CPU model, the speed of rdtsc can vary a lot.

Bruce


