Date:      Wed, 6 Jun 2012 04:36:54 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Luigi Rizzo <rizzo@iet.unipi.it>
Cc:        Gianni <gianni@FreeBSD.org>, John Baldwin <jhb@FreeBSD.org>, Alan Cox <alc@rice.edu>, Alexander Kabaev <kan@FreeBSD.org>, Attilio Rao <attilio@FreeBSD.org>, Konstantin Belousov <kib@FreeBSD.org>, freebsd-arch@FreeBSD.org, Konstantin Belousov <kostikbel@gmail.com>, Dag-Erling Smørgrav <des@des.no>
Subject:   Re: Fast vs slow syscalls (Re: Fwd: [RFC] Kernel shared variables)
Message-ID:  <20120606040931.F1050@besplex.bde.org>
In-Reply-To: <20120605171446.GA28387@onelab2.iet.unipi.it>
References:  <CACfq090r1tWhuDkxdSZ24fwafbVKU0yduu1yV2%2BoYo%2BwwT4ipA@mail.gmail.com> <201206051008.29568.jhb@freebsd.org> <86haupvk4a.fsf@ds4.des.no> <201206051222.12627.jhb@freebsd.org> <20120605171446.GA28387@onelab2.iet.unipi.it>

On Tue, 5 Jun 2012, Luigi Rizzo wrote:

> On Tue, Jun 05, 2012 at 12:22:12PM -0400, John Baldwin wrote:
>> On Tuesday, June 05, 2012 11:44:37 am Dag-Erling Smørgrav wrote:
>>> John Baldwin <jhb@freebsd.org> writes:
>>>> So you call getpid() on each access to a shared resource?
>>>
>>> I don't, but I've seen code that does, under the assumption that all the
>>> world is Linux and getpid() is free.  Here's a sample from RHEL6 on a
>>> 3.1 GHz i5, using raise(0) as a baseline:
>>>
>>> getpid(): 10,000,000 iterations in 24,400 ms
>>> gettimeofday(0, 0): 10,000,000 iterations in 54,104 ms
>>> raise(0): 10,000,000 iterations in 1,284,593 ms

That's one slow system or broken units.  24.4 seconds for 10 million
"syscalls" in the fastest case?  If the comma is really a decimal
point, then 24.4 milliseconds makes sense, but then the number of
iterations would be only 10, with the second comma being a syntax
error.  If ms actually means microseconds, then someone should fix
ping(1) to stop pretending that it is 1000 times as fast as it is.

After adjusting by factors of 1000 here and there, this format is still
hard to parse.  I like the format of nsec/operation.  10 million
operations in 24400 moroseconds seems to scale to 2.44 nsec/call (if 1
moro = 1 micro).  But that is impossibly fast, unless getpid() is
inlined to a load of the shared variable (it may also need the load to
be moved outside the loop).  I can't see any reasonable adjustment that
gives 24.4 nsec/call.
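
The unit adjustment above can be checked with a few lines of arithmetic.
The following is a minimal Python sketch and not part of the original
benchmark: the function name ns_per_call and the two candidate readings
of "ms" are illustrative assumptions, and the inputs are the getpid()
figures quoted above.

```python
# Nanoseconds per unit for each candidate reading of the "ms" label.
# Both readings are assumptions made for illustration.
NS_PER_UNIT = {"milliseconds": 1_000_000, "microseconds": 1_000}


def ns_per_call(elapsed, unit, iterations):
    """Convert a total elapsed time into nanoseconds per call."""
    return elapsed * NS_PER_UNIT[unit] / iterations


ITERATIONS = 10_000_000  # iteration count from the quoted output
ELAPSED = 24_400         # getpid() figure from the quoted output

for unit in NS_PER_UNIT:
    print(f"{unit}: {ns_per_call(ELAPSED, unit, ITERATIONS):.2f} nsec/call")
```

Under these assumptions the two readings come out at 2440 and 2.44
nsec/call, so neither reproduces 24.4 nsec/call.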

>>> The difference between the first two is due to the fact that while
>>> getpid() just returns a constant, gettimeofday(0, 0) performs two
>>> comparisons first.  Passing an actual struct timeval to gettimeofday()
>>> slows it down by a factor of about 6.
>>>
>>> (strace confirms that no system calls occur for either getpid() or
>>> gettimeofday(0, 0))
>>>
>>> Here is the same program running on FreeBSD 9.0-RELEASE in VirtualBox on
>>> an otherwise idle 3.4 GHz i7:
>>>
>>> getpid(): 10,000,000 iterations in 777,251 ms
>>> gettimeofday(0, 0): 10,000,000 iterations in 799,808 ms
>>> raise(0): 10,000,000 iterations in 2,142,275 ms

2142.275 seconds is really slow.

>> Yes, we know getpid() is slow, I think the question is does it matter that
>> it's slow in something other than a microbenchmark.  Can you name the
>> application that you've seen use getpid()?
>
> i think the important question is, for any function X:
>    Q1	"does it require horrible hacks or a huge amount of work
> 	to make X syscall-free ?"
> rather than
>    Q2	"does it matter to make X fast"

s/huge amount/any/

Work is all the programming work to implement it and maintain it forever.

> If the answer to Q1 is "no" then there is no question
> we should try to implement it.

The answer is sure to be "no", but you should try to implement it to
see if it is easier or works better than expected.

Bruce


