Date:      Thu, 7 Jun 2012 15:47:04 -0700
From:      Peter Wemm <peter@wemm.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-arch@freebsd.org
Subject:   Re: Fast gettimeofday(2) and clock_gettime(2)
Message-ID:  <CAGE5yCrk8E5DikNNVQzEZ7bkj98nxQi%2BaWsLsi6d4jc8vLg2PA@mail.gmail.com>
In-Reply-To: <20120607172839.GZ85127@deviant.kiev.zoral.com.ua>
References:  <20120606165115.GQ85127@deviant.kiev.zoral.com.ua> <201206061423.53179.jhb@freebsd.org> <20120606205938.GS85127@deviant.kiev.zoral.com.ua> <201206070850.55751.jhb@freebsd.org> <20120607172839.GZ85127@deviant.kiev.zoral.com.ua>

On Thu, Jun 7, 2012 at 10:28 AM, Konstantin Belousov
<kostikbel@gmail.com> wrote:
> On Thu, Jun 07, 2012 at 08:50:55AM -0400, John Baldwin wrote:
>> On Wednesday, June 06, 2012 4:59:38 pm Konstantin Belousov wrote:
>> > On Wed, Jun 06, 2012 at 02:23:53PM -0400, John Baldwin wrote:
>> > > On Wednesday, June 06, 2012 12:51:15 pm Konstantin Belousov wrote:
>> > > > A positive result from the recent flame-bait on arch@ is the working
>> > > > implementation of the fast gettimeofday(2) and clock_gettime(2). The
>> > > > speedup I see is around 6-7x on the 2600K. I think the speedup could
>> > > > be even bigger on the previous generation of CPUs, where lock
>> > > > operations and syscall entry are costlier. Sample test runs of
>> > > > tools/tools/syscall_timing are presented at the end of the message.
>> > >
>> > > In general this looks good but I see a few nits / races:
>> > >
>> > > 1) You don't follow the model of clearing tk_current to 0 while you
>> > >    are updating the structure that the in-kernel timecounter code
>> > >    uses.  This also means you have to avoid using a tk_current of 0
>> > >    and that userland has to keep spinning as long as tk_current is 0.
>> > >    Without this I believe userland can read a partially updated
>> > >    structure.
>> > I changed the code to be much more similar to the kern_tc.c. I (re)added
>> > the generation field, which is set to 0 upon kernel touching timehands.
>>
>> Thank you.  BTW, I think we should use atomic_load_acq_int() on both accesses
>> to th_gen (and the in-kernel binuptime should do the same).  I realize this
>> requires using rmb before the while condition in userland since we can't
>> use atomic_load_acq_int() here.  I think it should also use
>> atomic_store_rel_int() for both stores to th_gen during the tc_windup()
>> callback.
> This is done. On the other hand, I removed a store_rel from updating
> tk_current, since it is after enabling store to th_gen, and the order
> there does not matter.
>
> I also did some restructuring of the userspace, removing layers that
> Bruce did not like. Now top-level functions directly call binuptime().
> I also shortened the preliminary operations by caching timekeep pointer.
> Its double-initialization is safe.
>
> Latest version is at
> http://people.freebsd.org/~kib/misc/moronix.4.patch
>
> I probably move all shared page helpers to separate file from kern_exec.c,
> but this will happen after moronix is committed.
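
For readers following the quoted exchange, the generation protocol it
describes (writer sets th_gen to 0 during the update, reader spins while
th_gen is 0 or has changed) might be sketched roughly as follows. This is
not the code from the patch: the struct, the sketch_* names and the two
data fields are hypothetical, and C11 <stdatomic.h> is used in place of
FreeBSD's atomic_load_acq_int()/atomic_store_rel_int()/rmb.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the shared timehands data. */
struct sketch_timehands {
	atomic_uint		th_gen;		/* 0 means "update in progress" */
	_Atomic uint64_t	th_sec;
	_Atomic uint64_t	th_frac;
};

/* Writer side: the role the kernel plays in the thread. */
static void
sketch_windup(struct sketch_timehands *th, uint64_t sec, uint64_t frac)
{
	unsigned gen;

	gen = atomic_load_explicit(&th->th_gen, memory_order_relaxed);

	/* Mark the structure as being updated before touching the data. */
	atomic_store_explicit(&th->th_gen, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&th->th_sec, sec, memory_order_relaxed);
	atomic_store_explicit(&th->th_frac, frac, memory_order_relaxed);

	/* Never publish generation 0; it is reserved for "in progress". */
	if (++gen == 0)
		gen = 1;
	assert(gen != 0);
	atomic_store_explicit(&th->th_gen, gen, memory_order_release);
}

/* Reader side: the role userland plays in the thread. */
static void
sketch_read(struct sketch_timehands *th, uint64_t *sec, uint64_t *frac)
{
	unsigned gen;

	do {
		gen = atomic_load_explicit(&th->th_gen, memory_order_acquire);
		*sec = atomic_load_explicit(&th->th_sec, memory_order_relaxed);
		*frac = atomic_load_explicit(&th->th_frac,
		    memory_order_relaxed);
		/* Stands in for the rmb before the while condition. */
		atomic_thread_fence(memory_order_acquire);
	} while (gen == 0 ||
	    gen != atomic_load_explicit(&th->th_gen, memory_order_relaxed));
}
```

A reader that observes gen == 0, or a different generation on the second
load, discards what it read and retries, so it never returns a partially
updated pair.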

Stepping back for a moment.. why even have a shared page at all, in
common MI code?

The AMD64 kernel can simply make a page inside kernel address space
readable by user mode, since the kernel there is protected at page level.

The i386 kernel needs the same treatment.  We can save one clock cycle
from address generation by switching to page protection for the kernel
and using a full 4GB %cs/%ds/etc.  With that fix we could do the same
there.  I've been meaning to "fix" this for about 8 years now.

There would have been no need to allocate "space" in userland for
things like signal trampolines because it could be executed directly
from a kernel page by unprivileged user code.

Things like allocating a shared page could be a MD backend decision
for architectures that don't have page level access control for where
the kernel lives.

Things like tc_fill_vdso_timehands() could go away if userland could
be allowed to directly read the kernel's version.  With a little
linker magic, the 'struct timehands' stuff could be marshaled into a
page and the auxinfo point to it.
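
To make the last point concrete, here is a rough sketch of how userland
might find such a page through an aux-vector style entry and fall back to
the syscall when the platform does not export one. Everything here is
hypothetical: the sketch_auxv layout, the SKETCH_AT_TIMEHANDS tag and the
sketch_* functions are invented for illustration and are not an existing
FreeBSD interface.

```c
#include <stddef.h>

/* Hypothetical aux-vector entry; real ELF auxinfo has its own layout. */
struct sketch_auxv {
	long		 a_type;
	const void	*a_ptr;
};

#define	SKETCH_AT_NULL		0L	/* terminator entry */
#define	SKETCH_AT_TIMEHANDS	1000L	/* hypothetical tag for the page */

/* Walk the vector and return the pointer stored under 'type', if any. */
static const void *
sketch_aux_lookup(const struct sketch_auxv *auxv, long type)
{
	for (; auxv->a_type != SKETCH_AT_NULL; auxv++) {
		if (auxv->a_type == type)
			return (auxv->a_ptr);
	}
	return (NULL);
}

enum sketch_time_path {
	SKETCH_PATH_SYSCALL,	/* no exported page: use the syscall */
	SKETCH_PATH_DIRECT	/* read the kernel's data directly */
};

/* Decide once which path the time functions should take. */
static enum sketch_time_path
sketch_pick_path(const struct sketch_auxv *auxv, const void **thp)
{
	*thp = sketch_aux_lookup(auxv, SKETCH_AT_TIMEHANDS);
	return (*thp != NULL ? SKETCH_PATH_DIRECT : SKETCH_PATH_SYSCALL);
}
```

Whether the pointer refers to a separately allocated shared page or to a
user-readable kernel page would then be invisible to userland, which is
what makes it an MD backend decision.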
-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
