From owner-freebsd-arch@FreeBSD.ORG Thu Jun 7 22:47:05 2012
In-Reply-To: <20120607172839.GZ85127@deviant.kiev.zoral.com.ua>
References: <20120606165115.GQ85127@deviant.kiev.zoral.com.ua>
 <201206061423.53179.jhb@freebsd.org>
 <20120606205938.GS85127@deviant.kiev.zoral.com.ua>
 <201206070850.55751.jhb@freebsd.org>
 <20120607172839.GZ85127@deviant.kiev.zoral.com.ua>
Date: Thu, 7 Jun 2012 15:47:04 -0700
From: Peter Wemm
To: Konstantin Belousov
Cc: freebsd-arch@freebsd.org
Subject: Re: Fast gettimeofday(2) and clock_gettime(2)

On Thu, Jun 7, 2012 at 10:28 AM, Konstantin Belousov wrote:
> On Thu, Jun 07, 2012 at 08:50:55AM -0400, John Baldwin wrote:
>> On Wednesday, June 06, 2012 4:59:38 pm Konstantin Belousov wrote:
>> > On Wed, Jun 06, 2012 at 02:23:53PM -0400, John Baldwin wrote:
>> > > On Wednesday, June 06, 2012 12:51:15 pm Konstantin Belousov wrote:
>> > > > A positive result from the recent flame-bait on arch@ is the working
>> > > > implementation of the fast gettimeofday(2) and clock_gettime(2). The
>> > > > speedup I see is around 6-7x on the 2600K. I think the speedup could
>> > > > be even bigger on the previous generation of CPUs, where lock
>> > > > operations and syscall entry are costlier. A sample test runs of
>> > > > tools/tools/syscall_timing are presented at the end of message.
>> > >
>> > > In general this looks good but I see a few nits / races:
>> > >
>> > > 1) You don't follow the model of clearing tk_current to 0 while you
>> > >    are updating the structure that the in-kernel timecounter code
>> > >    uses.  This also means you have to avoid using a tk_current of 0
>> > >    and that userland has to keep spinning as long as tk_current is 0.
>> > >    Without this I believe userland can read a partially updated
>> > >    structure.
>> > I changed the code to be much more similar to the kern_tc.c. I (re)added
>> > the generation field, which is set to 0 upon kernel touching timehands.
>>
>> Thank you.  BTW, I think we should use atomic_load_acq_int() on both accesses
>> to th_gen (and the in-kernel binuptime should do the same).  I realize this
>> requires using rmb before the while condition in userland since we can't
>> use atomic_load_acq_int() here.  I think it should also use
>> atomic_store_rel_int() for both stores to th_gen during the tc_windup()
>> callback.
> This is done. On the other hand, I removed a store_rel from updating
> tk_current, since it is after enabling store to th_gen, and the order
> there does not matter.
>
> I also did some restructuring of the userspace, removing layers that
> Bruce did not liked. Now top-level functions directly call binuptime().
> I also shortened the preliminary operations by caching timekeep pointer.
> Its double-initialization is safe.
>
> Latest version is at
> http://people.freebsd.org/~kib/misc/moronix.4.patch
>
> I probably move all shared page helpers to separate file from kern_exec.c,
> but this will happen after moronix is committed.

Stepping back for a moment.. why even have a shared page at all, in
common MI code?

The AMD64 kernel can simply make a page readable from within kernel
space since it's page level protected.

The i386 kernel needs the same treatment.  We can save one clock cycle
from address generation by switching to page protection for the kernel
and using a full 4GB %cs/%ds/etc.  With that fix we could do the same
there.  I've been meaning to "fix" this for about 8 years now.

There would have been no need to allocate "space" in userland for
things like signal trampolines because it could be executed directly
from a kernel page by unprivileged user code.

Things like allocating a shared page could be an MD backend decision
for architectures that don't have page level access control for where
the kernel lives.

Things like tc_fill_vdso_timehands() could go away if userland could
be allowed to directly read the kernel's version.  With a little
linker magic, the 'struct timehands' stuff could be marshaled into a
page and the auxinfo point to it.

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
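
For readers following the th_gen exchange quoted above, here is a rough
sketch of the generation protocol as described there: the writer sets the
generation to 0 while it updates, never publishes 0 as a valid generation,
and the reader retries when it sees 0 or a changed generation.  This is
not the code from the patch.  struct demo_timehands and the demo_*
functions are hypothetical names, and C11 <stdatomic.h> stands in for
atomic_load_acq_int(), atomic_store_rel_int() and rmb, so treat the
mapping as an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared timehands; not the patch's layout. */
struct demo_timehands {
	atomic_uint		th_gen;		/* 0 means "update in progress" */
	_Atomic uint64_t	th_sec;
	_Atomic uint64_t	th_frac;
};

/* Writer side, playing the role of the tc_windup() callback. */
static void
demo_publish(struct demo_timehands *th, uint64_t sec, uint64_t frac)
{
	unsigned int ogen;

	ogen = atomic_load_explicit(&th->th_gen, memory_order_relaxed);
	/* Mark the structure as being updated before touching the data. */
	atomic_store_explicit(&th->th_gen, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&th->th_sec, sec, memory_order_relaxed);
	atomic_store_explicit(&th->th_frac, frac, memory_order_relaxed);
	/* Generation 0 is reserved, so skip it when the counter wraps. */
	if (++ogen == 0)
		ogen = 1;
	assert(ogen != 0);
	/* Release store: the data above is visible before the new generation. */
	atomic_store_explicit(&th->th_gen, ogen, memory_order_release);
}

/* One read attempt; returns false if the caller has to retry. */
static bool
demo_try_read(struct demo_timehands *th, uint64_t *sec, uint64_t *frac)
{
	unsigned int gen;

	gen = atomic_load_explicit(&th->th_gen, memory_order_acquire);
	if (gen == 0)
		return (false);
	*sec = atomic_load_explicit(&th->th_sec, memory_order_relaxed);
	*frac = atomic_load_explicit(&th->th_frac, memory_order_relaxed);
	/* Plays the role of the rmb before the loop condition. */
	atomic_thread_fence(memory_order_acquire);
	return (gen == atomic_load_explicit(&th->th_gen, memory_order_relaxed));
}

/* Reader side: spin until a consistent snapshot is obtained. */
static void
demo_read(struct demo_timehands *th, uint64_t *sec, uint64_t *frac)
{
	while (!demo_try_read(th, sec, frac))
		;
}
```

The data fields are relaxed atomics here only so that the sketch is
well-defined under the C11 memory model; the kernel code uses plain
fields together with the barriers discussed in the thread.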