From owner-freebsd-arch@FreeBSD.ORG Tue Jun 12 01:52:53 2012
Message-ID: <4FD6A0F2.4010305@gmail.com>
Date: Tue, 12 Jun 2012 09:52:50 +0800
From: David Xu
To: Robert Watson
Cc: Konstantin Belousov, arch@freeBSd.org
References: <20120606165115.GQ85127@deviant.kiev.zoral.com.ua>
Subject: Re: Fast gettimeofday(2) and clock_gettime(2)
Reply-To: davidxu@freeBSd.org
List-Id: Discussion related to FreeBSD architecture

On 2012/6/11 16:56, Robert Watson wrote:
> On Wed, 6 Jun 2012, Konstantin Belousov wrote:
>
>> The whole struct vdso_timekeep is versioned, as well as individual
>> struct vdso_timehands, which should allow to implement future
>> algorithms without breaking binary compatibility. The code is
>> structured to eventually move __vdso_* functions out of libc into
>> VDSO, if it ever materializes. This desire explains vdso prefix and
>> header file names.
>>
>> I implemented and tested the userspace timecounter on amd64, both for
>> 64 and 32 bit binaries, it would probably work for i386 too. Other
>> architecture maintainers are welcome to add necessary support there.
>> You need to provide machine/vdso.h header with definitions of
>> VDSO_TIMEHANDS_MD fields for struct vdso_timehands, which should
>> provide information for userspace to implement fast
>> tc_get_timecount(). The fields are filled in per-arch
>> cpu_fill_vdso_timehands(9) function. If your architecture supports
>> 32bit compat, there are cpu_fill_vdso_timehands32(9) and
>> VDSO_TIMEHANDS_MD32 to code as well. After that, the
>> lib/libc//sys/__vdso_gettc.c should contain an implementation of
>> __vdso_gettc() function, exact analogue of tc_get_timecount().
>
> Hi Kostik:
>
> I'm glad to see someone is finally grappling with this issue. I could
> never entirely decide how I felt about the Linux VDSO mechanism, but
> having some solution here is actually quite important. A few thoughts
> that you might comment on:
>
> 1) It would be nice if we linked any (future) notion of VDSO to the same
>    mechanism we use for ELF branding/ABI emulation -- you conceivably
>    want to support it not just for native ABI and perhaps 32-bit compat
>    ABIs, but also the Linux ABI, alternative userspace ABIs (vis o32 on
>    an n64 MIPS kernel), and so on.
>
> 2) Once the VDSO mechanism is there, you get into feature creep space,
>    and looking at how Linux handles pluggable system call mechanisms
>    for the C library is actually interesting.
>
> 3) For the purposes of adaptive mutexes in userspace, it really would
>    be quite nice to know whether remote threads are running or not, in
>    the same way that cheap access to remote thread run state in the
>    kernel makes for much more efficient adaptive spinning. I wonder if
>    we could use this mechanism for that purpose as well.
>    I guess for now, at least, you're using a single global page, but in
>    the future, per-process pages might be quite beneficial.
>

Solaris uses a per-thread page shared between kernel and userland; its
schedctl() interface is designed for this purpose. Not only can you tell
whether a thread is running, but userland can also tell the kernel not to
preempt the current thread while it is in a critical section. For example,
while it is doing an adaptive spin, it sets a no_preempt bit so that the
kernel delays for a few ticks rather than preempting the thread
immediately.

I even have an earlier patch:
http://people.freebsd.org/~davidxu/schedctl/

However, I don't know whether you would get a real-world advantage from
actually implementing this feature, because it might increase cache-line
sharing.

>
> Robert
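
To make the schedctl-style idea above concrete, here is a minimal,
self-contained sketch in C. It is not the Solaris interface and not the
code in the patch linked above. The struct layout, every function name,
and the two-tick bound are assumptions made up for illustration, and the
kernel side is replaced by a simulated scheduler tick so the logic can be
exercised in an ordinary process.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SIM_MAX_DEFER_TICKS 2	/* assumed bound on deferred preemption */

/* Hypothetical per-thread page shared between kernel and userland. */
struct sc_shared {
	atomic_int sc_running;		/* kernel writes: 1 while on a CPU */
	atomic_int sc_nopreempt;	/* userland writes: nesting count */
	atomic_int sc_yield;		/* kernel writes: preemption deferred */
	int	   sim_deferred;	/* simulation only: ticks deferred */
};

static void
sc_init(struct sc_shared *sc)
{
	atomic_init(&sc->sc_running, 1);
	atomic_init(&sc->sc_nopreempt, 0);
	atomic_init(&sc->sc_yield, 0);
	sc->sim_deferred = 0;
}

/* Userland: ask not to be preempted while inside a critical section. */
static void
sc_preempt_defer(struct sc_shared *sc)
{
	atomic_fetch_add(&sc->sc_nopreempt, 1);
}

/*
 * Userland: leave the critical section.  Returns true if a preemption
 * was deferred in the meantime, in which case the caller should yield.
 */
static bool
sc_preempt_allow(struct sc_shared *sc)
{
	int old = atomic_fetch_sub(&sc->sc_nopreempt, 1);

	assert(old > 0);	/* allow without a matching defer */
	return (old == 1 && atomic_exchange(&sc->sc_yield, 0) != 0);
}

/* Userland: is the thread owning this page currently on a CPU? */
static bool
sc_is_running(struct sc_shared *sc)
{
	return (atomic_load(&sc->sc_running) != 0);
}

/* Adaptive-mutex waiter: keep spinning only while the owner runs. */
static bool
should_spin(struct sc_shared *owner, int spins, int max_spins)
{
	return (spins < max_spins && sc_is_running(owner));
}

/*
 * Simulated scheduler tick standing in for the kernel.  Returns true
 * if the thread was preempted, false if preemption was deferred.
 */
static bool
sim_kernel_tick(struct sc_shared *sc)
{
	if (atomic_load(&sc->sc_nopreempt) > 0 &&
	    sc->sim_deferred < SIM_MAX_DEFER_TICKS) {
		sc->sim_deferred++;
		atomic_store(&sc->sc_yield, 1);
		return (false);
	}
	sc->sim_deferred = 0;
	atomic_store(&sc->sc_running, 0);
	return (true);
}
```

In this sketch a lock holder would call sc_preempt_defer() before taking
the lock and sc_preempt_allow() after dropping it, yielding if the latter
returns true. A waiter would call should_spin() on the owner's page before
each spin iteration and block once it returns false. Note that every
waiter reads the owner's page, which is exactly the cache-line sharing
concern raised above.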