From: Konstantin Belousov
Date: Wed, 25 Jul 2012 21:05:37 +0300
To: Jung-uk Kim
Message-ID: <20120725180537.GO2676@deviant.kiev.zoral.com.ua>
In-Reply-To: <50102C94.9030706@FreeBSD.org>
References: <201207242210.q6OMACqV079603@svn.freebsd.org> <500F9E22.4080608@FreeBSD.org> <20120725102130.GH2676@deviant.kiev.zoral.com.ua> <500FE6AE.8070706@FreeBSD.org> <20120726001659.M5406@besplex.bde.org> <50102C94.9030706@FreeBSD.org>
Cc: Jim Harris, src-committers@freebsd.org, svn-src-all@freebsd.org, Andriy Gapon, Bruce Evans, svn-src-head@freebsd.org
Subject: Re: svn commit: r238755 - head/sys/x86/x86

On Wed, Jul 25, 2012 at 01:27:48PM -0400, Jung-uk Kim wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 2012-07-25 10:44:04 -0400, Bruce Evans wrote:
> > On Wed, 25 Jul 2012, Andriy Gapon wrote:
> >
> >> on 25/07/2012 13:21 Konstantin Belousov said the following:
> >>> ...
> >>> diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c
> >>> index 085c339..229b351 100644
> >>> --- a/sys/x86/x86/tsc.c
> >>> +++ b/sys/x86/x86/tsc.c
> >>> @@ -594,6 +594,7 @@ static u_int
> >>>  tsc_get_timecount(struct timecounter *tc __unused)
> >>>  {
> >>>
> >>> +	rmb();
> >>>  	return (rdtsc32());
> >>>  }
> >>
> >> This makes sense to me. We probably want correctness over
> >> performance here. [BTW, I originally thought that the change was
> >> here; brain malfunction]
> >
> > And I liked the original change because it wasn't here :-).
> >
> >>> @@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter *tc)
> >>>  {
> >>>  	uint32_t rv;
> >>>
> >>> +	rmb();
> >>>  	__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
> >>> -	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
> >>> +	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
> >>>  	return (rv);
> >>>  }
> >>>
> >>
> >> It would be correct here too, but not sure if it would make any
> >> difference given that some lower bits are discarded anyway.
> >> Probably depends on exact CPU.
> >
> > It is needed to pessimize this too. :-)
> >
> > As I have complained before, the loss of resolution from the shift
> > is easy to see by reading the time from userland, even with syscall
> > overhead taking 10-20 times longer than the read. On core2 with
> > TSC-low, a clock-checking utility gives:
> >
> > % min 481, max 12031, mean 530.589452, std 51.633626
> > % 1th: 550 (1296487 observations)
> > % 2th: 481 (448425 observations)
> > % 3th: 482 (142650 observations)
> > % 4th: 549 (61945 observations)
> > % 5th: 551 (47619 observations)
> >
> > The numbers are differences in nanoseconds measured by
> > clock_gettime(). The jump from 481 to 549 is 68. From this I can
> > tell that the clock frequency is 1.86 GHz and the shift is 128, or
> > the clock frequency is 3.72 GHz and the shift is 256.
> >
> > On AthlonXP with TSC:
> >
> > % min 273, max 29075, mean 274.412811, std 80.425963
> > % 1th: 273 (853962 observations)
> > % 2th: 274 (745606 observations)
> > % 3th: 275 (400212 observations)
> > % 4th: 276 (20 observations)
> > % 5th: 280 (10 observations)
> >
> > Now the numbers cluster about the mean. Although syscalls take
> > much longer than the loss of resolution with TSC-low, and even the
> > core2 TSC takes almost as long to read as the loss, it is still
> > possible to see things happening at the limits of the resolution
> > (~0.5 nsec).
> >
> >> And, oh hmm, I read AMD Software Optimization Guide for AMD
> >> Family 10h Processors and they suggest using cpuid (with a note
> >> that it may be intercepted in virtualized environments) or
> >> _mfence_ in the discussed role (Appendix F of the document).
> >> Googling for 'rdtsc mfence lfence' yields some interesting
> >> results.
> >
> > The second hit was for the shrd pessimization/loss of resolution
> > and a memory access hack in lkml in 2011. I now seem to remember
> > jkim mentioning the memory access hack.
> > rmb() on i386 has a related memory access hack, but now with a lock
> > prefix that defeats the point of the 2011 hack (it wanted to save
> > 5 nsec by removing fences). rmb() on amd64 uses lfence.
>
> I believe I mentioned this thread at the time:
>
> https://patchwork.kernel.org/patch/691712/
>
> FYI, r238755 is essentially this commit for Linux:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=93ce99e849433ede4ce8b410b749dc0cad1100b2
>
> > Some of the other hits are a bit old. The 8th one was by me in
> > the thread about kib@ implementing gettimeofday() in userland.
>
> Since we have gettimeofday() in userland, the above Linux thread is
> more relevant now, I guess.

For unrelated reasons, we already have an lfence;rdtsc sequence in
userland. Well, it is not exactly that sequence, since there are some
instructions in between, but the main fact is that two consecutive
invocations of gettimeofday(2) (*) or clock_gettime(2) are interleaved
with lfence on Intel CPUs, guaranteeing that a backstep of the counter
is impossible.

(*) It is not a syscall anymore.

As I said, using the recommended mfence;rdtsc sequence for AMD CPUs
would require some work, but let's handle the kernel and userspace
issues separately.

And I really failed to find what the patch from the thread you
referenced tried to fix. Was it really committed into Linux?

I see an actual problem of us allowing timecounters to go backwards,
and a solution that exactly follows the wording of both the Intel and
AMD documentation. This is a good step forward, IMHO.
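
Editorial sketch, not part of the original message: the resolution
arithmetic quoted earlier in the thread (a 68 nsec step meaning either
1.86 GHz with a shift of 128 or 3.72 GHz with a shift of 256) can be
checked with a few lines of C. This assumes that "shift of 128" means
the TSC is divided by 128, i.e. shifted right by 7 bits. The function
name tsc_low_step_ns and its parameters are hypothetical and are not
taken from tsc.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Smallest time step, in nanoseconds, visible through a timecounter
 * that returns the TSC shifted right by shift_bits.
 *
 * Hypothetical helper for illustration only: tsc_freq_hz and
 * shift_bits are supplied by the caller, not read from the kernel.
 */
double
tsc_low_step_ns(double tsc_freq_hz, unsigned shift_bits)
{
	double cycles_per_tick;

	assert(tsc_freq_hz > 0.0);
	assert(shift_bits < 64);

	/* One tick of the shifted counter spans 2^shift_bits TSC cycles. */
	cycles_per_tick = (double)(UINT64_C(1) << shift_bits);

	/* cycles / (cycles per second) = seconds; scale to nanoseconds. */
	return (cycles_per_tick / tsc_freq_hz * 1e9);
}
```

Under these assumptions, tsc_low_step_ns(1.86e9, 7) and
tsc_low_step_ns(3.72e9, 8) both come out at about 68.8 nsec, which is
why the two cases cannot be told apart from the observed step alone.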