Date: Fri, 21 Jun 2013 09:04:34 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Bruce Evans <brde@optusnet.com.au>
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, Konstantin Belousov <kib@FreeBSD.org>
Subject: Re: svn commit: r252032 - head/sys/amd64/include
Message-ID: <20130621090207.F1318@besplex.bde.org>
In-Reply-To: <20130621081116.E1151@besplex.bde.org>
References: <201306201430.r5KEU4G5049115@svn.freebsd.org> <20130621065839.J916@besplex.bde.org> <20130621081116.E1151@besplex.bde.org>
On Fri, 21 Jun 2013, Bruce Evans wrote:

> On Fri, 21 Jun 2013, I wrote:
>
>> On Thu, 20 Jun 2013, Konstantin Belousov wrote:
>>> ...
>>> @@ -44,7 +44,7 @@ counter_u64_add(counter_u64_t c, int64_t
>>> ...
>> The i386 version of the counter asm doesn't support the immediate
>> constraint for technical reasons.  64 bit counters are too large and
>> slow to use on i386, especially when they are implemented as they are
>> without races.
>
> Actual testing showed that it is only about twice as slow as a direct
> increment.  With the enclosed test program (a userland version hacked
> on a bit to avoid pcpu), on ref10-i386 the times are:
> - loop overhead:                                         1 cycle
> - direct unlocked increment of a uint32_t:               6 cycles
> - direct unlocked increment of a uint64_t:               7 cycles
> - non-inline function unlocked increment of a uint64_t:  7.5 cycles
> - counter_u64_add():                                    14 cycles
> - non-inline counter_u64_add():                         18 cycles
> ...

Actually enclosing the test program:

% #include <stdint.h>
% #include <stdio.h>
% 
% static inline void
% counter_64_inc_8b(volatile uint64_t *p, int64_t inc)
% {
% 
% 	__asm __volatile(
% 	"movl %%ds:(%%esi),%%eax\n\t"
% 	"movl %%ds:4(%%esi),%%edx\n"
% "1:\n\t"
% 	"movl %%eax,%%ebx\n\t"
% 	"movl %%edx,%%ecx\n\t"
% 	"addl (%%edi),%%ebx\n\t"
% 	"adcl 4(%%edi),%%ecx\n\t"
% 	"cmpxchg8b %%ds:(%%esi)\n\t"
% 	"jnz 1b"
% 	:
% 	: "S" (p), "D" (&inc)
% 	: "memory", "cc", "eax", "edx", "ebx", "ecx");
% }
% 
% uint32_t cpu_feature = 1;
% 
% typedef volatile uint64_t *counter_u64_t;
% 
% static void
% #if 1
% inline
% #else
% __noinline
% #endif
% counter_u64_add(counter_u64_t c, int64_t inc)
% {
% 
% #if 1
% 	if ((cpu_feature & 1) == 1) {
% 		counter_64_inc_8b(c, inc);
% 	}
% #elif 0
% 	if ((cpu_feature & 1) == 1) {
% 		*c += inc;
% 	}
% #else
% 	*c += inc;
% #endif
% }
% 
% uint64_t mycounter[1];
% 
% int
% main(void)
% {
% 	unsigned i;
% 
% 	for (i = 0; i < 1861955704; i++)	/* sysctl -n machdep.tsc_freq */
% 		counter_u64_add(mycounter, 1);
% 	printf("%ju\n", (uintmax_t)mycounter[0]);
% }

Bruce
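
For anyone who wants to try the retry loop from counter_64_inc_8b() on a machine other than i386, here is a minimal portable sketch of the same load / add / compare-and-swap / retry pattern using C11 <stdatomic.h>.  It is not the FreeBSD implementation.  The names counter_u64_add_c11() and counter_sketch_run() are hypothetical and exist only for this illustration.  Note also that the asm above uses cmpxchg8b without a lock prefix, whereas a compiler will typically emit a locked instruction for the C11 call, so timings from this sketch are not comparable with the cycle counts quoted above.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical portable analogue of counter_64_inc_8b(); not part of
 * the FreeBSD counter(9) API.
 */
static inline void
counter_u64_add_c11(_Atomic uint64_t *c, int64_t inc)
{
	uint64_t old, next;

	/* Initial read, like the two movl loads before label 1. */
	old = atomic_load_explicit(c, memory_order_relaxed);
	do {
		/* Like the addl/adcl pair: compute the candidate value. */
		next = old + (uint64_t)inc;
		/*
		 * Like cmpxchg8b followed by jnz: store next only if *c
		 * still equals old.  On failure, old is refreshed with
		 * the current value and the loop retries.
		 */
	} while (!atomic_compare_exchange_weak_explicit(c, &old, next,
	    memory_order_relaxed, memory_order_relaxed));
}

/* Hypothetical helper: add 1 to a fresh counter n times. */
static uint64_t
counter_sketch_run(unsigned n)
{
	_Atomic uint64_t counter = 0;
	unsigned i;

	for (i = 0; i < n; i++)
		counter_u64_add_c11(&counter, 1);
	return (atomic_load_explicit(&counter, memory_order_relaxed));
}
```

Calling counter_sketch_run(n) returns n, since each call to counter_u64_add_c11() adds exactly 1.  The weak compare-exchange is used because the loop already retries on failure, so a spurious failure costs only one extra iteration.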