From: Dag-Erling Smørgrav
To: Bruce Evans
Cc: Gianni, John Baldwin, Alan Cox, Alexander Kabaev, Attilio Rao,
    Konstantin Belousov, freebsd-arch@FreeBSD.org
Subject: Re: Fast vs slow syscalls (Re: Fwd: [RFC] Kernel shared variables)
Date: Wed, 06 Jun 2012 10:24:19 +0200
Message-ID: <864nqovoek.fsf@ds4.des.no>
In-Reply-To: <20120606040931.F1050@besplex.bde.org>

Bruce Evans writes:
> Dag-Erling Smørgrav writes:
> > getpid(): 10,000,000 iterations in 24,400 ms
> > gettimeofday(0, 0): 10,000,000 iterations in 54,104 ms
> > raise(0): 10,000,000 iterations in 1,284,593 ms
> That's one slow system or broken units.

Broken units: these are microseconds, not milliseconds.  Sorry.

> After adjusting by factors of 1000 here and there, this format is still
> hard to parse.  I like the format of nsec/operation.  10 million
> operations in 24400 moroseconds seems to scale to 2.44 nsec/call (if 1
> moro = 1 micro).  But that is impossibly fast, unless getpid() is
> inlined to a load of the shared variable (it may also need the load to
> be moved outside the loop).  I can't see any reasonable adjustment that
> gives 24.4 nsec/call.

#define ITERATIONS 10000000

	struct timeval start, end;
	int i;

	gettimeofday(&start, NULL);
	for (i = 0; i < ITERATIONS; ++i)
		getpid();
	gettimeofday(&end, NULL);

On Linux, gcc 4.4.6 compiles this to:

   # gettimeofday(&start, NULL)
   0x000000000040064b <+23>:	lea    -0x20(%rbp),%rax
   0x000000000040064f <+27>:	mov    $0x0,%esi
   0x0000000000400654 <+32>:	mov    %rax,%rdi
   0x0000000000400657 <+35>:	callq  0x400500
   # i = 0
   0x000000000040065c <+40>:	movl   $0x0,-0x4(%rbp)
   0x0000000000400663 <+47>:	jmp    0x40066e
   # getpid()
   0x0000000000400665 <+49>:	callq  0x400520
   # ++i
   0x000000000040066a <+54>:	addl   $0x1,-0x4(%rbp)
   # i < ITERATIONS
   0x000000000040066e <+58>:	cmpl   $0x98967f,-0x4(%rbp)
   0x0000000000400675 <+65>:	jle    0x400665
   # gettimeofday(&end, NULL)
   0x0000000000400677 <+67>:	lea    -0x30(%rbp),%rax
   0x000000000040067b <+71>:	mov    $0x0,%esi
   0x0000000000400680 <+76>:	mov    %rax,%rdi
   0x0000000000400683 <+79>:	callq  0x400500

The code generated by gcc 4.2.1 on FreeBSD is almost identical:

   # gettimeofday(&start, NULL)
   0x00000000004006f7 :	lea    -0x20(%rbp),%rdi
   0x00000000004006fb :	mov    $0x0,%esi
   0x0000000000400700 :	callq  0x40057c
   # i = 0
   0x0000000000400705 :	movl   $0x0,-0x4(%rbp)
   0x000000000040070c :	jmp    0x400717
   # getpid()
   0x000000000040070e :	callq  0x40059c
   # ++i
   0x0000000000400713 :	addl   $0x1,-0x4(%rbp)
   # i < ITERATIONS
   0x0000000000400717 :	cmpl   $0x98967f,-0x4(%rbp)
   0x000000000040071e :	jle    0x40070e
   # gettimeofday(&end, NULL)
   0x0000000000400720 :	lea    -0x30(%rbp),%rdi
   0x0000000000400724 :	mov    $0x0,%esi
   0x0000000000400729 :	callq  0x40057c

I don't know why gcc 4.4.6 loads &start / &end into %rax before copying
it to %rdi instead of loading it directly into %rdi like 4.2.1 does.  I
used the same command line (gcc -Wall -Wextra syscall.c) in both cases.

DES
-- 
Dag-Erling Smørgrav - des@des.no
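
Since the thread asks for results in nsec/call, here is a minimal sketch
of how the timing loop above could report that figure directly.  It is
not the syscall.c used for the measurements; the helper names
nsec_per_call(), elapsed_usec() and time_getpid() are hypothetical and
exist only for illustration.

```c
#include <stddef.h>
#include <sys/time.h>
#include <unistd.h>

/* Convert a total elapsed time in microseconds to nanoseconds per call. */
static double
nsec_per_call(double usec, long iterations)
{
	return usec * 1000.0 / (double)iterations;
}

/* Microseconds elapsed between two gettimeofday() samples. */
static double
elapsed_usec(const struct timeval *start, const struct timeval *end)
{
	return (double)(end->tv_sec - start->tv_sec) * 1e6 +
	    (double)(end->tv_usec - start->tv_usec);
}

/*
 * Time `iterations` calls to getpid() with the same loop shape as in
 * the message and return the cost in nsec/call.  Whether getpid()
 * actually enters the kernel depends on the libc in use.
 */
static double
time_getpid(long iterations)
{
	struct timeval start, end;
	long i;

	gettimeofday(&start, NULL);
	for (i = 0; i < iterations; ++i)
		getpid();
	gettimeofday(&end, NULL);
	return nsec_per_call(elapsed_usec(&start, &end), iterations);
}
```

Calling printf("%.2f nsec/call\n", time_getpid(10000000)) from main()
would print one figure per run.  Applied to the totals quoted above, read
as microseconds, nsec_per_call() gives 2.44 nsec/call for getpid(), about
5.41 for gettimeofday(0, 0) and about 128.46 for raise(0).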