From: Andrew Gallatin
Date: Wed, 4 Sep 2002 16:59:09 -0400 (EDT)
To: John Baldwin
Cc: freebsd-alpha@freebsd.org
Subject: alpha performance on -current
Message-ID: <15734.29725.515274.183629@grasshopper.cs.duke.edu>

John Baldwin writes:
 > On the DS20 I have here, Peter's fix to basically enable the buffer cache
 > gave me about a 38% performance increase for a buildworld -j 2 on current:
 >
 > Before:
 > --------------------------------------
 > build started at 15:11:08 on 08/28/02
 > build finished at 18:43:05 on 08/28/02
 > --------------------------------------
 > Which is a total time of 3:31:57
 >
 > After:
 > --------------------------------------
 > build started at 22:41:31 on 09/03/02
 > build finished at 00:51:46 on 09/04/02
 > --------------------------------------
 > Which is a total time of 2:10:15

A buildworld from July:
     9611.42 real      6149.87 user      2613.20 sys

A buildworld today:
     8699.25 real      6985.64 user      1379.72 sys

For all I know, the speedup is just from disabling WITNESS and
INVARIANTS.

Speaking of performance, I ran lmbench on my xp1000 under -stable and
-current. I built the binary with compaq cc and ran the same binaries
on -current and -stable. Both kernels were built without WITNESS,
INVARIANTS and DIAGNOSTIC. On -current, I manually created an
/etc/malloc.conf symlink to remove malloc debugging. I included
results from Tru64 5.1A for comparison. The disk that -current is on
is a little faster than the disks -stable and Tru64 are on; that's the
only area where it's not an apples-to-apples test.

The first thing that stands out is that syscalls are *much* more
expensive on -current. Nearly a factor of 4 for a null syscall
(0.57us -> 2.06us). I suppose it equates to a latency of ~0.35us for
each mutex taken/released. Can this be right? The lmbench null
syscall is getppid:

# ./lat_syscall null
Simple syscall: 2.0178 microseconds
# ./lat_syscall null
Simple syscall: 2.0333 microseconds
# sysctl -w kern.giant.proc=0
kern.giant.proc: 1 -> 0
# ./lat_syscall null
Simple syscall: 1.6360 microseconds
# ./lat_syscall null
Simple syscall: 1.6333 microseconds

Is the locking overhead this bad on x86? It looks downright
embarrassing on alpha. Can anything be done about it?

Are the memory barriers in atomic_cmpset_acq_* really needed? They
have the look of belt & suspenders code. FWIW, the appended diff to
remove them reduces null system call latency to 1.6us with
kern.giant.proc=1, and 1.4us with kern.giant.proc=0. I'm about to
start a buildworld with it, but I don't have any SMP boxes.

On the other hand, the pipe results are tremendous. Pipe now goes
like a bat out of hell. Congrats to whoever did that.

Drew


                 L M B E N C H  2 . 0   S U M M A R Y
                 ------------------------------------

[Some rows below have fewer values than columns; the placement of
values in those rows is a best guess.]

Basic system parameters
----------------------------------------------------
Host      OS            Description              Mhz
--------- ------------- ----------------------- ----
monet     FreeBSD 4.5-S alpha-freebsd4.5         497
monet     FreeBSD 5.0-C alpha-freebsd5.0         497
monet     OSF1 V5.1     alphaev6-dec-osf5.1      499

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host      OS             Mhz null null      open selct sig  sig  fork exec sh
                             call  I/O stat clos TCP   inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
monet     FreeBSD 4.5-S  497 0.57 3.78 16.2 22.5  33.7 1.29 6.25 1434 4692 12.K
monet     FreeBSD 5.0-C  497 2.06 7.52 28.1 44.1  18.7 2.35 8.91 1440 4716 8895
monet     OSF1 V5.1      499 0.41 1.33 144. 157.  19.8 1.01 4.56 1039 2745 7611

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host      OS            2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
monet     FreeBSD 4.5-S 1.850   20.4   98.1   36.9  139.2    39.7   139.7
monet     FreeBSD 5.0-C 4.250   13.6   47.1   22.9   62.2    27.2    68.3
monet     OSF1 V5.1     5.430 9.8900   45.2   18.1   49.5    20.2    53.6

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host      OS            2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
monet     FreeBSD 4.5-S 1.850  12.0 14.4  38.5  79.1  43.0 100.0 231.
monet     FreeBSD 5.0-C 4.250  28.4 33.5  87.5                   300.
monet     OSF1 V5.1     5.430  24.8 48.7  90.3 152.8  84.8 197.4

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host      OS              0K File       10K File      Mmap  Prot  Page
                        Create Delete Create Delete Latency Fault Fault
--------- ------------- ------ ------ ------ ------ ------- ----- -----
monet     FreeBSD 4.5-S   80.4   57.7 2457.0 1243.8  2980.0 0.527
monet     FreeBSD 5.0-C  120.2   99.4  330.3 4854.4  4688.0 0.145
monet     OSF1 V5.1      488.3  855.4 1773.0 1083.4  1462.0 2.931 4470.0

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host      OS            Pipe AF   TCP   File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
monet     FreeBSD 4.5-S 82.3 201. 70.2  159.1  953.5  452.6  342.3 953. 351.2
monet     FreeBSD 5.0-C 429. 127. 102.  250.9  946.6  439.3  339.3 947. 350.7
monet     OSF1 V5.1     301. 237.       318.1  979.5  451.7  350.1 978. 356.7

Memory latencies in nanoseconds - smaller is better
    (WARNING - may not be correct, check graphs)
---------------------------------------------------
Host      OS             Mhz  L1 $   L2 $ Main mem Guesses
--------- ------------- ---- ----- ------ -------- -------
monet     FreeBSD 4.5-S  497 6.016   30.1    196.6
monet     FreeBSD 5.0-C  497 6.039   30.3    197.7
monet     OSF1 V5.1      499 5.859   29.3    194.3


Index: atomic.h
===================================================================
RCS file: /home/ncvs/src/sys/alpha/include/atomic.h,v
retrieving revision 1.14
diff -u -r1.14 atomic.h
--- atomic.h    17 May 2002 05:45:39 -0000      1.14
+++ atomic.h    4 Sep 2002 20:37:43 -0000
@@ -419,14 +419,14 @@
         int retval;

         retval = atomic_cmpset_32(p, cmpval, newval);
-        alpha_mb();
+/*      alpha_mb();*/
         return (retval);
 }

 static __inline u_int32_t
 atomic_cmpset_rel_32(volatile u_int32_t *p, u_int32_t cmpval, u_int32_t newval)
 {
-        alpha_mb();
+/*      alpha_mb();*/
         return (atomic_cmpset_32(p, cmpval, newval));
 }

@@ -436,14 +436,14 @@
         int retval;

         retval = atomic_cmpset_64(p, cmpval, newval);
-        alpha_mb();
+/*      alpha_mb();*/
         return (retval);
 }

 static __inline u_int64_t
 atomic_cmpset_rel_64(volatile u_int64_t *p, u_int64_t cmpval, u_int64_t newval)
 {
-        alpha_mb();
+/*      alpha_mb();*/
         return (atomic_cmpset_64(p, cmpval, newval));
 }
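
As a rough illustration of what the _acq and _rel suffixes in the diff
above conventionally promise, here is a minimal sketch in portable C11
rather than the alpha kernel code. All names (sketch_lock,
sketch_cmpset_acq, sketch_cmpset_rel, sketch_trylock, sketch_unlock,
sketch_selftest) are hypothetical, and the C11 <stdatomic.h> memory
orders are an assumed stand-in for alpha_mb(). Whether the kernel needs
a full barrier at each of these points is exactly the question the
message raises; the sketch only shows the ordering being asked about.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word for illustration; not the FreeBSD mutex layout. */
typedef struct {
    _Atomic uint32_t owner;   /* 0 = unlocked, otherwise the owner's id */
} sketch_lock;

/*
 * Acquire-flavoured compare-and-set (assumed analogue of
 * atomic_cmpset_acq_32): when the swap succeeds, loads and stores
 * issued after it may not be reordered before it.
 */
static bool
sketch_cmpset_acq(_Atomic uint32_t *p, uint32_t cmpval, uint32_t newval)
{
    return atomic_compare_exchange_strong_explicit(p, &cmpval, newval,
        memory_order_acquire, memory_order_relaxed);
}

/*
 * Release-flavoured compare-and-set (assumed analogue of
 * atomic_cmpset_rel_32): loads and stores issued before the swap may
 * not be reordered after it.
 */
static bool
sketch_cmpset_rel(_Atomic uint32_t *p, uint32_t cmpval, uint32_t newval)
{
    return atomic_compare_exchange_strong_explicit(p, &cmpval, newval,
        memory_order_release, memory_order_relaxed);
}

/* Take the lock only if it is free; acquire ordering guards the
 * critical section that follows. */
static bool
sketch_trylock(sketch_lock *l, uint32_t id)
{
    return sketch_cmpset_acq(&l->owner, 0, id);
}

/* Drop the lock only if `id` holds it; release ordering publishes the
 * critical section's writes before the lock word changes. */
static bool
sketch_unlock(sketch_lock *l, uint32_t id)
{
    return sketch_cmpset_rel(&l->owner, id, 0);
}

/* Single-threaded walk-through; returns 0 if every step behaved as
 * described, otherwise the number of the step that failed. */
static int
sketch_selftest(void)
{
    sketch_lock l;

    atomic_init(&l.owner, 0);
    if (!sketch_trylock(&l, 7))
        return 1;             /* free lock: acquire should succeed */
    if (sketch_trylock(&l, 9))
        return 2;             /* held lock: second acquire should fail */
    if (sketch_unlock(&l, 9))
        return 3;             /* non-owner should not be able to release */
    if (!sketch_unlock(&l, 7))
        return 4;             /* owner releases */
    return atomic_load(&l.owner) == 0 ? 0 : 5;
}
```

The self-test passes on one CPU with any memory order. The acquire and
release orders mainly matter once another processor can observe both
the lock word and the data it guards, which is why a run on an SMP box
would be the more telling test of the diff.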