From: David Greenman
Reply-To: davidg@Root.COM
To: Torbjorn Granlund
cc: freebsd-hackers@freebsd.org
Subject: Re: Pentium bcopy
Date: Sat, 23 Dec 1995 20:34:18 -0800
Message-Id: <199512240434.UAA00489@corbin.Root.COM>
In-reply-to: Your message of "Sat, 23 Dec 95 19:57:23 PST." <199512240357.TAA00460@corbin.Root.COM>

>>   Anyway, your optimization looks interesting and I do intend to try it out.
>>Thanks for your efforts and please don't get too discouraged.
>
>   I need to do some more testing, but a quick test shows that for copying
>page-sized amounts, it's about 5% faster than bcopy on a 150MHz P6 (Orion)
>and about 25% faster on a 90MHz Pentium (Triton, PB cache).
>   ...not 5 times faster, but definitely an improvement. Thanks!

   Woops! ...that test was a bit too quick. Okay, so I was off by a factor
of 12288 :-) (SIZE was 4096 in the test, which would have the copysize at 16K
bytes). For 4096 bytes (1024 longwords):

150MHz P6:
[corbin:davidg] time ./copytest
copy 1170
bcopy 1675
2.847u 0.007s 0:02.86 99.3% 37+204k 0+0io 0pf+0w

90MHz P5:
[implode:davidg] time ./copytest
copy 1836
bcopy 5204
7.041u 0.007s 0:07.05 99.8% 36+205k 0+0io 0pf+0w

   So for the P6 it's about 1.4 times as fast as bcopy (roughly 40% faster),
and for the P5 it's about 2.8 times as fast.
The good numbers require that the thing being copied fits in the L1 cache, so
it will be interesting to see how much it improves more 'real world' sorts of
things (like paging performance and filesystem cache reads).
   Anyway, thanks again for the code.

-DG
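
The comparison in the copytest output above comes down to a ratio of the two figures for each machine. What follows is a minimal sketch, assuming the "copy" and "bcopy" numbers are elapsed times (lower is better). The message does not say so outright, but 1170 + 1675 = 2845 and 1836 + 5204 = 7040 sit close to the reported user times of 2.847 and 7.041 seconds, which suggests milliseconds. The names speedup, percent_faster and report are hypothetical helpers for illustration and are not part of copytest.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Ratio of the bcopy figure to the new copy figure.
 * Assumption: both figures are elapsed times, so a ratio above 1.0
 * means the new copy routine finished sooner than bcopy.
 */
double speedup(double copy_time, double bcopy_time)
{
    assert(copy_time > 0.0);    /* avoid dividing by zero */
    return bcopy_time / copy_time;
}

/* The same comparison expressed as "percent faster": (ratio - 1) * 100. */
double percent_faster(double copy_time, double bcopy_time)
{
    return (speedup(copy_time, bcopy_time) - 1.0) * 100.0;
}

/* Print one machine's result in both forms. */
void report(const char *machine, double copy_time, double bcopy_time)
{
    printf("%s: %.2fx as fast as bcopy, %.0f%% faster\n",
           machine,
           speedup(copy_time, bcopy_time),
           percent_faster(copy_time, bcopy_time));
}
```

Calling report("150MHz P6", 1170, 1675) and report("90MHz P5", 1836, 5204) gives ratios of about 1.43 and 2.83. Quoting the ratio avoids the ambiguity between "N times as fast" and "N percent faster".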