From: "Amancio Hasty Jr."
To: freebsd-hackers@freebsd.org
Date: Tue, 07 Nov 1995 02:56:06 -0800
Message-Id: <199511071056.CAA02766@rah.star-gate.com>
Sender: owner-hackers@freebsd.org
X-Mailer: exmh version 1.6.2 7/18/95

Forwarded message:

From: "Amancio Hasty, Jr."
To: freebsd-hackers@freebsd.org
Date: Tue, 07 Nov 95 02:52:13 -0800
Subject: Re: A question about fast copying with a Pentium processor
Message-Id: <199511071052.CAA02714@rah.star-gate.com>
X-Mailer: Mozilla 1.1N (X11; I; FreeBSD 2.1-STABLE i386)
X-URL: news:47lm63$6j0@ixnews3.ix.netcom.com

Should we start using floating point? 8)

That's a joke; however, I do think that some of you may find this
interesting...

	Cheers,
	Amancio

mschmit@ix.netcom.com (Mike Schmit) wrote:
>In <1995Nov5.235249.8471@nmt.edu> borchers@nmt.edu (Brian Borchers) writes:
>>
>>I've got a question about coding for speed on the Pentium that has me
>>somewhat baffled.  Consider the problem of copying a large number of
>>double precision numbers from one array to another.  Here's C code
>>for the operation:
>>
>>      for (i=0; i<=SIZE-1; i++)
>>        {
>>          b[i]=a[i];
>>        };
>>
>>
>>Using the Gnu C Compiler version 2.6.3 (I know, I should move up to the
>>latest version, but that has nothing to do with my question) we get
>>the following code for this loop:
>>
>>L20:
>>        movl (%ebx),%eax
>>        movl 4(%ebx),%edx
>>        movl %eax,(%ecx)
>>        movl %edx,4(%ecx)
>>        addl $8,%ecx
>>        addl $8,%ebx
>>        cmpl %edi,%ecx
>>        jle L20
>>
>>When I run the code on fairly large arrays, I find that my system can copy
>>about 30 Megabytes per second on arrays of four megabytes or so.
>>
>>I then rewrite the loop as follows:
>>
>>L20:
>>        fldl (%ebx)
>>        fstpl (%ecx)
>>        addl $8,%ecx
>>        addl $8,%ebx
>>        cmpl %edi,%ecx
>>        jle L20
>>
>>The resulting program copies data at about 60 Megabytes per second.
>>
>>Thinking about it, I came to the conclusion that both versions of the
>>code should probably be most limited by memory bandwidth.  However, I
>>expect that both codes should be using exactly the same memory
>>bandwidth.
>>
>>Looking at "Optimizations for Intel's 32-Bit Processors", Version 2.0,
>>I see that on page 25, an approach like that used by gcc is suggested
>>as being twice as fast as the other approach, while in practice, it
>>seems to be twice as slow.
>>
>>Questions:
>>
>>     - Why is the first version of the code not as fast as the second?
>>
>>     - Why isn't the second version faster than the first (as indicated
>>       by "Optimizations for Intel's 32-Bit Processors")
>
>  (Did you mean first version?)
>
>>
>>     - What's going on here?
>>
>
>I'm not sure why the Intel book says what it does.
>But the reason you are getting a faster copy is that the FP load and
>store instructions are reading and writing memory 8 bytes at a time
>(and presumably these have been properly aligned).  The other integer
>code is just copying 4 bytes at a time.
>
>Mike Schmit
>
>-------------------------------------------------------------------
>mschmit@ix.netcom.com                    author:
>408-244-6826                Pentium Processor Programming Tools
>800-765-8086                      ISBN: 0-12-627230-1
>-------------------------------------------------------------------
>

news:47lm63$6j0@ixnews3.ix.netcom.com

-- 
Amancio Hasty
Hasty Software Consulting Services
Tel: 415-495-3046
Fax: 415-495-3046
Cellular: 415-309-8434
e-mail: hasty@star-gate.com
Powered by FreeBSD

------- End of Forwarded Message
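
For anyone who wants to try the comparison from the quoted post, here is
a rough C sketch of the two copy strategies: one moves each double as two
32-bit words (like the movl loop), the other moves it as a single 8-byte
element (like the fldl/fstpl loop). The function names copy_as_words,
copy_as_doubles, and copy_rate_mb_per_s are made up for illustration and
do not come from the original thread. It also assumes an 8-byte double.
Note that an optimizing compiler is free to emit identical instructions
for both loops, so to reproduce the 1995 result you would probably need
to inspect the generated assembly or write the loops by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Copy n doubles as pairs of 32-bit words, mirroring the movl-based loop.
 * memcpy is used for the word access to stay within C aliasing rules. */
void copy_as_words(double *dst, const double *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    assert(sizeof(double) == 2 * sizeof(uint32_t)); /* assumption: 8-byte double */
    for (size_t i = 0; i < n; i++) {
        uint32_t w[2];
        memcpy(w, &src[i], sizeof w);   /* conceptually two 4-byte loads  */
        memcpy(&dst[i], w, sizeof w);   /* conceptually two 4-byte stores */
    }
}

/* Copy n doubles one 8-byte element at a time, mirroring the fldl/fstpl loop. */
void copy_as_doubles(double *dst, const double *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Run the given copy routine reps times over n doubles and return the
 * rate in megabytes per second, or 0 if the run was too short to time. */
double copy_rate_mb_per_s(void (*copy)(double *, const double *, size_t),
                          double *dst, const double *src, size_t n, int reps)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        copy(dst, src, n);
    clock_t end = clock();

    double secs = (double)(end - start) / CLOCKS_PER_SEC;
    double bytes = (double)n * sizeof(double) * (double)reps;
    return secs > 0.0 ? bytes / secs / 1e6 : 0.0;
}
```

To compare the two, allocate two arrays of about four megabytes each (as
in the original post), call copy_rate_mb_per_s once with each routine, and
print the results. Use enough repetitions that clock() can resolve the
elapsed time.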