To: hackers@freebsd.org
Subject: PPRO optimizations?
Date: Sun, 10 Nov 1996 12:48:45 -0800
From: Amancio Hasty
Message-Id: <199611102048.MAA05053@rah.star-gate.com>

Tnks,
Amancio

Subject: PPro optimiz. (multiple accumulators)
Date: Sun, 10 Nov 1996 11:56:34 -0700
From: "H.W. Stockman"
Organization: Southwest Cyberport
Newsgroups: comp.lang.asm.x86, comp.sys.intel

I heard that Intel has a document with examples of "multiple
accumulator" programming for the PPro. One example is supposedly
like saxpy() or daxpy(), and ostensibly the speedup for the
multi-accumulator approach was something like 150 MFLOPs vs.
60 MFLOPs.

Has anyone seen this example, and could you tell me where I might
get the document? I've seen Terje Mathisen's excellent sqrt2(1/x)
example, and am looking to increase my bag of tricks...
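
For readers unfamiliar with the term, here is a rough sketch of what "multiple accumulator" programming usually means. This is not the example from the Intel document asked about above, which is not reproduced here. It is an illustrative dot product in C, and the names dot1 and dot4 and the choice of four accumulators are assumptions made for this sketch. The idea is that a single running sum forms one long dependency chain of floating-point adds, while several independent partial sums let a pipelined FPU overlap them.

```c
#include <stddef.h>

/* Baseline: one accumulator. Every add depends on the previous add,
 * so the loop is limited by floating-point add latency. */
double dot1(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Hypothetical multi-accumulator variant: four independent partial sums.
 * The four adds in each iteration do not depend on one another, so an
 * out-of-order or pipelined core is free to overlap them. */
double dot4(const double *x, const double *y, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop: process four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }

    /* Tail: handle the remaining 0 to 3 elements. */
    for (; i < n; i++)
        s0 += x[i] * y[i];

    /* Combine the partial sums once, at the end. */
    return (s0 + s1) + (s2 + s3);
}
```

Splitting the sum changes the order of the floating-point additions, so dot4 can differ from dot1 in the last bits of the result. That is also why a compiler will not normally make this transformation on its own under strict IEEE semantics. Whether it helps, and by how much, depends on the CPU and compiler, so any speedup should be measured rather than assumed.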