From owner-freebsd-hackers Sun Jun 2 14:27:22 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id OAA11980 for hackers-outgoing; Sun, 2 Jun 1996 14:27:22 -0700 (PDT)
Received: from covina.lightside.com (covina.lightside.com [198.81.209.1]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id OAA11975 for ; Sun, 2 Jun 1996 14:27:19 -0700 (PDT)
Received: by covina.lightside.com (Smail3.1.28.1 #6) id m0uQKg5-0004KrC; Sun, 2 Jun 96 14:27 PDT
Date: Sun, 2 Jun 1996 14:27:12 -0700 (PDT)
From: Jake Hamby
To: Bruce Evans
cc: bde@zeta.org.au, freebsd-hackers@FreeBSD.org, mrm@MARMOT.Mole.ORG, mrm@MARMOT.Mole.ORG
Subject: Re: TARGET_NO_FANCY_MATH_387
In-Reply-To: <199605311732.DAA13967@godzilla.zeta.org.au>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-hackers@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

On Sat, 1 Jun 1996, Bruce Evans wrote:

> Making a new libm with HAVE_FPU set in /etc/make.conf should be sufficient.
> This isn't the default since it has the same problems as -mfancy-math-i387.
>
> User times for 1e6 fsqrt(2.0)'s on a P133:
>
> default libm (shared):          11.65 seconds
> HAVE_FPU libm (shared):          1.18
> HAVE_FPU libm (static):          1.11
> -mfancy-math-387:                0.68
> home made inline fsqrt:          0.64    # (1)
> -mfancy-math-387 -ffast-math:    0.07    # (2)
>
> (1) Another reason for gcc not to inline things is that it's easy to write
>     your own inline functions.
> (2) fsqrt(2.0) is recognized as a loop invariant and only calculated once.
>     The time is just for counting to 1e6.

Those performance gains are substantial! Although I don't do anything
math-intensive with FreeBSD, I still wish I had known about HAVE_FPU
earlier. This _really_ needs to be a FAQ or even mentioned during
sysinstall.

Therefore, I suggest we build two different versions of libm, and have
sysinstall link to the correct one depending on whether the user has a
math coprocessor or not (we could ask, or possibly probe the system for
this information). Eventually, we can do a similar thing with
Pentium-optimized versions of this and libc (bcopy, bzero, etc).

Comments?

---Jake