From owner-cvs-all@FreeBSD.ORG  Tue Feb 19 15:30:59 2008
Return-Path: <owner-cvs-all@FreeBSD.ORG>
Delivered-To: cvs-all@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0B68F16A417;
	Tue, 19 Feb 2008 15:30:59 +0000 (UTC) (envelope-from bde@FreeBSD.org)
Received: from repoman.freebsd.org (repoman.freebsd.org
	[IPv6:2001:4f8:fff6::29])
	by mx1.freebsd.org (Postfix) with ESMTP id EBFCB13C46B;
	Tue, 19 Feb 2008 15:30:58 +0000 (UTC) (envelope-from bde@FreeBSD.org)
Received: from repoman.freebsd.org (localhost [127.0.0.1])
	by repoman.freebsd.org (8.14.1/8.14.1) with ESMTP id m1JFUwGi059424;
	Tue, 19 Feb 2008 15:30:58 GMT (envelope-from bde@repoman.freebsd.org)
Received: (from bde@localhost)
	by repoman.freebsd.org (8.14.1/8.14.1/Submit) id m1JFUwJe059423;
	Tue, 19 Feb 2008 15:30:58 GMT (envelope-from bde)
Message-Id: <200802191530.m1JFUwJe059423@repoman.freebsd.org>
From: Bruce Evans <bde@FreeBSD.org>
Date: Tue, 19 Feb 2008 15:30:58 +0000 (UTC)
To: src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org
X-FreeBSD-CVS-Branch: HEAD
Cc: 
Subject: cvs commit: src/lib/msun/src e_rem_pio2.c
X-BeenThere: cvs-all@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: CVS commit messages for the entire tree <cvs-all.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-all>,
	<mailto:cvs-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/cvs-all>
List-Post: <mailto:cvs-all@freebsd.org>
List-Help: <mailto:cvs-all-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-all>,
	<mailto:cvs-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 19 Feb 2008 15:30:59 -0000

bde         2008-02-19 15:30:58 UTC

  FreeBSD src repository

  Modified files:
    lib/msun/src         e_rem_pio2.c 
  Log:
  Optimize for 3pi/4 <= |x| <= 9pi/4 in much the same way as for
  pi/4 <= |x| <= 3pi/4.  Use the same branch ladder as for float precision.
  Remove the optimization for |x| near pi/2 and don't do it near the
  multiples of pi/2 in the newly optimized range, since it requires
  fairly large code to handle only relatively few cases.  Ifdef out the
  optimization for |x| <= pi/4 since this case can't occur because it
  is done in callers.
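
  A minimal sketch of the kind of branch ladder described above, not the
  committed code.  It assumes plain double comparisons via fabs() where
  the real file may well compare words of the representation instead.
  The names reduce_small, PIO4, PIO2_HI and PIO2_LO are hypothetical;
  the two-part split of pi/2 uses the usual fdlibm-style constant values.
  Like the change itself, it makes no special effort near multiples of
  pi/2, where cancellation costs accuracy and a slower general path
  would be needed.

```c
#include <math.h>

/* Hypothetical names; constant values follow the usual fdlibm-style split. */
static const double PIO4    = 0.78539816339744830962;     /* pi/4, rounded */
static const double PIO2_HI = 1.57079632673412561417e+00; /* leading 33 bits of pi/2 */
static const double PIO2_LO = 6.07710050650619224932e-11; /* pi/2 - PIO2_HI */

/*
 * Reduce x with |x| <= 9pi/4 to *r = x - (*n)*pi/2 using a short branch
 * ladder.  Returns 1 if the argument was handled, 0 if it is out of range
 * and the caller must use a general (slower) reduction.
 */
static int
reduce_small(double x, double *r, int *n)
{
	double ax = fabs(x);
	int k;

	if (ax <= PIO4) {		/* normally handled by callers */
		*r = x;
		*n = 0;
		return (1);
	}
	if (ax <= 5 * PIO4)		/* pi/4 < |x| <= 5pi/4 */
		k = (ax <= 3 * PIO4) ? 1 : 2;
	else if (ax <= 9 * PIO4)	/* 5pi/4 < |x| <= 9pi/4 */
		k = (ax <= 7 * PIO4) ? 3 : 4;
	else
		return (0);

	/* k*PIO2_HI is exact for small k, so subtract pi/2 in two steps. */
	if (x > 0) {
		*r = (x - k * PIO2_HI) - k * PIO2_LO;
		*n = k;
	} else {
		*r = (x + k * PIO2_HI) + k * PIO2_LO;
		*n = -k;
	}
	return (1);
}
```

  In this sketch a caller would select the sin or cos kernel and its sign
  from the low two bits of *n and evaluate it at *r.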
  
  On amd64 (A64), for cos() and sin() with uniformly distributed args,
  no cache misses, some parallelism in the caller, and good but not great
  CC and CFLAGS, etc., this saves about 40 cycles or 38% in the newly
  optimized range, or about 27% on average across the range |x| <= 2pi
  (~65 cycles for most args, while the A64 hardware fcos and fsin take
  ~75 cycles for half the args and 125 cycles for the other half).  The
  speedup for tan() is much smaller, especially relatively.  The speedup
  on i386 (A64) is slightly smaller, especially relatively.  i386 is
  still much slower than amd64 here (unlike in the float case where it
  is slightly faster).
  
  Revision  Changes    Path
  1.11      +56 -18    src/lib/msun/src/e_rem_pio2.c