From: Bruce Evans
Date: Fri, 22 Feb 2008 15:55:15 +0000 (UTC)
To: src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org
Subject: cvs commit: src/lib/msun/src e_rem_pio2.c e_rem_pio2f.c
Message-Id: <200802221555.m1MFtFkE016654@repoman.freebsd.org>
X-FreeBSD-CVS-Branch: HEAD

bde         2008-02-22 15:55:15 UTC

  FreeBSD src repository

  Modified files:
    lib/msun/src         e_rem_pio2.c e_rem_pio2f.c
  Log:
  Optimize the 9pi/2 < |x| <= 2**19pi/2 case on amd64 and i386 by avoiding
  the double to int conversion operation, which is very slow on these
  arches.  Assume that the current rounding mode is the default of
  round-to-nearest and use rounding operations in this mode instead of
  faking this mode using the round-towards-zero mode for conversion to
  int.  Round the double to an integer as a double first and as an int
  second since the double result is needed much earlier.
  Double rounding isn't a problem since we only need a rough approximation.
  Other rounding modes were never supported; we now produce much larger
  errors than before if called in a non-default mode.

  This saves an average of about 10 cycles on amd64 (A64) and about 25 on
  i386 (A64) for x in the above range.  In some cases the saving is over
  25%.  Most cases with |x| < 1000pi now take about 88 cycles for cos and
  sin (with certain CFLAGS, etc.), except on i386 where cos and sin (but
  not cosf and sinf) are much slower at 111 and 121 cycles respectively,
  due to the compiler only optimizing well for float precision.  A64
  hardware cos and sin are slower at 105 cycles on i386 and 110 cycles on
  amd64.

  Revision  Changes    Path
  1.12      +9 -0      src/lib/msun/src/e_rem_pio2.c
  1.22      +9 -0      src/lib/msun/src/e_rem_pio2f.c
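
The "round to an integer as a double first, as an int second" idea in the log can be sketched in C as below. This is a minimal illustration, not the committed code: it uses the common trick of adding and then subtracting 1.5 * 2**52, which yields the nearest integer only when the current rounding mode is round-to-nearest. The names round_nearest_as_double, quadrant_count, inv_pio2 and big are hypothetical and chosen for this sketch only.

```c
#include <assert.h>
#include <stdint.h>

/* 2/pi rounded to double (hypothetical name, for this sketch only). */
static const double inv_pio2 = 6.36619772367581382433e-01;

/* 1.5 * 2**52.  For |v| < 2**51 the sum v + big lies in [2**52, 2**53),
 * where adjacent doubles are exactly 1 apart, so the addition itself
 * rounds v to an integer in the current rounding mode. */
static const double big = 0x1.8p52;

/* Round v to the nearest integer, returned as a double.
 * Assumes the default round-to-nearest mode; in any other mode the
 * result follows that mode instead. */
static double round_nearest_as_double(double v)
{
    assert(v > -0x1p51 && v < 0x1p51);  /* trick is only valid in this range */
    /* volatile forces the sum to be stored as a double, discarding any
     * extra precision (e.g. x87 registers on i386). */
    volatile double t = v + big;
    return t - big;
}

/* Return the multiple of pi/2 nearest to x as an int, and also as a
 * double through fn_out, since the double form is what the reduction
 * arithmetic needs first. */
static int32_t quadrant_count(double x, double *fn_out)
{
    double fn = round_nearest_as_double(x * inv_pio2);
    *fn_out = fn;
    return (int32_t)fn;  /* fn is already integral, so this is exact */
}
```

Because fn is integral before the final cast, the truncating double to int conversion cannot change the value; the expensive conversion is only moved off the critical path, which is the point the log makes about needing the double result earlier.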