From owner-cvs-src@FreeBSD.ORG Tue Dec 13 20:17:24 2005 Return-Path: X-Original-To: cvs-src@FreeBSD.org Delivered-To: cvs-src@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 93B0416A420; Tue, 13 Dec 2005 20:17:24 +0000 (GMT) (envelope-from bde@FreeBSD.org) Received: from repoman.freebsd.org (repoman.freebsd.org [216.136.204.115]) by mx1.FreeBSD.org (Postfix) with ESMTP id 62A7043D46; Tue, 13 Dec 2005 20:17:24 +0000 (GMT) (envelope-from bde@FreeBSD.org) Received: from repoman.freebsd.org (localhost [127.0.0.1]) by repoman.freebsd.org (8.13.1/8.13.1) with ESMTP id jBDKHOOi077271; Tue, 13 Dec 2005 20:17:24 GMT (envelope-from bde@repoman.freebsd.org) Received: (from bde@localhost) by repoman.freebsd.org (8.13.1/8.13.1/Submit) id jBDKHOG5077270; Tue, 13 Dec 2005 20:17:24 GMT (envelope-from bde) Message-Id: <200512132017.jBDKHOG5077270@repoman.freebsd.org> From: Bruce Evans Date: Tue, 13 Dec 2005 20:17:24 +0000 (UTC) To: src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org X-FreeBSD-CVS-Branch: HEAD Cc: Subject: cvs commit: src/lib/msun/src s_cbrt.c s_cbrtf.c X-BeenThere: cvs-src@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: CVS commit messages for the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Dec 2005 20:17:24 -0000 bde 2005-12-13 20:17:24 UTC FreeBSD src repository Modified files: lib/msun/src s_cbrt.c s_cbrtf.c Log: Optimize by not doing excessive conversions for handling the sign bit. This gives an optimization of between 9 and 22% on Athlons (largest for cbrt() on amd64 -- from 205 to 159 cycles). We extracted the sign bit and worked with |x|, and restored the sign bit as the last step. We avoided branches to a fault by using accesses to FP values as bits to clear and restore the sign bit. Avoiding branches is usually good, but the bit access macros are not so good (especially for setting FP values), and here they always caused pipeline stalls on Athlons. Even using branches would be faster except on args that give perfect branch misprediction, since only mispredicted branches cause stalls, but it possible to avoid touching the sign bit in FP values at all (except to preserve it in conversions from bits to FP not related to the sign bit). Do this. The results are identical except in 2 of the 3 unsupported rounding modes, since all the approximations use odd rational functions so they work right on strictly negative values, and the special case of -0 doesn't use an approximation. Revision Changes Path 1.10 +5 -7 src/lib/msun/src/s_cbrt.c 1.12 +4 -8 src/lib/msun/src/s_cbrtf.c