From: "Sean C. Farley"
To: Bruce Evans
Cc: freebsd-arch@FreeBSD.org
Date: Fri, 13 Jul 2007 09:18:50 -0500 (CDT)
Subject: Re: Assembly string functions in i386 libc
Message-ID: <20070713085330.H21970@thor.farley.org>
In-Reply-To: <20070713135453.H8054@delplex.bde.org>

On Fri, 13 Jul 2007, Bruce Evans wrote:
> On Thu, 12 Jul 2007, Sean C. Farley wrote:
>
>> On Thu, 12 Jul 2007, Bruce Evans wrote:
>
>>> Now I've looked at it.
>>> I think it is not testing strlen() at all,
>>> except for the libc case, because __pure prevents more than 1 call
>>> to strlen(). (The existence of __pure is also a bug. __pure was
>>> the FreeBSD spelling of the __const__ attribute in gcc-1. It was
>>> removed when special support for gcc-1 was dropped, and should not
>>> have been recycled.) __pure is a syntax error in the old version of
>>> FreeBSD that I tested on. I first tried __pure2, which is the
>>> FreeBSD spelling of the __const__ attribute in gcc-2. I think it is
>>> weaker than the __pure__ attribute in gcc-3.
>>
>> From what I could find, strlen() should not have the __const__
>> (__pure2) attribute since it is being passed a pointer, but __pure__
>> (__pure) should work. Are you saying that __pure used to mean
>> __const__ in gcc-1 but now it means __pure__ for gcc-2.96 and above?
>> The redefinition of __pure is what you are saying is a bug. Yes?
>
> Yes to most of this. __pure2 is actually weaker than __pure[>2.96].
> __pure2 has the very large effect of removing all calls to strlen()
> from the loop. This affected everything except libc strlen() since
> everything else was named xstrlen() and declared as __pure*, while
> libc strlen() was declared in <string.h> without __pure*.

Actually, the reason I had __pure in main.c was that it exists in
string.h.

> OTOH, __pure[>2.96] has no effect on this benchmark, at least with
> gcc-3.3.3. I don't understand why it has no effect. It has no effect
> even when I change the arg to a literal. The context is very simple,
> with no aliasing problems in sight, at least with the literal arg
> (with the arg possibly being argv[2], maybe gcc has to worry about the
> arg being modified by a signal handler). If __pure[>2.96] doesn't
> work in this simple context, then it isn't clear when it can work.

Using or not using __pure with gcc-3.4.6 has no effect for me, even with
the literal argument and regardless of optimization level (-O0, -O1, or
-O2).
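
To make the __pure versus __pure2 distinction concrete, here is a minimal
sketch, assuming a GCC-compatible compiler. X_PURE, X_CONST, round_up8(),
and sum_lengths() are hypothetical names made up for illustration, and
this xstrlen() is a plain byte loop, not one of the implementations from
the benchmark's main.c. FreeBSD's own spellings live in <sys/cdefs.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro names for this sketch only. */
#if defined(__GNUC__)
#define X_PURE  __attribute__((__pure__))   /* may read memory, no side effects */
#define X_CONST __attribute__((__const__))  /* may look only at argument values */
#else
#define X_PURE
#define X_CONST
#endif

/* Reads the bytes that s points to, so __pure__ is the strongest safe claim. */
size_t xstrlen(const char *s) X_PURE;

/* Depends only on the value of n, so __const__ is legitimate here. */
size_t round_up8(size_t n) X_CONST;

size_t
xstrlen(const char *s)
{
    const char *p = s;

    while (*p != '\0')
        p++;
    return ((size_t)(p - s));
}

size_t
round_up8(size_t n)
{
    return ((n + 7) & ~(size_t)7);
}

/*
 * Benchmark-style loop.  Because xstrlen() is declared pure and nothing
 * in the loop writes to memory, the compiler is permitted (not required)
 * to call it once and reuse the result.
 */
size_t
sum_lengths(const char *s, unsigned long n)
{
    size_t total = 0;
    unsigned long i;

    assert(s != NULL);
    for (i = 0; i < n; i++)
        total += xstrlen(s);
    return (total);
}
```

Whether a given gcc version actually merges the calls in sum_lengths() is
exactly what is in question above, so the generated assembly is the only
reliable check.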
> BTW, starting somewhere near gcc-3.4 for -O2 and gcc-4.2 for -O,
> simple loops like this don't always work in benchmarks, because the
> compiler removes the whole loop if it can see that it doesn't do
> anything. The compiler can see this if it can see inside any function
> calls in the loop (this currently requires the functions to be in the
> same source file or #included there), or if the functions are declared
> as sufficiently __pure. When I used __pure2 with gcc-3.3.3 -O, gcc
> removed the function calls but not the loop. gcc-4.2 would also
> remove the loop.

Interesting. I need to remember this. Just to note, __pure2 is not valid
with strlen() since it examines data passed via a pointer, according to
the GCC docs.

> ...[A64 in 32-bit mode similar to AXP]

BTW, does AXP refer to Athlon XP or Alpha AXP? When I first saw you
write AXP, I thought it was an Alpha. :)

>> ...[asm version more than twice as slow on P3-P4]
>
>> The Athlon XP did much better with the assembly version than either
>> Intel CPU for me. For all three CPU's using various string lengths
>> from 1 to 256, the C versions always beat the assembly version
>> although it came somewhat close for the 9 to 32 byte lengths to
>> basestrlen.
>
> Intel CPUs are remarkably different from AXP :-). I'm surprised at
> the sign of the difference here -- I would have expected them to be
> better for the string instructions.

That is what has been confusing me. Possibly, Intel has not touched the
basics of these string instructions for a longer time than AMD.

Sean
-- 
scf@FreeBSD.org
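
For the loop-removal problem Bruce describes, one common workaround is to
give every iteration an observable effect. The sketch below is an
assumption on my part, not something the benchmark's main.c is known to
do: bench_strlen() and sink are hypothetical names, and the empty asm
statement is a GCC extension that is skipped on other compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stores to a volatile object are observable, so they cannot be dropped. */
static volatile size_t sink;

size_t
bench_strlen(const char *s, unsigned long iterations)
{
    size_t total = 0;
    size_t len;
    unsigned long i;

    assert(s != NULL);
    for (i = 0; i < iterations; i++) {
#if defined(__GNUC__)
        /*
         * Assumption: GCC-style inline asm.  The empty statement with a
         * "memory" clobber tells the compiler that memory may have
         * changed, which discourages hoisting strlen(s) out of the loop.
         */
        __asm__ __volatile__("" : : "r"(s) : "memory");
#endif
        len = strlen(s);
        sink = len;     /* one volatile store per iteration */
        total += len;
    }
    return (total);
}
```

This does not guarantee one real call per iteration on every compiler
(a literal argument may still be folded at compile time), so checking
the assembly output remains worthwhile.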