From: Anders Gavare
Date: Thu, 30 Jan 2003 06:13:20 +0100 (MET)
To: freebsd-alpha@FreeBSD.ORG
Subject: Re: faster strlen() using longs (?)

Regarding the subject: has anyone looked further into this, or is it a
nonsense optimization?

My previous attempt gave me a speedup factor of 2.8. Loading 8 longs
instead of only one at a time, and using data prefetch (... which might
be a bit too implementation specific to work on all alphas), results in
even faster execution for long strings. (A speedup factor of 4.35, on my
machine using the best compiler+options.)
Anyway, here's what it looks like:

size_t my_strlen4(const char *databuf)
{
	long d0, d1, d2, d3, d4, d5, d6, d7;
	const long *lp;
	size_t len = 0;

	/* Scan byte by byte until databuf is long-aligned. */
	while (((size_t) databuf) & (sizeof(long)-1)) {
		if (!*databuf++)
			return len;
		len++;
	}

	lp = (const long *) databuf;

/* Leave the enclosing loop if any of the 8 bytes in X is zero. */
#define testforzeroes(X) if ( (((X) & 0xff) == 0) || \
		(((X) & 0xff00) == 0) || \
		(((X) & 0xff0000) == 0) || \
		(((X) & 0xff000000) == 0) || \
		(((X) & 0xff00000000) == 0) || \
		(((X) & 0xff0000000000) == 0) || \
		(((X) & 0xff000000000000) == 0) || \
		(((X) & 0xff00000000000000) == 0) \
	    ) \
		break;

/*
 * Return the string length if X contains the terminating zero byte.
 * The lowest byte is the first one in memory (little-endian, as on Alpha).
 */
#define returnlength(X) if (!((X) & 0xff)) return len; \
	if (!((X) & 0xff00)) return len + 1; \
	if (!((X) & 0xff0000)) return len + 2; \
	if (!((X) & 0xff000000)) return len + 3; \
	if (!((X) & 0xff00000000)) return len + 4; \
	if (!((X) & 0xff0000000000)) return len + 5; \
	if (!((X) & 0xff000000000000)) return len + 6; \
	if (!((X) & 0xff00000000000000)) return len + 7;

	/* Scan one long at a time until lp is aligned to 8 longs. */
	while (((size_t) lp) & (sizeof(long)*8-1)) {
		d0 = *lp++;
		returnlength(d0);
		len += sizeof(long);
	}

	for (;;) {
		d0 = lp[0]; d1 = lp[1]; d2 = lp[2]; d3 = lp[3];
		d4 = lp[4]; d5 = lp[5]; d6 = lp[6]; d7 = lp[7];
		lp += 8;

		/* Prefetch next octo-word: */
		asm ("ldl $31,0(%0)" : : "r" (lp));

		testforzeroes(d0);
		testforzeroes(d1);
		testforzeroes(d2);
		testforzeroes(d3);
		testforzeroes(d4);
		testforzeroes(d5);
		testforzeroes(d6);
		testforzeroes(d7);

		len += (sizeof(long)*8);
	}

	/* One of d0..d7 holds the terminator; len is the offset of d0. */
	returnlength(d0); len += 8;
	returnlength(d1); len += 8;
	returnlength(d2); len += 8;
	returnlength(d3); len += 8;
	returnlength(d4); len += 8;
	returnlength(d5); len += 8;
	returnlength(d6); len += 8;
	returnlength(d7);

	/* Not reached. */
	return 0;
}

I'm a bit tired, so there are probably a number of bugs in there.

Anders
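
For comparison, the eight mask tests in testforzeroes can be collapsed
into one expression with the well-known "subtract ones, mask high bits"
trick. What follows is only a minimal sketch of that idea, not the code
posted above: it assumes a 64-bit word (uint64_t instead of long), leaves
out the 8-way unrolling and the prefetch, and the names ONES, HIGHS,
has_zero_byte and word_strlen are made up for illustration. Like the
posted code, it reads whole aligned words and so may touch bytes past the
terminator, which strictly speaking is outside what ISO C guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/* Nonzero iff at least one of the 8 bytes in w is zero. */
static int has_zero_byte(uint64_t w)
{
	return ((w - ONES) & ~w & HIGHS) != 0;
}

/* Hypothetical word-at-a-time strlen; assumes aligned over-reads are safe. */
size_t word_strlen(const char *s)
{
	const char *p = s;
	uint64_t w;

	assert(s != NULL);

	/* Scan byte by byte until p is 8-byte aligned. */
	while ((uintptr_t) p & (sizeof(w) - 1)) {
		if (*p == '\0')
			return (size_t) (p - s);
		p++;
	}

	/* Scan one aligned word at a time until a word contains a zero byte. */
	for (;;) {
		memcpy(&w, p, sizeof(w));
		if (has_zero_byte(w))
			break;
		p += sizeof(w);
	}

	/* Locate the terminator inside that word (independent of byte order). */
	while (*p != '\0')
		p++;
	return (size_t) (p - s);
}
```

Finding the exact byte with a short char loop at the end avoids any
dependence on byte order, at the cost of up to seven extra byte loads per
call. Whether this beats the unrolled version on a given Alpha would have
to be measured.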