From: Anders Gavare
Date: Sun, 29 Dec 2002 00:25:49 +0100 (MET)
To: freebsd-alpha@freebsd.org
Subject: faster strlen() using longs (?)

Hi.

I'm using FreeBSD 4.5 on an Alpha, and I noticed that strlen() isn't implemented using words, but using chars.

I've experimented with several different variations of using longs, and this is the fastest one I've come up with. It's a quick hack, I know, and it's a bit hard coded, so it would have to be placed in the alpha-specific part of libc. It is 2.8 times faster than the default strlen() in libc.

/* Assumes 8-byte, little-endian longs (as on Alpha). */
size_t my_strlen3(char *databuf)
{
    long data;
    long *lp;
    size_t len = 0;

    /* Count non-aligned chars: */
    while (((size_t) databuf) & (sizeof(long)-1)) {
        if (!*databuf++)
            return len;
        len++;
    }

    lp = (long *) databuf;

    /* Loop through full 'long' words: */
    for (;;) {
        /* See comment (START) */
        data = *lp++;
        if ( ((data & 0xff) == 0) ||
             ((data & 0xff00) == 0) ||
             ((data & 0xff0000) == 0) ||
             ((data & 0xff000000) == 0) ||
             ((data & 0xff00000000) == 0) ||
             ((data & 0xff0000000000) == 0) ||
             ((data & 0xff000000000000) == 0) ||
             ((data & 0xff00000000000000) == 0) )
            break;
        len += sizeof(long);
        /* See comment (END) */
    }

    /* Return the actual length: */
    if (!(data & 0xff))
        return len;
    if (!(data & 0xff00))
        return len + 1;
    if (!(data & 0xff0000))
        return len + 2;
    if (!(data & 0xff000000))
        return len + 3;

    /* The zero byte is in the upper half of the word: */
    data >>= 32;
    len += 4;

    if (!(data & 0xff))
        return len;
    if (!(data & 0xff00))
        return len + 1;
    if (!(data & 0xff0000))
        return len + 2;
    return len + 3;
}

It is sometimes faster when compiled with -O than with -O3 (!), but this depends on which compiler is used.

The stuff between (START) and (END) can be #defined and then included multiple times. That way, there will be multiple word tests before the jump back to the start of the for loop. This gives a small performance gain with some compilers / compiler options.

If this is not an issue anymore with newer releases of FreeBSD on Alpha, then just ignore this mail :-)

Anders

PS. I'm not on the list, so please CC me if you feel like replying. DS.
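
The unrolling idea described above (putting the code between (START) and (END) into a #define and repeating it) might look roughly like the sketch below. This is an illustration under stated assumptions, not the code that was benchmarked: the names my_strlen_unrolled, TEST_WORD, has_zero_byte and check_against_strlen are hypothetical, the eight hard-coded masks are replaced by a loop over sizeof(unsigned long) so that no word width is baked in, and the final position is found byte by byte so that no byte order is assumed. Like the original, it reads a whole aligned word that may extend past the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if any byte of w is zero.
 * Stands in for the eight explicit mask tests of the original. */
static int has_zero_byte(unsigned long w)
{
    size_t i;

    for (i = 0; i < sizeof w; i++) {
        if (((w >> (8 * i)) & 0xff) == 0)
            return 1;
    }
    return 0;
}

/* The (START)..(END) body as a macro: load one word, leave the loop
 * if it contains a zero byte, otherwise step to the next word. */
#define TEST_WORD()                         \
    do {                                    \
        memcpy(&data, p, sizeof data);      \
        if (has_zero_byte(data))            \
            goto found;                     \
        p += sizeof data;                   \
    } while (0)

size_t my_strlen_unrolled(const char *s)
{
    const char *p = s;
    unsigned long data;

    /* Count non-aligned chars: */
    while ((uintptr_t)p & (sizeof(long) - 1)) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Four word tests per jump back to the top of the loop: */
    for (;;) {
        TEST_WORD();
        TEST_WORD();
        TEST_WORD();
        TEST_WORD();
    }

found:
    /* p points at the word holding the NUL; locate it byte by byte. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Hypothetical self-check: compare against strlen() for every
 * starting alignment and a range of lengths. */
void check_against_strlen(void)
{
    char buf[64];
    size_t off, len;

    for (off = 0; off < sizeof(long); off++) {
        for (len = 0; len < 24; len++) {
            memset(buf, 0, sizeof buf);
            memset(buf, 'y', off);
            memset(buf + off, 'x', len);
            assert(my_strlen_unrolled(buf + off) == strlen(buf + off));
        }
    }
}
```

Whether four repetitions (or any) actually help depends on the compiler and its options, as noted above, so the repeat count is worth measuring rather than assuming.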