Date: Thu, 8 Jan 2009 20:25:57 -0800 From: Alfred Perlstein <alfred@freebsd.org> To: d@delphij.net Cc: freebsd-arch@FreeBSD.org Subject: Re: RFC: MI strlen() Message-ID: <20090109042557.GH60686@elvis.mu.org> In-Reply-To: <4966B5D4.7040709@delphij.net> References: <4966B5D4.7040709@delphij.net>
next in thread | previous in thread | raw e-mail | index | archive | help
* Xin LI <delphij@delphij.net> [090108 18:27] wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, > > Here is a new implementation of strlen() which employed the bitmask > skill in order to achieve better performance on modern hardware. For > common case, this would be a 5.2x boost on FreeBSD/amd64. The code is > intended for MI use when there is no hand-optimized assembly. > > http://people.freebsd.org/~delphij/for_review/strlen.diff > > Note that this version of strlen() has suboptimal performance if there > are a lot of characters that has their highest bit set (we can change it > to have uniform performance at the expense of about ~30% performance > penalty). > > Comments? It's pretty cool, what does valgrind say when you walk past the end of a string, can it figure that out? > > Cheers, > - -- > Xin LI <delphij@delphij.net> http://www.delphij.net/ > FreeBSD - The Power to Serve! > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2.0.9 (FreeBSD) > > iEYEARECAAYFAklmtdQACgkQi+vbBBjt66CltACfY1j+JJnn3PDLOmO1uOpMkMlg > 94gAn3v9eSExj84hcNYEI0AhLGWBCxMj > =xzFJ > -----END PGP SIGNATURE----- > Index: strlen.c > =================================================================== > --- strlen.c (revision 186910) > +++ strlen.c (working copy) > @@ -1,6 +1,6 @@ > /*- > - * Copyright (c) 1990, 1993 > - * The Regents of the University of California. All rights reserved. > + * Copyright (c) 2009 Xin LI <delphij@FreeBSD.org> > + * All rights reserved. > * > * Redistribution and use in source and binary forms, with or without > * modification, are permitted provided that the following conditions > @@ -10,14 +10,11 @@ > * 2. Redistributions in binary form must reproduce the above copyright > * notice, this list of conditions and the following disclaimer in the > * documentation and/or other materials provided with the distribution. > - * 4. Neither the name of the University nor the names of its contributors > - * may be used to endorse or promote products derived from this software > - * without specific prior written permission. > * > - * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND > + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND > * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE > - * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE > + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE > * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL > * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS > * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) > @@ -27,21 +24,93 @@ > * SUCH DAMAGE. > */ > > -#if defined(LIBC_SCCS) && !defined(lint) > -static char sccsid[] = "@(#)strlen.c 8.1 (Berkeley) 6/4/93"; > -#endif /* LIBC_SCCS and not lint */ > #include <sys/cdefs.h> > __FBSDID("$FreeBSD$"); > > +#include <sys/limits.h> > +#include <sys/types.h> > #include <string.h> > > +#ifndef CTASSERT > +#define CTASSERT(x) _CTASSERT(x, __LINE__) > +#define _CTASSERT(x, y) __CTASSERT(x, y) > +#define __CTASSERT(x, y) typedef char __assert_ ## y [(x) ? 1 : -1] > +#endif > + > +CTASSERT(LONG_BIT == 32 || LONG_BIT == 64); > + > +/* > + * Portable strlen() for 32-bit and 64-bit systems. > + * > + * Rationale: it is generally much more efficient to do word length > + * operations and avoid branches on modern computer systems, as > + * compared to byte-length operations with a lot of branches. > + * > + * The expression: > + * > + * ((x - 0x01....01) & ~x & 0x80....80) > + * > + * would evaluate to a non-zero value iff any of the bytes in the > + * original word is zero. However, we can further reduce ~1/3 of > + * time if we consider that strlen() usually operate on 7-bit ASCII > + * by employing the following expression, which allows false positive > + * when high bit of 1 and use the tail case to catch these case: > + * > + * ((x - 0x01....01) & 0x80....80) > + * > + * This is more than 5.2 times as compared to the raw implementation > + * on Intel T7300 under EM64T mode. > + */ > + > +/* Magic numbers for the algorithm */ > +#if LONG_BIT == 32 > +static const unsigned long mask01 = 0x01010101; > +static const unsigned long mask80 = 0x80808080; > +#elif LONG_BIT == 64 > +static const unsigned long mask01 = 0x0101010101010101; > +static const unsigned long mask80 = 0x8080808080808080; > +#endif > + > +#define LONGPTR_MASK (sizeof(long) - 1) > + > +/* > + * Helper macro to return string length if we caught the zero > + * byte. > + */ > +#define testbyte(x) \ > + do { \ > + if (p[x] == '\0') \ > + return (p - str + x); \ > + } while (0); > + > size_t > -strlen(str) > - const char *str; > +strlen(const char *str) > { > - const char *s; > + const char *p; > + const unsigned long *lp; > > - for (s = str; *s; ++s); > - return(s - str); > + /* Skip the first few bytes until we have an aligned p */ > + for (p = str; (uintptr_t)p & LONGPTR_MASK; ++p) > + if (*p == 0) > + return (p - str); > + > + /* Scan the rest of the string using word sized operation */ > + for (lp = (const unsigned long *)p; ; lp++) > + if ((*lp - mask01) & mask80) { > + p = (const char *)(lp); > + testbyte(0); > + testbyte(1); > + testbyte(2); > + testbyte(3); > +#if (LONG_BIT >= 64) > + testbyte(4); > + testbyte(5); > + testbyte(6); > + testbyte(7); > +#endif > + } > + > + /* NOTREACHED */ > + return 0; > } > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" -- - Alfred Perlstein
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20090109042557.GH60686>