From owner-freebsd-arch@FreeBSD.ORG  Mon Jan 19 20:36:09 2009
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@FreeBSD.ORG
Date: Mon, 19 Jan 2009 15:04:02 -0500
From: David Schultz <das@FreeBSD.ORG>
To: d@delphij.net
Message-ID: <20090119200402.GA26878@zim.MIT.EDU>
Mail-Followup-To: d@delphij.net, freebsd-arch@FreeBSD.ORG
References: <4966B5D4.7040709@delphij.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4966B5D4.7040709@delphij.net>
Cc: freebsd-arch@FreeBSD.ORG
Subject: Re: RFC: MI strlen()

On Thu, Jan 08, 2009, Xin LI wrote:
> Here is a new implementation of strlen() which employed the bitmask
> skill in order to achieve better performance on modern hardware.  For
> common case, this would be a 5.2x boost on FreeBSD/amd64.  The code is
> intended for MI use when there is no hand-optimized assembly.

I ran some microbenchmarks on amd64, which show that the version
of strlen() in libc is up to twice as fast as yours for short
strings (< 4 bytes), but your implementation is nearly 5 times as
fast for longer strings.
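
For readers who have not seen the bitmask technique, here is a
minimal sketch of the general idea. It is not the code from the
patch: the name wordwise_strlen() and the choice of 64-bit words
are assumptions made for illustration. It aligns the pointer one
byte at a time, then tests eight bytes per iteration for a zero
byte. The aligned word load may read a few bytes past the
terminator, which is the usual trick but is not strictly portable C.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a sketch of the technique, not the proposed patch. */
size_t
wordwise_strlen(const char *s)
{
	const uint64_t ones = 0x0101010101010101ULL;	/* 0x01 in every byte */
	const uint64_t highs = 0x8080808080808080ULL;	/* 0x80 in every byte */
	const char *p = s;
	uint64_t w;

	/* Byte loop until p is word-aligned, so word loads never cross a page. */
	while (((uintptr_t)p & (sizeof(w) - 1)) != 0) {
		if (*p == '\0')
			return ((size_t)(p - s));
		p++;
	}

	/* (w - ones) & ~w & highs is nonzero iff some byte of w is zero. */
	for (;;) {
		memcpy(&w, p, sizeof(w));	/* aligned load; may overread */
		if (((w - ones) & ~w & highs) != 0)
			break;
		p += sizeof(w);
	}

	/* The terminator is somewhere in this word; find it bytewise. */
	while (*p != '\0')
		p++;
	return ((size_t)(p - s));
}
```

The alignment loop and the final bytewise scan are fixed overhead,
which is one plausible reason a simple byte loop can win on very
short strings.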

As Bruce pointed out, gcc will almost always use its builtin
strlen(). However, that may change in the future, and nobody has
suggested that your version would actually hurt anything, so I
think you should commit it.

Benchmark results:
	  http://www.freebsd.org/~das/strlen.gif

I ran this on a Wolfdale core using word-aligned ASCII strings and
an adaptive number of iterations. As you can see, the gcc builtin
is always slower than your code, but faster than our current libc
implementation. I can't explain why the builtin is faster for
strings of length 10 than it is for strings of length 1, but the
results are repeatable. Another interesting thing to note is that
your implementation is the only one that gets less throughput when
the string no longer fits in the L2 cache. This suggests that
either the other two are so slow that they can't use the full
memory bandwidth, or they are more effective at triggering the
CPU's prefetch heuristics.
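
In case anyone wants to reproduce this, the sketch below shows
what I mean by an adaptive number of iterations. It is not the
harness that produced the graph: measure_throughput(), its
parameters, and the use of clock() are assumptions made for
illustration. The iteration count doubles until one timed run
lasts at least min_seconds, and the call goes through a volatile
function pointer so the compiler cannot substitute its builtin or
hoist the call out of the loop.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

typedef size_t (*strlen_fn)(const char *);

/*
 * Hypothetical helper: returns bytes scanned per second of CPU time
 * for fn on string s, counting the terminating NUL as scanned.
 */
double
measure_throughput(strlen_fn fn, const char *s, double min_seconds)
{
	strlen_fn volatile f = fn;	/* reloaded on every call */
	volatile size_t sink = 0;	/* keeps the results live */
	size_t len = strlen(s);
	unsigned long i, iters = 1;
	clock_t start;
	double elapsed;

	for (;;) {
		start = clock();
		for (i = 0; i < iters; i++)
			sink += f(s);
		elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;

		/* Accept the run only once it is long enough to time. */
		if (elapsed > 0.0 && elapsed >= min_seconds)
			return ((double)(len + 1) * (double)iters / elapsed);
		iters *= 2;
	}
}
```

Something like measure_throughput(strlen, buf, 0.5) for a range of
buffer sizes gives one curve; repeat per implementation to compare.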