Date:      Thu, 30 Jan 2003 06:13:20 +0100 (MET)
From:      Anders Gavare <g@dd.chalmers.se>
To:        freebsd-alpha@FreeBSD.ORG
Subject:   Re: faster strlen() using longs (?)
Message-ID:  <Pine.GSO.4.44.0301300603340.2166-100000@manwendil.dd.chalmers.se>
In-Reply-To: <Pine.GSO.4.44.0212300847350.17153-100000@kili.dd.chalmers.se>


Regarding the subject:  has anyone looked further into this, or is it a
nonsense optimization?

My previous attempt gave me a speedup factor of 2.8. Loading 8 longs at
a time instead of only one, and adding a data prefetch (which might be a
bit too implementation-specific to work on all Alphas), is even faster
for long strings: a speedup factor of 4.35 on my machine, using the best
compiler and options.

Anyway, here's what it looks like:


size_t my_strlen4(const char *databuf)
{
	long d0, d1, d2, d3, d4, d5, d6, d7;
	long *lp;
	size_t len = 0;

	while (((size_t) databuf) & (sizeof(long)-1)) {
		if (!*databuf++)
			return len;
		len++;
	}

	lp = (long *) databuf;

#define testforzeroes(X) if (	((X & 0xff) == 0) || \
			((X & 0xff00) == 0) || \
			((X & 0xff0000) == 0) || \
			((X & 0xff000000) == 0) || \
			((X & 0xff00000000) == 0) || \
			((X & 0xff0000000000) == 0) || \
			((X & 0xff000000000000) == 0) || \
			((X & 0xff00000000000000) == 0) \
		    ) \
			break;

	/*  One long at a time until lp is aligned to 8 longs:  */
	while (((size_t) lp) & (sizeof(long)*8-1)) {
		d0 = *lp;
		testforzeroes(d0);	/*  breaks with lp still unaligned  */
		lp++;
		len += sizeof(long);
	}

#define returnlength(X) if (!(X & 0xff)) return len; \
	if (!(X & 0xff00))		return len + 1; \
	if (!(X & 0xff0000))		return len + 2; \
	if (!(X & 0xff000000))		return len + 3; \
	if (!(X & 0xff00000000))	return len + 4; \
	if (!(X & 0xff0000000000))	return len + 5; \
	if (!(X & 0xff000000000000))	return len + 6; \
	if (!(X & 0xff00000000000000))	return len + 7;

	/*  lp still unaligned means the loop above found a zero in d0:  */
	if (((size_t) lp) & (sizeof(long)*8-1)) {
		returnlength(d0);
	}

	for (;;) {
		d0 = lp[0];  d1 = lp[1];  d2 = lp[2];  d3 = lp[3];
		d4 = lp[4];  d5 = lp[5];  d6 = lp[6];  d7 = lp[7];
		lp += 8;

		/*  Prefetch next octo-word (a load to $31 is discarded):  */
		asm volatile ("ldl $31,0(%0)" : : "r" (lp));

		testforzeroes(d0);
		testforzeroes(d1);
		testforzeroes(d2);
		testforzeroes(d3);
		testforzeroes(d4);
		testforzeroes(d5);
		testforzeroes(d6);
		testforzeroes(d7);

		len += (sizeof(long)*8);
	}

	returnlength(d0);  len += 8;
	returnlength(d1);  len += 8;
	returnlength(d2);  len += 8;
	returnlength(d3);  len += 8;
	returnlength(d4);  len += 8;
	returnlength(d5);  len += 8;
	returnlength(d6);  len += 8;
	returnlength(d7);

	return 0;
}
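
The eight mask comparisons in testforzeroes can also be collapsed into a
single expression. A minimal sketch, assuming 64-bit words and (for the
index helper) little-endian byte order; this is the well-known
subtract-and-mask trick, not what the function above does, and the names
has_zero_byte and first_zero_index are hypothetical:

```c
#include <stdint.h>

/* Nonzero if any of the 8 bytes in x is zero.  Subtracting 1 from each
   byte sets its top bit only if the byte was zero or already >= 0x80;
   the "& ~x" term discards the second case. */
static int has_zero_byte(uint64_t x)
{
	return ((x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL) != 0;
}

/* Index (0..7) of the first zero byte, counting from the least
   significant byte (the lowest address on a little-endian machine),
   or 8 if there is none. */
static int first_zero_index(uint64_t x)
{
	int i;

	for (i = 0; i < 8; i++)
		if (((x >> (8 * i)) & 0xff) == 0)
			return i;
	return 8;
}
```

The first helper only answers whether a zero byte exists, so it would fit
the hot loop; the second is the slow path, run once after the loop exits.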


I'm a bit tired, so there are probably a number of bugs in there.
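
One way to hunt for such bugs is to compare against the libc strlen() at
every start alignment and every length. A rough sketch, not a definitive
harness: word_strlen is a hypothetical portable stand-in for the function
under test (swap in my_strlen4 on an Alpha), and check_lengths is likewise
a made-up name. The buffer is padded so that whole-block reads past the
terminator stay inside the allocation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the function under test: byte-wise until
   8-byte aligned, then one 8-byte word at a time. */
static size_t word_strlen(const char *s)
{
	const char *p = s;
	uint64_t w;

	while (((uintptr_t) p) & 7) {
		if (!*p)
			return (size_t) (p - s);
		p++;
	}
	for (;;) {
		memcpy(&w, p, 8);	/* aligned 8-byte load */
		if ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL)
			break;		/* this word holds a zero byte */
		p += 8;
	}
	while (*p)
		p++;
	return (size_t) (p - s);
}

/* Checks every start offset 0..63 from a 64-byte boundary and every
   length 0..maxlen-1.  Returns the number of mismatches against
   strlen(), or -1 if the allocation fails. */
static int check_lengths(size_t maxlen)
{
	size_t total = 64 + 64 + maxlen + 128;	/* slack + offsets + data + padding */
	size_t off, n;
	int bad = 0;
	char *raw = malloc(total);
	char *base;

	if (raw == NULL)
		return -1;
	/* Round up to a 64-byte boundary so the offsets are meaningful. */
	base = (char *) (((uintptr_t) raw + 63) & ~(uintptr_t) 63);
	for (off = 0; off < 64; off++)
		for (n = 0; n < maxlen; n++) {
			memset(raw, 'x', total);
			base[off + n] = '\0';
			if (word_strlen(base + off) != strlen(base + off))
				bad++;
		}
	free(raw);
	return bad;
}
```

Calling check_lengths(200) covers strings that end before, at, and after
each alignment boundary, including the last long before a 64-byte boundary.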

Anders




