From: Peter Jeremy
To: "Sean C. Farley"
Cc: freebsd-arch@freebsd.org
Date: Thu, 12 Jul 2007 08:13:38 +1000
Subject: Re: Assembly string functions in i386 libc
Message-ID: <20070711221338.GC20178@turion.vk2pj.dyndns.org>
In-Reply-To: <20070711134721.D2385@thor.farley.org>

On 2007-Jul-11 15:24:01 -0500, "Sean C. Farley" wrote:
>libc compared to the version I was writing.  After more testing, I found
>it was only the assembly version that is really slow.  The C version is
>fairly quick.  Is there a need to continue to use the assembly versions
>of string functions on i386?  Does it mainly help slower systems such as
>those with i386 or i486 CPU's?

The performance of string instructions has varied wildly across x86
implementations.  For short strings, the overhead of initialising the
registers definitely outweighs any actual difference in loop
performance.  For any recent CPU, the location of the string in the
memory hierarchy far outweighs implementation issues.  bde@ has done
various testing in the past and posted results.

Some comments:
- Comparing the strlen() in a shared libc with a statically linked one
  is unfair - especially on the i386.
- Your results don't include non-aligned inputs.
- Your results don't include non-power-of-2 lengths.

>I would appreciate it if anyone could see if strlen and strlen2 perform
>any better on an amd64.  Although the current C version of strlen() in
>7-CURRENT is faster than mine for smaller values, they perform better
>for larger strings.

I've tested on:
FreeBSD 6.2-STABLE #28: Fri Jun 22 11:44:13 EST 2007
    root@turion.vk2pj.dyndns.org:/usr/obj/usr/src/sys/turion
CPU: AMD Turion(tm) 64 Mobile ML-40 (2194.52-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x20f42  Stepping = 2
  Features=0x78bfbff
  Features2=0x1
  AMD Features=0xe2500800
  AMD Features2=0x1

There is no asm strlen on this platform, so libcstrlen and basestrlen
should be identical (and disassembling [x]strlen() shows that the code
_is_ identical).  Nevertheless, there are significant differences for
short strings and measurable differences for all lengths except 32
bytes.  This indicates that your program is not able to accurately
compare strlen() performance.

I've tried statically linking all the test programs and this removes the
libcstrlen/basestrlen differences.  The very poor results for 4- and
8-byte strings are unexpected but, as expected, your unrolled strlen()
implementations behave better for longer strings.

The attached results all reflect your code with '-static' added to every
gcc/link step.
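The two gaps listed in the comments above (non-aligned inputs and non-power-of-2 lengths) could be covered by sweeping every start offset and every length instead of a handful of fixed sizes. What follows is only a minimal sketch of that idea, not the test program being discussed: `byte_strlen`, `sweep_check` and `time_case` are hypothetical names introduced here, `byte_strlen` is a plain byte loop standing in for whatever implementation is under test, and `clock()` is assumed to be precise enough for coarse comparisons.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef size_t (*strlen_fn)(const char *);

/* Hypothetical stand-in for the implementation under test. */
static size_t byte_strlen(const char *s)
{
    const char *p = s;

    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/*
 * Check fn for every start offset in [0, max_off) and every length in
 * [0, max_len].  malloc() returns suitably aligned memory, so buf + off
 * exercises each misalignment in turn.  Returns the number of wrong
 * answers, or (size_t)-1 if the buffer cannot be allocated.
 */
static size_t sweep_check(strlen_fn fn, size_t max_off, size_t max_len)
{
    size_t bad = 0;
    char *buf = malloc(max_off + max_len + 1);

    if (buf == NULL)
        return (size_t)-1;
    for (size_t off = 0; off < max_off; off++) {
        for (size_t len = 0; len <= max_len; len++) {
            memset(buf, 'x', off + len);
            buf[off + len] = '\0';
            if (fn(buf + off) != len)
                bad++;
        }
    }
    free(buf);
    return bad;
}

/*
 * Time iters calls of fn on a string of the given length starting at
 * the given offset.  Returns CPU seconds, or -1.0 on allocation failure.
 */
static double time_case(strlen_fn fn, size_t off, size_t len,
    unsigned long iters)
{
    char *buf = malloc(off + len + 1);
    const char *volatile p;     /* reloaded each pass so the call is not hoisted */
    volatile size_t sink = 0;   /* keeps the result live */
    clock_t t0, t1;

    if (buf == NULL)
        return -1.0;
    memset(buf, 'x', off + len);
    buf[off + len] = '\0';
    p = buf + off;
    t0 = clock();
    for (unsigned long i = 0; i < iters; i++)
        sink += fn(p);
    t1 = clock();
    free(buf);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling `sweep_check(byte_strlen, 8, 67)` checks correctness across eight alignments and lengths 0 to 67, and `time_case(fn, 3, 13, iters)` times one misaligned, odd-length case. Timings for each implementation would still need to be collected over several runs and compared statistically.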
-- 
Peter Jeremy

Attachment: results.01

x libcstrlen.01
+ basestrlen.01
* strlen.01
% strlen2.01
[ministat ASCII distribution plot]
    N           Min           Max        Median           Avg        Stddev
x  10      0.056836      0.076297     0.0571435     0.0600979  0.0065785842
+  10      0.056941      0.079997     0.0571135     0.0605826  0.0075392895
No difference proven at 95.0% confidence
*  10      0.045764      0.057498      0.047954     0.0489822  0.0031965751
Difference at 95.0% confidence
	-0.0111157 +/- 0.00485944
	-18.496% +/- 8.08587%
	(Student's t, pooled s = 0.00517184)
%  10        0.0642      0.067644     0.0662535     0.0662219 0.00087897572
Difference at 95.0% confidence
	0.006124 +/- 0.00440962
	10.19% +/- 7.33739%
	(Student's t, pooled s = 0.0046931)

Attachment: results.02

x libcstrlen.02
+ basestrlen.02
* strlen.02
% strlen2.02
[ministat ASCII distribution plot]
    N           Min           Max        Median           Avg        Stddev
x  10       0.06611      0.076365     0.0663895     0.0673789  0.0031655271
+  10      0.065865      0.068425     0.0662385     0.0664137 0.00072301053
No difference proven at 95.0% confidence
*  10      0.059657       0.08414       0.06171     0.0648375  0.0073763495
No difference proven at 95.0% confidence
%  10      0.079855      0.089286     0.0801355     0.0812853  0.0029096056
Difference at 95.0% confidence
	0.0139064 +/- 0.00285662
	20.6391% +/- 4.23963%
	(Student's t, pooled s = 0.00304026)

Attachment: results.04

x libcstrlen.04
+ basestrlen.04
* strlen.04
% strlen2.04
[ministat ASCII distribution plot]
    N           Min           Max        Median           Avg        Stddev
x  10      0.082714      0.086885       0.08454     0.0848665  0.0011530442
+  10      0.084376      0.113232     0.0852015     0.0900253  0.0098820415
No difference proven at 95.0% confidence
*  10      0.089334      0.091925      0.089935      0.090297 0.00094932268
Difference at 95.0% confidence
	0.0054305 +/- 0.000992314
	6.39887% +/- 1.16926%
	(Student's t, pooled s = 0.00105611)
%  10      0.105559      0.131599     0.1080435     0.1111049  0.0078954972
Difference at 95.0% confidence
	0.0262384 +/- 0.00530137
	30.9173% +/- 6.24671%
	(Student's t, pooled s = 0.00564218)

Attachment: results.08

x libcstrlen.08
+ basestrlen.08
* strlen.08
% strlen2.08
[ministat ASCII distribution plot]
    N           Min           Max        Median           Avg        Stddev
x  10      0.121139      0.146324     0.1226245      0.125436  0.0079127939
+  10       0.12069      0.145306     0.1219265     0.1250433  0.0076468179
No difference proven at 95.0% confidence
*  10      0.194276      0.218771     0.1965875     0.1998421  0.0075342985
Difference at 95.0% confidence
	0.0744061 +/- 0.00725919
	59.318% +/- 5.78717%
	(Student's t, pooled s = 0.00772586)
%  10      0.162464      0.173597      0.164107     0.1656017  0.0041219019
Difference at 95.0% confidence
	0.0401657 +/- 0.00592774
	32.0209% +/- 4.72571%
	(Student's t, pooled s = 0.00630882)

Attachment: results.16

x libcstrlen.16
+ basestrlen.16
* strlen.16
% strlen2.16
[ministat ASCII distribution plot]
    N           Min           Max        Median           Avg        Stddev
x  10      0.267459      0.292087     0.2683745     0.2740531  0.0087043756
+  10      0.267893      0.292362     0.2707545     0.2747774  0.0078299701
No difference proven at 95.0% confidence
*  10      0.212733      0.236073     0.2156615     0.2196616   0.007802558
Difference at 95.0% confidence
	-0.0543915 +/- 0.00776649
	-19.8471% +/- 2.83394%
	(Student's t, pooled s = 0.00826577)
%  10      0.185465      0.208264      0.186767     0.1902279  0.0071633648
Difference at 95.0% confidence
	-0.0838252 +/- 0.0074897
	-30.5872% +/- 2.73294%
	(Student's t, pooled s = 0.0079712)

Attachment: results.32

x libcstrlen.32
+ basestrlen.32
* strlen.32
% strlen2.32
[ministat ASCII distribution plot]
    N           Min           Max        Median           Avg        Stddev
x  10      0.414594       0.45129     0.4222915     0.4275801   0.014180339
+  10      0.412549      0.438513      0.426586     0.4282437  0.0078657702
No difference proven at 95.0% confidence
*  10      0.249627      0.276534      0.260869     0.2597264  0.0093742772
Difference at 95.0% confidence
	-0.167854 +/- 0.0112939
	-39.2567% +/- 2.64135%
	(Student's t, pooled s = 0.01202)
%  10      0.212447      0.236769     0.2154385     0.2202512  0.0093600922
Difference at 95.0% confidence
	-0.207329 +/- 0.0112887
	-48.4889% +/- 2.64014%
	(Student's t, pooled s = 0.0120144)

Attachment: results.64

x libcstrlen.64
+ basestrlen.64
* strlen.64
% strlen2.64
[ministat ASCII distribution plot]
    N           Min           Max        Median           Avg        Stddev
x  10      0.709884      0.744587     0.7251365     0.7276627   0.011059351
+  10      0.711416      0.742595      0.723867     0.7256684   0.010705189
No difference proven at 95.0% confidence
*  10      0.324096      0.345875      0.330732       0.33092  0.0067527522
Difference at 95.0% confidence
	-0.396743 +/- 0.0086092
	-54.5229% +/- 1.18313%
	(Student's t, pooled s = 0.00916267)
%  10      0.267484      0.290712      0.273968     0.2746264  0.0073885214
Difference at 95.0% confidence
	-0.453036 +/- 0.00883668
	-62.2591% +/- 1.21439%
	(Student's t, pooled s = 0.00940477)