Date: Mon, 1 Nov 1999 11:39:54 +0000 (GMT)
From: Stephen Roome <steveroo@mothra.bri.hp.com>
To: freebsd-questions@freebsd.org
Subject: Athlons..
Message-ID: <Pine.HPX.4.10.9911011133350.20098-100000@mothra.bri.hp.com>
Does anyone know whether the following example, taken from AMD's documentation,
would improve bcopy performance on the Athlon (compared with the routine that
is used by default)?
I'm asking because I don't know enough asm to put this in safely, and I quite
probably don't know what I'm talking about at all, but someone's got to make a
fool of themselves now and again.
(example at end)
Steve
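movq-movq example:
_asm {
        mov     eax, [src]          ; eax = source pointer
        mov     edx, [dst]          ; edx = destination pointer
        mov     ecx, (SIZE >> 6)    ; ecx = number of 64-byte blocks
; xfer label should be 32-byte aligned.
xfer:
        movq    mm0, [eax]          ; loads and stores are interleaved
        add     edx, 64
        movq    mm1, [eax+8]
        add     eax, 64
        movq    mm2, [eax-48]
        movq    [edx-64], mm0
        movq    mm3, [eax-40]
        movq    [edx-56], mm1
        movq    mm4, [eax-32]
        movq    [edx-48], mm2
        movq    mm5, [eax-24]
        movq    [edx-40], mm3
        movq    mm6, [eax-16]
        movq    [edx-32], mm4
        movq    mm7, [eax-8]
        movq    [edx-24], mm5
        movq    [edx-16], mm6
        dec     ecx                 ; one block done
        movq    [edx-8], mm7        ; movq does not change the flags set by dec
        jnz     xfer                ; loop until ecx reaches zero
        ; MMX code normally needs an emms before any later x87 FPU use.
}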
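
For anyone who does not read x86 asm, the shape of the loop above can be sketched in portable C. This is only an illustration of the 64-bytes-per-pass pattern, not a replacement for bcopy and not something taken from AMD's documentation. The name copy_blocks64 is made up, the buffers are assumed not to overlap, and a compiler will not necessarily turn this into MMX instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not FreeBSD's bcopy): copies n bytes from src to dst,
 * 64 bytes per loop pass, mirroring the structure of the MMX loop.
 * Assumption: the two buffers do not overlap. */
static void copy_blocks64(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t blocks = n >> 6;          /* same count as "mov ecx, (SIZE >> 6)" */

    assert(dst != NULL && src != NULL);

    while (blocks-- > 0) {
        uint64_t q[8];               /* stands in for mm0..mm7 */
        memcpy(q, s, sizeof q);      /* eight 8-byte loads */
        memcpy(d, q, sizeof q);      /* eight 8-byte stores */
        s += 64;
        d += 64;
    }

    /* The asm loop has no equivalent of this step: it only moves whole
     * 64-byte blocks, and it misbehaves if the block count starts at 0. */
    memcpy(d, s, n & 63);
}
```

Whether something like this beats the default routine on a given machine would have to be measured; the sketch only shows what the asm is doing.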
