Date: Wed, 24 Jan 1996 01:19:34 -0800 (PST)
From: asami@cs.berkeley.edu (Satoshi Asami)
To: tege@matematik.su.se
Cc: freebsd-hackers@freebsd.org, ccd@forgery.cs.berkeley.edu
Subject: Re: Pentium bcopy
Message-ID: <199601240919.BAA28229@silvia.HIP.Berkeley.EDU>
In-Reply-To: <199512240116.CAA26645@insanus.matematik.su.se> (message from Torbjorn Granlund on Sun, 24 Dec 1995 02:15:58 +0100)

Sorry for replying to an old mail....

 * This time I want to help improving the bcopy/memcpy/memmove functions for
 * the Pentium (and 486).  Here is a skeleton bcopy/memcpy that runs about 5
 * times faster than your current implementation on a Pentium.  This bcopy
 * handles up to about 350 MB/s on a Pentium 133, compared to the current 70
 * MB/s.

We've also tried some optimizations as part of the ccd project.  What
we did was to use the fp registers to load/store 8 bytes at a time,
with an unrolled loop (of course).  We got up to about 96MB/s on a
P5-133/Triton for large copies like 2M (i.e., by going to memory, not
cache) by using a standalone program.

However, when we tried to plug it into the kernel to see the
difference it makes for disk caches, the kernel died quite nicely
during boot.  (Not even DDB could help.)

Does anyone know if there are any `gotchas' concerning the use of fp
regs in the kernel?

Satoshi
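
For readers who want to picture the copy loop described above, here is a minimal sketch in portable C of moving 8 bytes per load/store with a 4-way unrolled loop. It is not the ccd code: the original used the x87 floating-point registers from assembly, while this version uses 64-bit integer temporaries, and the name `copy8_unrolled` and the unroll factor of 4 are assumptions chosen for illustration. The source and destination are assumed not to overlap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the original mail): copy n bytes from
 * src to dst, 8 bytes per load/store, main loop unrolled 4 times.
 * The regions must not overlap.
 */
void copy8_unrolled(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    uint64_t r0, r1, r2, r3;

    /* Main loop: four 8-byte loads, then four 8-byte stores (32 bytes). */
    while (n >= 32) {
        memcpy(&r0, s, 8);
        memcpy(&r1, s + 8, 8);
        memcpy(&r2, s + 16, 8);
        memcpy(&r3, s + 24, 8);
        memcpy(d, &r0, 8);
        memcpy(d + 8, &r1, 8);
        memcpy(d + 16, &r2, 8);
        memcpy(d + 24, &r3, 8);
        s += 32;
        d += 32;
        n -= 32;
    }

    /* Remaining whole 8-byte words. */
    while (n >= 8) {
        memcpy(&r0, s, 8);
        memcpy(d, &r0, 8);
        s += 8;
        d += 8;
        n -= 8;
    }

    /* Tail: fewer than 8 bytes left, copy one byte at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

The fixed-size `memcpy` calls are only there to express unaligned 8-byte loads and stores without violating C aliasing rules; compilers typically turn each one into a single move. Whether such a loop beats the existing bcopy on a given machine would have to be measured, as in the standalone test mentioned above.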