From owner-freebsd-hackers Wed Jan 24 01:20:00 1996
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.3/8.7.3) id BAA13447
          for hackers-outgoing; Wed, 24 Jan 1996 01:20:00 -0800 (PST)
Received: from silvia.HIP.Berkeley.EDU (silvia.HIP.Berkeley.EDU [136.152.64.181])
          by freefall.freebsd.org (8.7.3/8.7.3) with ESMTP id BAA13436
          for ; Wed, 24 Jan 1996 01:19:52 -0800 (PST)
Received: (from asami@localhost)
          by silvia.HIP.Berkeley.EDU (8.7.3/8.6.9) id BAA28229;
          Wed, 24 Jan 1996 01:19:34 -0800 (PST)
Date: Wed, 24 Jan 1996 01:19:34 -0800 (PST)
Message-Id: <199601240919.BAA28229@silvia.HIP.Berkeley.EDU>
To: tege@matematik.su.se
CC: freebsd-hackers@freebsd.org, ccd@forgery.cs.berkeley.edu
In-reply-to: <199512240116.CAA26645@insanus.matematik.su.se>
          (message from Torbjorn Granlund on Sun, 24 Dec 1995 02:15:58 +0100)
Subject: Re: Pentium bcopy
From: asami@cs.berkeley.edu (Satoshi Asami)
Sender: owner-hackers@freebsd.org
Precedence: bulk

Sorry for replying to an old mail....

 * This time I want to help improving the bcopy/memcpy/memmove functions for
 * the Pentium (and 486).  Here is a skeleton bcopy/memcpy that runs about 5
 * times faster than your current implementation on a Pentium.  This bcopy
 * handles up to about 350 MB/s on a Pentium 133, compared to the current 70
 * MB/s.

We've also tried some optimizations as part of the ccd project.  What
we did was to use the fp registers to load/store 8 bytes at a time,
with an unrolled loop (of course).

We got up to about 96MB/s on a P5-133/Triton for large copies like 2M
(i.e., by going to memory, not cache) by using a standalone program.

However, when we tried to plug it into the kernel to see the
difference it makes for disk caches, the kernel died quite nicely
during boot.  (Not even DDB could help.)

Does anyone know if there are any `gotchas' concerning the use of fp
regs in the kernel?

Satoshi
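
The copy scheme described above (8-byte loads and stores in an unrolled
loop) can be sketched in portable C.  This is only an illustration under
stated assumptions, not the code from the ccd project: the post used the
x87 fp registers, whereas this sketch moves 64-bit integers through
fixed-size memcpy calls as a stand-in.  The unroll factor of four and the
function name copy8_unrolled are hypothetical.  It also assumes the
buffers do not overlap, so it follows memcpy semantics rather than
bcopy's overlap handling.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not from the original post.
 * Copies len bytes from src to dst, 8 bytes per move, unrolled by four.
 * Assumption: src and dst do not overlap.
 * The 64-bit integer temporaries stand in for the fp registers the post
 * describes; fixed-size memcpy keeps unaligned access well defined.
 */
static void copy8_unrolled(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Main loop: four 8-byte loads, then four 8-byte stores (32 bytes). */
    while (len >= 32) {
        uint64_t r0, r1, r2, r3;

        memcpy(&r0, s, 8);
        memcpy(&r1, s + 8, 8);
        memcpy(&r2, s + 16, 8);
        memcpy(&r3, s + 24, 8);

        memcpy(d, &r0, 8);
        memcpy(d + 8, &r1, 8);
        memcpy(d + 16, &r2, 8);
        memcpy(d + 24, &r3, 8);

        s += 32;
        d += 32;
        len -= 32;
    }

    /* Remaining whole 8-byte words. */
    while (len >= 8) {
        uint64_t r;

        memcpy(&r, s, 8);
        memcpy(d, &r, 8);
        s += 8;
        d += 8;
        len -= 8;
    }

    /* Tail: fewer than 8 bytes left, copy them one at a time. */
    while (len > 0) {
        *d++ = *s++;
        len--;
    }
}
```

Grouping the loads ahead of the stores mirrors the load/store pairing in
the description; whether a given compiler and CPU benefit from that
ordering would have to be measured.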