From owner-freebsd-current Fri Apr 5 02:23:34 1996
Return-Path: owner-current
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id CAA09693 for current-outgoing; Fri, 5 Apr 1996 02:23:34 -0800 (PST)
Received: from Root.COM (implode.Root.COM [198.145.90.17]) by freefall.freebsd.org (8.7.3/8.7.3) with ESMTP id CAA09688 for ; Fri, 5 Apr 1996 02:23:29 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by Root.COM (8.7.5/8.6.5) with SMTP id CAA00222; Fri, 5 Apr 1996 02:21:48 -0800 (PST)
Message-Id: <199604051021.CAA00222@Root.COM>
X-Authentication-Warning: implode.Root.COM: Host localhost [127.0.0.1] didn't use HELO protocol
To: asami@cs.berkeley.edu (Satoshi Asami)
cc: current@FreeBSD.org, nisha@cs.berkeley.edu, tege@matematik.su.se, hasty@rah.star-gate.com
Subject: Re: fast memory copy for large data sizes
In-reply-to: Your message of "Fri, 05 Apr 1996 01:35:16 PST." <199604050935.BAA24263@silvia.HIP.Berkeley.EDU>
From: David Greenman
Reply-To: davidg@Root.COM
Date: Fri, 05 Apr 1996 02:21:48 -0800
Sender: owner-current@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

>We've put together a fast memory copy that uses floating point
>registers to speed up large transfers. The original idea was taken
>from Amancio Hasty's old post to use floating point registers to move
>8 bytes at a time. (We tried using integer registers too but with our
>wits we could only get 10MB/s less than the FP case.)
>
>By the way, we plugged this thing in as a replacement to
>copyin/copyout and our ccd testing machine, (striping disk driver, see
>http://stampede.cs.berkeley.edu/ccd/ for details) and maximum read
>performance improved from 21MB/s to 24MB/s using 9 disks. But that's
>only to our interest, so here's a comparison with the libc bcopy()
>(which is essentially the same code as the stock copyin/copyout).
>
>Here are the kind of numbers we are seeing, and hope you will see, if
>you run the program attached at the end of this mail:
>
> 90MHz Pentium (silvia), SiS chipset, 256KB cache:
>
>    size       libc             ours
>      32   15.258789 MB/s    6.103516 MB/s
>      64   20.345052 MB/s   15.258789 MB/s
>     128   17.438616 MB/s   15.258789 MB/s

   This would be a big lose in the kernel since just about all bcopy's
fall into this range _except_ disk I/O block copies. I know this can be
done better using other techniques (non-FP, see hackers mail from about
3 months ago). You should talk to John Dyson who's also working on this.

-DG

David Greenman
Core-team/Principal Architect, The FreeBSD Project
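
The concern in the reply is that a copy tuned for large transfers loses on the short copies that dominate kernel bcopy traffic. One way to address it is to choose the routine by length. What follows is a minimal C sketch of that idea, not code from this thread: copy_dispatch, wide_copy, and the 1024-byte WIDE_COPY_THRESHOLD are hypothetical names and values, and the 8-byte integer loop only stands in for the FP-register moves described in the quoted message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff (an assumption, not from the thread):
 * copies shorter than this take the plain path. */
#define WIDE_COPY_THRESHOLD 1024

/* Copy in 8-byte units; a stand-in for the FP-register moves. */
static void wide_copy(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (len >= 8) {
        uint64_t word;
        memcpy(&word, s, 8);   /* 8-byte load, safe for unaligned data */
        memcpy(d, &word, 8);   /* 8-byte store */
        d += 8;
        s += 8;
        len -= 8;
    }
    while (len > 0) {          /* remaining tail bytes */
        *d++ = *s++;
        len--;
    }
}

/* Pick a copy routine by length.
 * Returns 1 if the wide path was used, 0 otherwise. */
static int copy_dispatch(void *dst, const void *src, size_t len)
{
    if (len < WIDE_COPY_THRESHOLD) {
        memcpy(dst, src, len); /* short copies keep the stock routine */
        return 0;
    }
    wide_copy(dst, src, len);
    return 1;
}
```

A real cutoff would have to be measured per machine; the table above only shows that 32 to 128 bytes falls below it on the 90MHz Pentium.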