Date: Fri, 28 Mar 2003 06:05:04 +1100
From: Peter Jeremy <peterjeremy@optushome.com.au>
To: Dag-Erling Smørgrav <des@ofug.org>
Cc: cvs-all@freebsd.org
Subject: Re: Checksum/copy
Message-ID: <20030327190504.GD11307@cirb503493.alcatel.com.au>
In-Reply-To: <xzp7kalw5j4.fsf@flood.ping.uio.no>
References: <Pine.BSF.4.21.0303260956250.27748-100000@root.org> <20030326225530.G2075@odysseus.silby.com> <20030327180247.D1825@gamplex.bde.org> <xzp7kalw5j4.fsf@flood.ping.uio.no>

[I think this is getting somewhat off topic for the CVS lists]

On Thu, Mar 27, 2003 at 09:57:35AM +0100, Dag-Erling Smørgrav wrote:
>Might it be a good idea to have separate b{copy,zero} implementations
>for special purposes like pmap_{copy,zero}_page?  Since these cases
>copy or zero a fixed and relatively large amount of data, they should
>lend themselves well to optimization.

I think it would be useful - even ignoring SSE, most of the fast
b{zero,copy} implementations include a fair amount of special code to
handle alignment issues and the odd few bytes at the beginning/end
that don't fit into the main loop's work unit.  Having a known size
and alignment simplifies the code a lot.

> Zeroing a 4096-byte page on an
>SSE-enabled i386 should take no more than 35 SSE instructions

The downside is that we need multiple implementations to take
advantage of features available in different CPUs.  I guess it's
a "put up your patches and benchmark results" issue.

Peter
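
To illustrate the point about known size and alignment, here is a rough sketch in portable C. It is not the FreeBSD code and uses no SSE; the names generic_zero, page_zero and PAGE_SIZE_BYTES are made up for this example, and the 4096-byte page size and 8-byte store width are assumptions. The general routine needs head and tail loops around its main loop, while the fixed-size routine is just the main loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u	/* assumed page size (hypothetical name) */

/*
 * General-purpose zeroing (hypothetical): must cope with any start
 * address and any length, so it needs head and tail handling.
 */
void
generic_zero(void *dst, size_t len)
{
	unsigned char *p = dst;

	/* Head: byte stores until p reaches a word boundary. */
	while (len > 0 && ((uintptr_t)p % sizeof(uint64_t)) != 0) {
		*p++ = 0;
		len--;
	}
	/* Main loop: one aligned word per iteration. */
	while (len >= sizeof(uint64_t)) {
		*(uint64_t *)(void *)p = 0;
		p += sizeof(uint64_t);
		len -= sizeof(uint64_t);
	}
	/* Tail: the odd few bytes that do not fill a word. */
	while (len > 0) {
		*p++ = 0;
		len--;
	}
}

/*
 * Fixed-size zeroing (hypothetical): the caller guarantees a
 * word-aligned buffer of exactly PAGE_SIZE_BYTES, so only the
 * main loop remains and it can be unrolled freely.
 */
void
page_zero(void *page)
{
	uint64_t *w = page;
	size_t i;

	assert(((uintptr_t)page % sizeof(uint64_t)) == 0);
	for (i = 0; i < PAGE_SIZE_BYTES / sizeof(uint64_t); i += 4) {
		w[i] = 0;
		w[i + 1] = 0;
		w[i + 2] = 0;
		w[i + 3] = 0;
	}
}
```

A CPU-specific variant would presumably replace the body of the page_zero loop with wider stores, which is where the need for multiple implementations comes from.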