Date: Fri, 28 Mar 2003 06:05:04 +1100
From: Peter Jeremy <peterjeremy@optushome.com.au>
To: Dag-Erling Smørgrav <des@ofug.org>
Cc: cvs-all@freebsd.org
Subject: Re: Checksum/copy
Message-ID: <20030327190504.GD11307@cirb503493.alcatel.com.au>
In-Reply-To: <xzp7kalw5j4.fsf@flood.ping.uio.no>
References: <Pine.BSF.4.21.0303260956250.27748-100000@root.org> <20030326225530.G2075@odysseus.silby.com> <20030327180247.D1825@gamplex.bde.org> <xzp7kalw5j4.fsf@flood.ping.uio.no>
[I think this is getting somewhat off topic for the CVS lists]
On Thu, Mar 27, 2003 at 09:57:35AM +0100, Dag-Erling Smørgrav wrote:
>Might it be a good idea to have separate b{copy,zero} implementations
>for special purposes like pmap_{copy,zero}_page? Since these cases
>copy or zero a fixed and relatively large amount of data, they should
>lend themselves well to optimization.
I think it would be useful - even ignoring SSE, most of the fast
b{zero,copy} implementations include a fair amount of special code
to handle alignment issues and the odd few bytes at the beginning/end
that don't fit into the main loop's work unit. Having a known size
and alignment simplifies the code a lot.
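
To illustrate the point about alignment and leftover bytes, here is a minimal sketch in portable C, not taken from any FreeBSD source. The names generic_zero, page_zero and PAGE_SIZE_BYTES are hypothetical, the 4096-byte page size is assumed from the i386 case discussed in this thread, and plain word stores stand in for whatever wide stores a tuned version would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumption: i386-style 4 KB page */

/* General-purpose zeroing: must cope with any address and any length. */
void generic_zero(void *dst, size_t len)
{
    unsigned char *p = dst;
    unsigned long *w;

    /* Head: byte stores until the pointer is word-aligned. */
    while (len > 0 && ((uintptr_t)p % sizeof(unsigned long)) != 0) {
        *p++ = 0;
        len--;
    }

    /* Body: the main loop's work unit is one machine word. */
    w = (unsigned long *)p;
    while (len >= sizeof(unsigned long)) {
        *w++ = 0;
        len -= sizeof(unsigned long);
    }

    /* Tail: the odd few bytes that do not fill a whole word. */
    p = (unsigned char *)w;
    while (len > 0) {
        *p++ = 0;
        len--;
    }
}

/* Special-purpose zeroing: size and alignment are known in advance,
 * so the head and tail loops disappear and only the body remains. */
void page_zero(void *page)
{
    unsigned long *w = page;
    size_t i;

    /* The caller promises a word-aligned page; check it in debug builds. */
    assert(((uintptr_t)page % sizeof(unsigned long)) == 0);

    /* Unrolled by four; the word count per page is a multiple of four. */
    for (i = 0; i < PAGE_SIZE_BYTES / sizeof(unsigned long); i += 4) {
        w[i] = 0;
        w[i + 1] = 0;
        w[i + 2] = 0;
        w[i + 3] = 0;
    }
}
```

The fixed-size routine has no data-dependent branches apart from the loop counter, which is what would make it a natural candidate for a hand-tuned per-CPU version.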
> Zeroing a 4096-byte page on an
>SSE-enabled i386 should take no more than 35 SSE instructions
The downside is that we need multiple implementations to take advantage
of features available in different CPUs.
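
One common way to carry several implementations is to pick one at startup and call it through a function pointer. The sketch below is only an illustration of that idea under stated assumptions: zero_page_select, zero_page and the cpu_has_wide_stores flag are hypothetical names, the flag stands in for a real CPU feature probe, and the "wide" variant is ordinary unrolled C rather than actual SSE code.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_WORDS (4096u / sizeof(unsigned long))  /* assumed 4 KB page */

typedef void (*zero_page_fn)(unsigned long *page);

/* Baseline: plain word stores, usable on any CPU. */
static void zero_page_generic(unsigned long *page)
{
    size_t i;

    for (i = 0; i < PAGE_WORDS; i++)
        page[i] = 0;
}

/* Stand-in for a CPU-specific variant (for example one using SSE stores).
 * Written as unrolled portable C so that the sketch builds anywhere. */
static void zero_page_wide(unsigned long *page)
{
    size_t i;

    for (i = 0; i < PAGE_WORDS; i += 4) {
        page[i] = 0;
        page[i + 1] = 0;
        page[i + 2] = 0;
        page[i + 3] = 0;
    }
}

/* Selected once; defaults to the baseline until a probe says otherwise. */
static zero_page_fn zero_page_impl = zero_page_generic;

/* cpu_has_wide_stores is a hypothetical flag from a boot-time CPU probe. */
void zero_page_select(int cpu_has_wide_stores)
{
    zero_page_impl = cpu_has_wide_stores ? zero_page_wide : zero_page_generic;
}

/* Callers use this entry point and never see which variant is active. */
void zero_page(unsigned long *page)
{
    assert(page != NULL);
    zero_page_impl(page);
}
```

Each extra variant costs an additional routine to write, test and benchmark, which is the downside mentioned above; the indirect call itself is small next to the work of clearing a whole page.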
I guess it's a "put up your patches and benchmark results" issue.
Peter
