Date:      Wed, 8 May 1996 09:41:48 +1000
From:      Bruce Evans <bde@zeta.org.au>
To:        asami@cs.berkeley.edu, bde@zeta.org.au
Cc:        culler@cs.berkeley.edu, current@freebsd.org, ken@area238.residence.gatech.edu, marc@bowtie.nl, nisha@cs.berkeley.edu, pattrsn@cs.berkeley.edu, wollman@lcs.mit.edu, wscott@ichips.intel.com
Subject:   Re: more on fast bcopy
Message-ID:  <199605072341.JAA11831@godzilla.zeta.org.au>

> * Why not? :-)  It should be possible to use the fpu after saving and
> * restoring the FP registers reentrantly.
>                              ^^^^^^^^^^^

>Yeah, we were running into problems with this.  Can you tell us how to
>do it? ;)

Something like:

	movl	%cr0,%edx
	pushl	%edx		# if used
	subl	$108,%esp	# room for the fnsave image, below the saved %cr0
	clts
	fnsave	(%esp)
	...

	frstor	(%esp)
	addl	$108,%esp
	popl	%edx		# if used
	movl	%edx,%cr0

The stack may need to be larger.
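
For reference, the 108 in the sequence above is the size of the image that fnsave writes in 32-bit protected mode: a 28-byte environment (seven 32-bit slots) followed by eight 10-byte registers. A minimal C sketch of that arithmetic follows; the struct, field and function names are illustrative assumptions, not taken from any kernel header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the 32-bit protected-mode fnsave image. */
struct fsave_area {
    uint32_t control_word;       /* low 16 bits used */
    uint32_t status_word;        /* low 16 bits used */
    uint32_t tag_word;           /* low 16 bits used */
    uint32_t ip_offset;          /* FPU instruction pointer offset */
    uint32_t ip_selector_opcode; /* code selector and opcode bits */
    uint32_t operand_offset;     /* FPU data pointer offset */
    uint32_t operand_selector;   /* data selector, low 16 bits used */
    uint8_t  st[8][10];          /* eight 80-bit stack registers */
};

/* 7 * 4 + 8 * 10 = 108, the constant used with subl/addl. */
static_assert(sizeof(struct fsave_area) == 108,
              "fnsave image must be 108 bytes");

/* Illustrative accessor so callers can reserve the right amount. */
size_t fsave_area_size(void)
{
    return sizeof(struct fsave_area);
}
```

Any additional words pushed next to this area, such as a saved copy of %cr0, need their own space on top of the 108 bytes.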

The complications involving IRQ13 don't apply since this method is too slow
to use on systems with external coprocessors.

The commented-out code in fpunrolled.s doesn't preserve CR0_TS.

>I see.  By the way, we tried unrolling the loops even more, and
>actually got up to 80MB/s for FP and 60MB/s for integer registers
>(this is for bcopy).

I don't think more unrolling is good.  It will bust the I-cache and it
should be possible to schedule the loop control instructions to take
essentially zero time compared with the D-cache-missing memory access
instructions.
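
As a rough illustration of modest unrolling, here is a minimal C sketch that copies four words per iteration, with a scalar tail loop. It assumes word-aligned, non-overlapping buffers; copy_words_unrolled4 is a hypothetical name, and this is not the bcopy code under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy nwords 32-bit words from src to dst, four per iteration.
 * Hypothetical helper: assumes the buffers do not overlap.
 */
void copy_words_unrolled4(uint32_t *dst, const uint32_t *src, size_t nwords)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);

    /* Main loop: one index update and one branch per four words. */
    for (; i + 4 <= nwords; i += 4) {
        dst[i]     = src[i];
        dst[i + 1] = src[i + 1];
        dst[i + 2] = src[i + 2];
        dst[i + 3] = src[i + 3];
    }

    /* Tail: the remaining zero to three words. */
    for (; i < nwords; i++)
        dst[i] = src[i];
}
```

The loop body stays small, so the loop-control overhead per word copied is already low without unrolling further.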

Bruce


