Date: Mon, 22 Oct 2001 09:44:23 +0100 (BST)
From: Doug Rabson <dfr@nlsystems.com>
To: Marcel Moolenaar <marcel@xcllnt.net>
Cc: Peter Wemm <peter@wemm.org>, <ia64@FreeBSD.ORG>
Subject: Re: Hazards [was: Re: cvs commit: src/sys/ia64/ia64 sal.c]
Message-ID: <20011022094201.L549-100000@salmon.nlsystems.com>
In-Reply-To: <20011021212935.C28459@dhcp01.pn.xcllnt.net>

On Sun, 21 Oct 2001, Marcel Moolenaar wrote:

> On Sun, Oct 21, 2001 at 02:34:35PM -0700, Peter Wemm wrote:
> >
> > 3:      tbit.nz p6,p0=in0,0 ;;
> > (p6)    st1     [in0]=r0,1
> > (p6)    add     in1=-1,in1
> >
> >         tbit.nz p6,p0=in0,1 ;;
> > (p6)    st2     [in0]=r0,2
> > (p6)    add     in1=-2,in1
> >
> >         tbit.nz p6,p0=in0,2 ;;
> > (p6)    st4     [in0]=r0,4
> > (p6)    add     in1=-4,in1
> >
> >         ;;
>
> [snip]
>
> > but that hardly seems efficient.  could we copy in0 to somewhere else in
> > order to avoid the RAW? the bits we're interested in are not going to change
> > by the st1/2/4 adds.
>
> The code is inherently sequential in that the result of the
> postinc is used by subsequent tbit instructions. One way to
> increase ILP is to do an aligned ld8, zero-out the bytes
> that need to be zeroed in the temporary register and write
> the result back. in0 (ptr) and in1 (size) can be updated
> without there being an immediate use for them. The code
> will be endianness sensitive though. Something like:
>
>         and     t0 = 0xf8, in0;;        // sign-extension
>         ld8     t1 = [t0];;
>         // Zero-out the bytes in t1 that need zeroed
>         st8     [t0] = t1
>
> in0 can be updated by a simple add:
>
>         add     in0 = 8, t0
>
> in1 can be updated by the following sequence:
>
>         or      t2 = 7, in0
>         mov     t3 = in1 ;;
>         sub     in1 = t3, t2
>
> Both updates can be performed concurrently with the zeroing
> of t1. The zeroing of t1 can be sequence of predicated dep
> instructions.
>
> Just a thought,

I'm not too worried about performance here - this is just cleaning up the
pointer so that we can do an aligned store in the main loop. I'm just
going to add the stops as Peter suggested.

We can revisit this (and all the other string code) and work on
performance later. The whole lot probably needs rewriting. Perhaps Intel
has some sample code...

-- 
Doug Rabson                             Mail:  dfr@nlsystems.com
                                        Phone: +44 20 8348 6160

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-ia64" in the body of the message
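
For reference, the aligned load / mask / store idea quoted above can be sketched in C. This is only an illustration and not code from the FreeBSD tree: the function name zero_head and its interface are made up here, it assumes little-endian byte order (the approach is endianness sensitive, as noted in the quoted text), and it assumes the whole aligned 8-byte word containing the pointer may be read and written. It takes the number of bytes consumed to be 8 minus the low three bits of the pointer, capped at the remaining size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: zero the bytes from *pp up to the next 8-byte
 * boundary using one aligned 64-bit load, a mask, and one aligned store,
 * then advance *pp and reduce *lenp by the number of bytes zeroed.
 *
 * Assumptions: little-endian byte order, and the aligned word that
 * contains *pp is fully accessible.
 */
static void
zero_head(unsigned char **pp, size_t *lenp)
{
	size_t off = (uintptr_t)*pp & 7;	/* offset inside the aligned word */
	size_t n;
	uint64_t word, mask;
	unsigned char *base;

	if (off == 0 || *lenp == 0)
		return;				/* already aligned, or nothing to zero */

	n = 8 - off;				/* bytes up to the boundary (1..7) */
	if (n > *lenp)
		n = *lenp;			/* short buffer: zero only what was asked */

	base = *pp - off;			/* 8-byte aligned address */
	memcpy(&word, base, sizeof(word));	/* stands in for the aligned ld8 */

	/* Little-endian: memory byte k occupies bits 8k..8k+7 of the word. */
	mask = ((UINT64_C(1) << (8 * n)) - 1) << (8 * off);
	word &= ~mask;				/* clear exactly n bytes starting at off */

	memcpy(base, &word, sizeof(word));	/* stands in for the aligned st8 */

	*pp += n;				/* pointer update, independent of the mask */
	*lenp -= n;				/* size update, likewise independent */
}
```

The pointer and size updates depend only on off and n, not on the loaded word, which is what lets them overlap with the masking. The two memcpy calls are there only to keep the C well defined; a compiler will normally turn them into a single load and store.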
