From owner-freebsd-hackers Mon Jun 5 17:17:25 2000
Delivered-To: freebsd-hackers@freebsd.org
Received: from happy.checkpoint.com (happy.checkpoint.com [199.203.156.41])
    by hub.freebsd.org (Postfix) with ESMTP id 4DB4637BE32
    for ; Mon, 5 Jun 2000 17:17:20 -0700 (PDT)
    (envelope-from mellon@pobox.com)
Received: (from mellon@localhost)
    by happy.checkpoint.com (8.9.3/8.9.3) id DAA41181;
    Tue, 6 Jun 2000 03:17:07 +0300 (IDT)
    (envelope-from mellon@pobox.com)
Date: Tue, 6 Jun 2000 03:17:07 +0300
From: Anatoly Vorobey
To: Zach Brown
Cc: hackers@freebsd.org
Subject: Re: Optimization
Message-ID: <20000606031706.A41154@happy.checkpoint.com>
References: <200006052347.IAA00583@daniel.sobral>
    <20000606025717.A40896@happy.checkpoint.com>
    <20000605170742.C9146@mrnutty.zabbo.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <20000605170742.C9146@mrnutty.zabbo.net>; from zab@zabbo.net on Mon, Jun 05, 2000 at 05:07:42PM -0700
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Mon, Jun 05, 2000 at 05:07:42PM -0700, Zach Brown wrote:
> On Tue, Jun 06, 2000 at 02:57:18AM +0300, Anatoly Vorobey wrote:
>
> > > Can someone discuss the performance trade-offs of the following two
> > > alternative codes (and maybe suggest alternatives)?
> > >
> > > Problem: I need to retrieve two values from a table.
> > >
> > > Alternative A:
> > >
> > > x = table[i].x;
> > > y = table[i].y;
> > >
> > > Alternative B:
> > >
> > > d = table[i];
> > > x = d & MASK;
> > > y = d >> SHIFT;
> >
> > Alternative A should be much faster. The compiler should be smart
>
> Don't forget the effects of caching. If x/y are always referenced
> together, and memory is slow slow slow (on, say, any processor made in
> the last few years) then the cost of unmushing the data in the cpu
> could be much cheaper than the cost of going to memory to get x and y
> from different tables.

On the other hand, if the array is properly aligned, fetching x pulls
the whole dword (qword, etc.) into the cache, so the CPU won't have to
go back to memory for y.

Another problem with B is that I'm not sure the compiler will be smart
enough to squeeze a structure into a register when it fits there, even
with optimizations turned on.

Uhm, I think I'll run some tests on that, just for kicks.

-- 
Anatoly Vorobey,
mellon@pobox.com http://pobox.com/~mellon/
"Angels can fly because they take themselves lightly" - G.K.Chesterton
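
For concreteness, the two alternatives from the thread could be sketched
in C roughly as below. This is only an illustration under assumptions
that are not in the original messages: both values are taken to be
16 bits wide, x is assumed to sit in the low half of the packed word
(MASK = 0xFFFF, SHIFT = 16), and the names pair, get_a, get_b, pack and
check_same are made up for the example. With two 16-bit fields the
struct normally occupies the same four bytes as the packed word, so
either form touches a single cache line per lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHIFT 16        /* assumption: y lives in the high 16 bits */
#define MASK  0xFFFFu   /* assumption: x lives in the low 16 bits  */

/* Alternative A: one table of structures holding both fields. */
struct pair {
    uint16_t x;
    uint16_t y;
};

static void get_a(const struct pair *table, size_t i,
                  uint32_t *x, uint32_t *y)
{
    *x = table[i].x;
    *y = table[i].y;
}

/* Alternative B: one table of packed words, unpacked in registers. */
static void get_b(const uint32_t *table, size_t i,
                  uint32_t *x, uint32_t *y)
{
    uint32_t d = table[i];  /* a single load */
    *x = d & MASK;          /* low half      */
    *y = d >> SHIFT;        /* high half     */
}

/* Hypothetical helper: build a packed entry for Alternative B. */
static uint32_t pack(uint16_t x, uint16_t y)
{
    return ((uint32_t)y << SHIFT) | x;
}

/* Check that both layouts yield the same values for n entries. */
static void check_same(const struct pair *ta, const uint32_t *tb, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t ax, ay, bx, by;
        get_a(ta, i, &ax, &ay);
        get_b(tb, i, &bx, &by);
        assert(ax == bx && ay == by);
    }
}
```

To see what a particular compiler makes of each form, one option is to
compile this with optimization and compare the generated assembly for
get_a and get_b (for example with cc -O2 -S); whether the structure
load is merged into a single register access will depend on the
compiler and target.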