From owner-freebsd-hackers Mon Jun 5 18: 0:15 2000
Delivered-To: freebsd-hackers@freebsd.org
Received: from happy.checkpoint.com (happy.checkpoint.com [199.203.156.41])
	by hub.freebsd.org (Postfix) with ESMTP id 035E637BE87
	for ; Mon, 5 Jun 2000 18:00:06 -0700 (PDT)
	(envelope-from mellon@pobox.com)
Received: (from mellon@localhost)
	by happy.checkpoint.com (8.9.3/8.9.3) id DAA43191;
	Tue, 6 Jun 2000 03:59:48 +0300 (IDT)
	(envelope-from mellon@pobox.com)
Date: Tue, 6 Jun 2000 03:59:48 +0300
From: Anatoly Vorobey
To: "Daniel C. Sobral"
Cc: hackers@freebsd.org
Subject: Re: Optimization
Message-ID: <20000606035948.A41527@happy.checkpoint.com>
References: <20000606031706.A41154@happy.checkpoint.com> <200006060031.JAA00841@daniel.sobral>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <200006060031.JAA00841@daniel.sobral>; from dcs@newsguy.com on Tue, Jun 06, 2000 at 09:31:46AM +0900
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Tue, Jun 06, 2000 at 09:31:46AM +0900, Daniel C. Sobral wrote:
> > > > > Alternative A:
> > > > >
> > > > > 	x = table[i].x;
> > > > > 	y = table[i].y;
> > > > >
> > > > > Alternative B:
> > > > >
> > > > > 	d = table[i];
> > > > > 	x = d & MASK;
> > > > > 	y = d >> SHIFT;
> > > >
> > > > Alternative A should be much faster. The compiler should be smart
> [stuff about d being a structure]
>
> It isn't.

Ah, I didn't realize you have the freedom to change table[i]'s type
between implementations. Okay, I change my mind then. B is better.

I ran a quick test with -O3 on i386. What happens in A is that it
transfers 32-bit values anyway, but isn't smart enough to do it only
once. So it accesses *(table+i*2), and then *(table+2+i*2), each access
taking one instruction (with i*2 sitting precomputed in a register).
It puts one in eax, stores ax away, then puts the other in eax, and
stores ax away.
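For anyone who wants to reproduce this, the two alternatives can be
written out as a small C file along the following lines. This is only a
sketch under assumptions that are not in the thread: the two fields are
taken to be 16 bits each, SHIFT is taken to be 16 and MASK 0xFFFF, and
the names struct pair, lookup_a, lookup_b, pack and same_result are made
up for illustration. The destinations x and y are volatile globals and i
is a function parameter, so gcc can neither keep the results in
registers nor precompute the address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: two 16-bit fields packed into one 32-bit word. */
#define SHIFT 16
#define MASK  0xFFFFu

/* Alternative A's element type (hypothetical name). */
struct pair {
    uint16_t x;
    uint16_t y;
};

/* volatile forces gcc to emit a store for every assignment. */
volatile uint16_t x, y;

/* Alternative A: two separate field loads from the table. */
void lookup_a(const struct pair *table, size_t i)
{
    x = table[i].x;
    y = table[i].y;
}

/* Alternative B: one 32-bit load, then mask and shift. */
void lookup_b(const uint32_t *table, size_t i)
{
    uint32_t d = table[i];

    x = (uint16_t)(d & MASK);
    y = (uint16_t)(d >> SHIFT);
}

/* Helper to build a table entry for Alternative B. */
uint32_t pack(uint16_t px, uint16_t py)
{
    return (uint32_t)px | ((uint32_t)py << SHIFT);
}

/* Sanity check: both alternatives must store the same x and y. */
void same_result(const struct pair *ta, const uint32_t *tb, size_t i)
{
    uint16_t ax, ay;

    lookup_a(ta, i);
    ax = x;
    ay = y;
    lookup_b(tb, i);
    assert(x == ax && y == ay);
}
```

Compiling this with gcc -O3 -g -S and comparing the bodies of lookup_a
and lookup_b in the resulting assembly file shows what the compiler
actually emits for each alternative; the result will depend on the gcc
version and target.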

In B, it accesses *(table+i*2) once, puts it in eax, stores ax away,
rotates eax, stores ax away. Rotation should win over a memory access
even if that access goes through the cache, especially considering that
the memory access has a constant displacement inside the instruction.

If you test it, be sure to declare x and y volatile, otherwise you'll
have the hardest time keeping gcc from holding them in registers. Don't
use a constant i, or it'll precompute addresses, etc. Use -O3 -g -S, and
the .stabs entries in the assembly file will mark the line boundaries of
the source.

-- 
Anatoly Vorobey,
mellon@pobox.com http://pobox.com/~mellon/
"Angels can fly because they take themselves lightly" - G.K.Chesterton

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message