From: Nick Evans
To: Ivan Voras
Cc: freebsd-current@freebsd.org, Bruce Evans, freebsd-arch@freebsd.org
Date: Wed, 17 Jan 2007 15:29:50 -0500
Subject: Re: Optimized copy&move (was: Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs)
Message-ID: <20070117152950.6c372f24@pleiades.nextvenue.com>
In-Reply-To: <45AE7BF8.10703@fer.hr>
List-Id: Discussions about the use of FreeBSD-current

On Wed, 17 Jan 2007 14:41:44 -0500
Ivan Voras wrote:

> Bruce Evans wrote:
>
> > And MMX/XMM registers are not needed to get movnt on machines with SSE2,
> > since movnti is part of SSE2. This reduces the advantages of using MMX/XMM
> > registers on P4's and A64's in 32-bit mode to the non-nt parts of the
> > above (fully cached case), which I think are less important than the nt
> > parts.
>
> Hmm, I'm looking at i386/i386/support.s and there are several versions
> of bcopy and bmove functions, including some that optimize by using FPU
> registers (large_i586_bcopy_loop), and a version that uses movnti
> (sse2_pagezero), but I can't find the bit of magic which glues them to
> the bzero() call.
>
> Also, as far as I can tell by the comments, the FPU version works by
> manually saving context... why is this possible (i.e. won't something
> preempt it?)
>

Potentially stupid question, but is it not possible to benchmark these
variations at build or boot time and use the most appropriate method? Or at
least the one most appropriate 90% of the time?

Nick
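
To make the question concrete: the idea is to time each variant once at startup and then route every later call through whichever one won. Below is a rough userland sketch of that, not anything taken from support.s. The three candidates are hypothetical stand-ins for the real routines, and the names copy_fn, copy_vector, time_copy and select_copy are made up for this example. It also ignores the things a kernel version would have to deal with, such as FPU context saving and preemption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature shared by every candidate copy routine. */
typedef void (*copy_fn)(void *dst, const void *src, size_t len);

/* Candidate 0: plain byte loop (stand-in for a generic routine). */
static void copy_bytes(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (len--)
        *d++ = *s++;
}

/* Candidate 1: 8 bytes per step plus a byte tail (stand-in for a wide-register variant). */
static void copy_words(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (len >= sizeof(uint64_t)) {
        uint64_t w;

        memcpy(&w, s, sizeof w);    /* alignment-safe load */
        memcpy(d, &w, sizeof w);    /* alignment-safe store */
        d += sizeof w;
        s += sizeof w;
        len -= sizeof w;
    }
    while (len--)
        *d++ = *s++;
}

/* Candidate 2: whatever the C library provides. */
static void copy_libc(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

/* All later callers go through this pointer; it starts at the safe default. */
static copy_fn copy_vector = copy_bytes;

/* CPU time spent running one candidate `rounds` times. */
static clock_t time_copy(copy_fn fn, void *dst, const void *src,
                         size_t len, int rounds)
{
    clock_t start = clock();

    for (int i = 0; i < rounds; i++)
        fn(dst, src, len);
    return clock() - start;
}

/*
 * Benchmark every candidate on a `len`-byte buffer, point copy_vector at the
 * fastest one, and return its index (or -1 if the buffers cannot be allocated).
 */
static int select_copy(size_t len, int rounds)
{
    static const copy_fn candidates[] = { copy_bytes, copy_words, copy_libc };
    const size_t n = sizeof candidates / sizeof candidates[0];
    unsigned char *src = malloc(len);
    unsigned char *dst = malloc(len);
    clock_t best_time = 0;
    int best = -1;

    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1;
    }
    for (size_t i = 0; i < len; i++)
        src[i] = (unsigned char)i;

    for (size_t i = 0; i < n; i++) {
        memset(dst, 0, len);
        clock_t t = time_copy(candidates[i], dst, src, len, rounds);

        /* A fast routine that copies wrongly is useless, so check it. */
        assert(memcmp(dst, src, len) == 0);
        if (best < 0 || t < best_time) {
            best = (int)i;
            best_time = t;
        }
    }
    copy_vector = candidates[best];
    free(src);
    free(dst);
    return best;
}
```

A caller would run select_copy(4096, 1000) once during startup and afterwards call copy_vector(dst, src, len) everywhere. The winner only reflects the one buffer size that was measured, which is probably where the "most appropriate 90% of the time" caveat comes in: a routine that wins on page-sized copies may lose on small or uncached ones.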