From: Stephan Uphoff <ups@tree.com>
To: Robert Watson
Cc: John Baldwin, Alan Cox, Mike Silbersack, src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org
Date: Tue, 09 Nov 2004 14:11:58 -0500
Subject: Re: cvs commit: src/sys/i386/i386 pmap.c

On Tue, 2004-11-09 at 08:03, Robert Watson wrote:
> On Tue, 9 Nov 2004, Robert Watson wrote:
>
> > > I've tried changing the store_rel() to just do a simple store, since
> > > writes are ordered on x86, but benchmarks on SMP showed that it
> > > actually hurt. However, it would probably be good to at least do that
> > > for UP. The current patch to do it for all kernels is:
>
> Interestingly, I've now run through some more "macro" benchmarks. I saw a
> couple of percent improvement on UP from the change, but indeed, I saw a
> slight decrease in performance for the rapid packet send benchmark on SMP.
>
> So I guess my recommendation is to get this in the tree for UP, and see if
> we can figure out why it's having the slow-down effect on SMP.

We are probably looking at cache line effects here.
My guess is that we should:

1) Make sure that important spin mutexes are alone in a cache line.
2) Take care not to dirty the cache line unnecessarily.

For 2), I think we need to change the spin mutex slightly (for SMP) so that
it never issues LOCK cmpxchgl before a simple load has found
m->mtx_lock == MTX_UNOWNED, since LOCK cmpxchgl always seems to dirty the
cache line, even when the compare fails. (A sketch of this
test-and-test-and-set approach follows below.)

I have a dual Xeon (P4) where I can run some tests. Please let me know if
there are any tests you can recommend - I don't want to reinvent the wheel
here.

Interestingly enough, the Linux spinlock implementation mentions some PPro
errata that seem to require a locked operation. I guess that means we should
take a look at the errata of all SMP-capable processors out there :-(
Intel also recommends a locked operation (or SFENCE) for future processors.
I guess this means either non-optimal code, lots of compile options, or
self-modifying code.

	Stephan
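
P.S. To make the idea concrete, here is a minimal test-and-test-and-set
sketch using C11 atomics. This is NOT the actual sys/mutex.h code; the
names spin_mtx, spin_lock and spin_unlock, and the 64-byte line size, are
made up for illustration only:

/*
 * Hypothetical test-and-test-and-set spin mutex sketch (C11 atomics).
 * The point: waiters spin on a plain load, and only issue the locked
 * read-modify-write (LOCK cmpxchg on x86) once the lock looks free,
 * so contention does not keep dirtying the lock's cache line.
 */
#include <stdatomic.h>

#define MTX_UNOWNED 0UL

struct spin_mtx {
	/* Point 1): align so the lock word sits alone in its cache line. */
	_Alignas(64) _Atomic unsigned long mtx_lock;
};

static void
spin_lock(struct spin_mtx *m, unsigned long tid)
{
	for (;;) {
		/*
		 * Point 2): spin with a read-only load; the line stays
		 * in shared state across all waiting CPUs.
		 */
		while (atomic_load_explicit(&m->mtx_lock,
		    memory_order_relaxed) != MTX_UNOWNED)
			;	/* a PAUSE hint would go here on x86 */

		/* Only now attempt the locked RMW that dirties the line. */
		unsigned long expected = MTX_UNOWNED;
		if (atomic_compare_exchange_weak_explicit(&m->mtx_lock,
		    &expected, tid,
		    memory_order_acquire, memory_order_relaxed))
			return;
	}
}

static void
spin_unlock(struct spin_mtx *m)
{
	/*
	 * Release store; whether this can be a plain store on x86
	 * (given its ordered writes) is exactly the store_rel()
	 * question discussed above.
	 */
	atomic_store_explicit(&m->mtx_lock, MTX_UNOWNED,
	    memory_order_release);
}

With this shape, contended CPUs spin on a locally cached copy of the line
and only one locked operation has to win the line per lock handoff.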