From owner-cvs-all@FreeBSD.ORG  Sat Oct 25 20:14:51 2003
Return-Path: <owner-cvs-all@FreeBSD.ORG>
Delivered-To: cvs-all@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id B40DD16A4B3; Sat, 25 Oct 2003 20:14:51 -0700 (PDT)
Received: from mail.chesapeake.net (chesapeake.net [208.142.252.6])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 6B5B343F3F; Sat, 25 Oct 2003 20:14:50 -0700 (PDT)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h9Q3EmP81879;
	Sat, 25 Oct 2003 23:14:48 -0400 (EDT)
	(envelope-from jroberson@chesapeake.net)
Date: Sat, 25 Oct 2003 23:14:48 -0400 (EDT)
From: Jeff Roberson <jroberson@chesapeake.net>
To: Peter Wemm <peter@wemm.org>
In-Reply-To: <20031025230711.B20F92A7EA@canning.wemm.org>
Message-ID: <20031025231217.B43805-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: cvs-src@FreeBSD.org
cc: src-committers@FreeBSD.org
cc: cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/i386/i386 pmap.c 
X-BeenThere: cvs-all@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: CVS commit messages for the entire tree <cvs-all.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-all>,
	<mailto:cvs-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/cvs-all>
List-Post: <mailto:cvs-all@freebsd.org>
List-Help: <mailto:cvs-all-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-all>,
	<mailto:cvs-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 26 Oct 2003 03:14:51 -0000

On Sat, 25 Oct 2003, Peter Wemm wrote:

> Jeff Roberson wrote:
> > On Sat, 25 Oct 2003, Peter Wemm wrote:
> >
> > > peter       2003/10/25 11:51:41 PDT
> > >
> > >   FreeBSD src repository
> > >
> > >   Modified files:
> > >     sys/i386/i386        pmap.c
> > >   Log:
> > >   For the SMP case, flush the TLB at the beginning of the page zero/copy
> > >   routines.  Otherwise we run into trouble with speculative tlb preloads
> > >   on SMP systems.  This effectively defeats Jeff's revision 1.438
> > >   optimization (for his pentium4-M laptop) in the SMP case.  It breaks
> > >   other systems, particularly athlon-MP's.
> >
> > If the page tables are NULL why does this break speculative tlb preloads?
>
> While we're zeroing the page, CMAP2 (or friends) are non-NULL.  If another
> cpu accesses a nearby page and the cpu decides to speculatively preload
> the nearby TLB entries, then it will cache the CMAP2 value.  Meanwhile, the
> originating cpu clears it again and flushes its own cache.  But, if we then
> do a pmap_zero_page on the other cpu, it can still have the speculatively
> cached tlb entry and zero the wrong page.
>
> Poul-Henning was able to reproduce this problem in short order.  The first
> hack we tried was to change invlcaddr() to do a global shootdown.  It solved
> the crashes.. presumably by purging all other cpu's copies of CMAP2 including
> any speculatively loaded values.  Obviously this is expensive and defeats
> the point of doing local flushes only.
>
> So, as a lighter weight solution, we tried flushing after every page table
> modification, as the IA32 system programmers manual says we must, and it
> too solved the problem - without the expense of extra tlb shootdowns.
>
> Perhaps we should change back to using the switchin purge and flush at the
> beginning as an alternative to two flushes.  The expense of invlpg seems to
> be unique to the pentium-4's.  athlon's run at about 100 clock cycles (80 on
> athlon64's).

Uhm, dumb question: why don't we just allocate one page of kva per
processor and avoid the mutex, the switchin/out, etc?  To save KVA?  At 3
pages per processor and a max of 8 processors on intel, that's 96k with 4k
pages (192k with the 2x for ht).  We can probably spare it.  I just saved
that much with some UMA tuning.

What's the largest intel smp system we've run on?  I thought that with more
than 4 processors intel had to use funky cache bridges.  Then we just need
2x for ht.
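
A rough sketch of the arithmetic and of how per-processor mapping slots
might be laid out, assuming 4k pages and 3 pages per processor.  The names
(SKETCH_PAGE_SIZE, percpu_kva_bytes, percpu_slot_va) are made up for
illustration and are not taken from pmap.c:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only, not from the FreeBSD tree. */
#define SKETCH_PAGE_SIZE     4096u /* i386 base page size */
#define SKETCH_PAGES_PER_CPU 3u    /* "3 pages per processor" */

/* Total KVA, in bytes, reserved for ncpu processors. */
static size_t
percpu_kva_bytes(unsigned ncpu)
{
	return ((size_t)ncpu * SKETCH_PAGES_PER_CPU * SKETCH_PAGE_SIZE);
}

/*
 * Virtual address of mapping slot `slot` (0..2) owned by processor `cpu`,
 * given a hypothetical base address for the reserved region.  Each
 * processor only ever touches its own slots, so no mutex is needed and no
 * other processor has a reason to hold a TLB entry for them.
 */
static uintptr_t
percpu_slot_va(uintptr_t base, unsigned cpu, unsigned slot)
{
	return (base +
	    ((uintptr_t)cpu * SKETCH_PAGES_PER_CPU + slot) * SKETCH_PAGE_SIZE);
}
```

With these assumptions, 8 processors come to 96k and 16 logical processors
(8 with ht) come to 192k.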

Cheers,
Jeff

>
> Cheers,
> -Peter
> --
> Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com
> "All of this is for nothing if we don't go to the stars" - JMS/B5
>