From owner-freebsd-smp  Tue Dec 10 19:49:54 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id TAA05901
          for smp-outgoing; Tue, 10 Dec 1996 19:49:54 -0800 (PST)
Received: from pat.idt.unit.no (0@pat.idt.unit.no [129.241.103.5])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id TAA05890
          for <smp@freebsd.org>; Tue, 10 Dec 1996 19:49:50 -0800 (PST)
Received: from idt.unit.no (tegge@ikke.idt.unit.no [129.241.111.65])
          by pat.idt.unit.no (8.8.4/8.8.4) with ESMTP
	  id EAA00310; Wed, 11 Dec 1996 04:49:01 +0100 (MET)
Message-Id: <199612110349.EAA00310@pat.idt.unit.no>
To: peter@spinner.dialix.com
Cc: smp@bluenose.na.tuns.ca, smp@freebsd.org
Subject: Re: More info about fatal trap 12 
In-Reply-To: Your message of "Sat, 07 Dec 1996 02:15:12 +0800"
References: <199612061815.CAA19205@spinner.DIALix.COM>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
X-Mailer: Mew version 1.06 on Emacs 19.33.1
Date: Wed, 11 Dec 1996 04:49:01 +0100
From: Tor Egge <Tor.Egge@idt.ntnu.no>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> Tor Egge wrote:
> > A closer examination of the kernel dump shows that the first page fault 
> > is from the user process /bin/sh. The call stack is
> [..]
> > The first access to the stack by the child process failed when trying 
> > to save the return value from fork.
> > 
> > The parent process was running on CPU #1, and the child process
> > was running on CPU #0.
> > 
> > - Tor Egge
> 
> Hmm!!  The plot thickens!  I noticed the failing pmap_enter was at
> 0xefbfd000 which is the first stack page already, but I wasn't sure
> if it was the initial creation, or if the stack had been paged out
> and was failing on pagein.

I applied the following diff to pmap.c

Index: pmap.c
===================================================================
RCS file: /export/akg1/smp-cvs/sys/i386/i386/pmap.c,v
retrieving revision 1.31
diff -c -r1.31 pmap.c
*** pmap.c	1996/12/03 05:51:12	1.31
--- pmap.c	1996/12/11 00:48:46
***************
*** 1982,1987 ****
--- 1982,1991 ----
  	vm_offset_t opa;
  	vm_offset_t origpte, newpte;
  	vm_page_t mpte;
+ 	volatile u_long old_cr3;
+ 	volatile u_long old_frame;
+ 	volatile u_long old_PTDpde;
+ 	volatile int old_cpunum;
  
  	if (pmap == NULL)
  		return;
***************
*** 2011,2016 ****
--- 2015,2024 ----
  			pmap->pm_pdir[PTDPTDI], va);
  	}
  
+ 	old_cr3 = rcr3();
+ 	old_frame = pmap->pm_pdir[PTDPTDI];
+ 	old_PTDpde = PTDpde;
+ 	old_cpunum = cpunumber();
  	origpte = *(vm_offset_t *)pte;
  	pa &= PG_FRAME;
  	opa = origpte & PG_FRAME;
------------

Afterwards, when looking at the kernel stack trace:
----
#0  boot (howto=256) at ../../kern/kern_shutdown.c:264
#1  0xe0112d69 in panic (fmt=0xe01bcd7f "page fault")
    at ../../kern/kern_shutdown.c:392
#2  0xe01bda65 in trap_fatal (frame=0xdfbffe4c) at ../../i386/i386/trap.c:747
#3  0xe01bd498 in trap_pfault (frame=0xdfbffe4c, usermode=0)
    at ../../i386/i386/trap.c:654
#4  0xe01bd0cb in trap (frame={tf_es = -453967856, tf_ds = 16, 
      tf_edi = -533289196, tf_esi = -541077504, tf_ebp = -541065552, 
      tf_isp = -541065612, tf_ebx = 86614016, tf_edx = -4194304, 
      tf_ecx = -528396, tf_eax = 0, tf_trapno = 12, tf_err = 0, 
      tf_eip = -535058445, tf_cs = 8, tf_eflags = 66050, tf_esp = -533683197, 
      tf_ss = -453959040}) at ../../i386/i386/trap.c:313
#5  0xe01ba7f3 in pmap_enter (pmap=0xe4ee0f64, va=3753889792, pa=86614016, 
    prot=7 '\a', wired=0) at ../../i386/i386/pmap.c:2022
#6  0xe01a41b3 in vm_fault (map=0xe4ee0f00, vaddr=3753889792, 
    fault_type=3 '\003', change_wiring=0) at ../../vm/vm_fault.c:773
#7  0xe01bd3f0 in trap_pfault (frame=0xdfbfffbc, usermode=1)
    at ../../i386/i386/trap.c:634
#8  0xe01bcf73 in trap (frame={tf_es = 39, tf_ds = 39, tf_edi = 352256, 
      tf_esi = 331156, tf_ebp = -541075036, tf_isp = -541065244, tf_ebx = 2, 
      tf_edx = 1, tf_ecx = -541075000, tf_eax = 0, tf_trapno = 12, tf_err = 7, 
      tf_eip = 45296, tf_cs = 31, tf_eflags = 66050, tf_esp = -541075060, 
      tf_ss = 39}) at ../../i386/i386/trap.c:241
#9  0xb0f0 in ?? ()
#10 0x63ab in ?? ()
#11 0x5ef0 in ?? ()
#12 0x7d01 in ?? ()
#13 0x7984 in ?? ()
#14 0x7754 in ?? ()
#15 0x60eb in ?? ()
#16 0x58e1 in ?? ()
#17 0xc11f in ?? ()
#18 0xc02e in ?? ()
#19 0x107e in ?? ()
(kgdb) up 5
#5  0xe01ba7f3 in pmap_enter (pmap=0xe4ee0f64, va=3753889792, pa=86614016, 
    prot=7 '\a', wired=0) at ../../i386/i386/pmap.c:2022
(kgdb) info locals
va = 3753889792
pa = 86614016
prot = 7 '\a'
pte = (unsigned int *) 0xfff7eff4
opa = 0
origpte = 3761678100
newpte = 0
mpte = (struct vm_page *) 0xe035f7c8
old_cr3 = 85966848
old_frame = 0
old_PTDpde = 85966883
old_cpunum = 0
(kgdb) print/x pmap->pm_pdir[0x37f]
$20 = 0x51fc023
----

This indicates that cr3 was correct, PTDpde was correct, but 
pmap->pm_pdir[PTDPTDI] evaluated to 0. This triggered the use of the
alternate page table memory area.

Later on, during the post mortem investigation, pmap->pm_pdir[PTDPTDI]
evaluates to the correct value. 

- Tor Egge
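
The decimal values in the kgdb session above are easier to compare in hex.
What follows is a minimal sketch, not code from pmap.c. It assumes 4 KB pages
and 4 MB per page-directory entry. PTDPTDI = 0x37f is taken from the
"print/x pmap->pm_pdir[0x37f]" line; the alternate slot 0x3ff (called APTDPTDI
here) is an assumption, chosen because it reproduces the pte address in the
dump. The helper names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PDR_SHIFT  22            /* one page-directory entry maps 4 MB */
#define PG_FRAME   0xfffff000u   /* frame bits of a PDE/PTE and of cr3 */
#define PTDPTDI    0x37fu        /* recursive slot, from the kgdb session */
#define APTDPTDI   0x3ffu        /* ASSUMPTION: slot of the alternate window */

/* Index of the page-directory entry that covers va. */
uint32_t pde_index(uint32_t va)
{
    return va >> PDR_SHIFT;
}

/* Virtual address of the PTE for va inside the window mapped at `slot`. */
uint32_t pte_vaddr(uint32_t slot, uint32_t va)
{
    return (slot << PDR_SHIFT) + (va >> PAGE_SHIFT) * (uint32_t)sizeof(uint32_t);
}

/* Nonzero when the recursive entry points back at the directory in cr3. */
int recursive_entry_matches_cr3(uint32_t ptdpde, uint32_t cr3)
{
    return (ptdpde & PG_FRAME) == (cr3 & PG_FRAME);
}
```

With va = 3753889792 (0xdfbfd000), pte_vaddr(APTDPTDI, va) gives 0xfff7eff4,
the pte value shown by "info locals", while pte_vaddr(PTDPTDI, va) would have
given 0xdff7eff4. old_PTDpde (0x51fc023) and old_cr3 (0x51fc000) share the
same frame, whereas old_frame = 0 does not, which is what the analysis above
describes.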

From owner-freebsd-smp  Tue Dec 10 21:23:33 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id VAA14969
          for smp-outgoing; Tue, 10 Dec 1996 21:23:33 -0800 (PST)
Received: from dyson.iquest.net (dyson.iquest.net [198.70.144.127])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id VAA14957
          for <smp@FreeBSD.org>; Tue, 10 Dec 1996 21:23:29 -0800 (PST)
Received: (from root@localhost) by dyson.iquest.net (8.8.2/8.6.9) id AAA00192; Wed, 11 Dec 1996 00:22:24 -0500 (EST)
From: "John S. Dyson" <toor@dyson.iquest.net>
Message-Id: <199612110522.AAA00192@dyson.iquest.net>
Subject: Re: More info about fatal trap 12
To: Tor.Egge@idt.ntnu.no (Tor Egge)
Date: Wed, 11 Dec 1996 00:22:24 -0500 (EST)
Cc: peter@spinner.dialix.com, smp@bluenose.na.tuns.ca, smp@FreeBSD.org
In-Reply-To: <199612110349.EAA00310@pat.idt.unit.no> from "Tor Egge" at Dec 11, 96 04:49:01 am
Reply-To: dyson@FreeBSD.org
X-Mailer: ELM [version 2.4 PL24 ME8]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

> 
> This indicates that cr3 was correct, PTDpde was correct, but 
> pmap->pm_pdir[PTDPTDI] evaluated to 0. This triggered the use of the
> alternate page table memory area.
> 
> Later on, during the post mortem investigation, pmap->pm_pdir[PTDPTDI]
> evaluates to the correct value. 
> 
I am not watching things extremely closely on this front, but it smells
like a missing pmap_update (or a defective one.)  -- just a shot in the
dark...  Still lusting after my 2nd CPU :-).

John

From owner-freebsd-smp  Wed Dec 11 13:39:16 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id NAA11345
          for smp-outgoing; Wed, 11 Dec 1996 13:39:16 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id NAA11338
          for freebsd-smp; Wed, 11 Dec 1996 13:39:14 -0800 (PST)
Date: Wed, 11 Dec 1996 13:39:14 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612112139.NAA11338@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/i386 mp_machdep.c sys/i386/include mpapic.h smptests.h sys/i386/isa icu.h vector.s
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/11 13:39:13

  Modified:    i386/i386  mp_machdep.c
  Log:
  fixed minor bug where SMP_INVLTLB needs icu.h,
  which was missing when "IMEN_NOT_MERGED" was defined.
  
  Revision  Changes    Path
  1.34      +5 -1      sys/i386/i386/mp_machdep.c

  Modified:    i386/include  mpapic.h smptests.h
  Log:
  made "IMEN_NOT_MERGED" official version of kernel.
  this is just to provide continuity in CVS tree.
  IMEN_NOT_MERGED is no longer used,
  and will be removed from source as soon as I am sure the imen/IOApicMask
  merge is not the cause of our remaining "fatal trap12" problem.
  
  Revision  Changes    Path
  1.8       +2 -1      sys/i386/include/mpapic.h
  1.6       +9 -1      sys/i386/include/smptests.h

  Modified:    i386/isa  icu.h vector.s
  Log:
  made "IMEN_NOT_MERGED" official version of kernel.
  this is just to provide continuity in CVS tree.
  IMEN_NOT_MERGED is no longer used,
  and will be removed from source as soon as I am sure the imen/IOApicMask
  merge is not the cause of our remaining "fatal trap12" problem.
  
  icu.h and vector.s didn't have conditional compilation of the pre-merged
  code, added to keep the CVS tree consistent (see above comments).
  
  fix a bug in vector.s where APIC_IO without IPI_INTS had a bogus trailing
  ',' in a variable list.  this caused an extra 0 to be added, spamming
  the swi* masks.
  
  Revision  Changes    Path
  1.10      +21 -1     sys/i386/isa/icu.h
  1.31      +27 -3     sys/i386/isa/vector.s

From owner-freebsd-smp  Wed Dec 11 15:15:30 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id PAA17184
          for smp-outgoing; Wed, 11 Dec 1996 15:15:30 -0800 (PST)
Received: from ormail.intel.com (ormail.intel.com [134.134.248.3])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id PAA17178
          for <freebsd-smp@freebsd.org>; Wed, 11 Dec 1996 15:15:27 -0800 (PST)
Received: from ichips.intel.com (ichips.intel.com [134.134.50.200]) by ormail.intel.com (8.8.4/8.7.3) with ESMTP id PAA23537 for <freebsd-smp@freebsd.org>; Wed, 11 Dec 1996 15:15:18 -0800 (PST)
Received: from pdxcs078.intel.com by ichips.intel.com (8.7.4/jIII)
	id PAA22149; Wed, 11 Dec 1996 15:12:54 -0800 (PST)
Received: by pdxcs078.intel.com (AIX 3.2/UCB 5.64/SW1.11) 
	id AA58904; Wed, 11 Dec 1996 15:15:21 -0800
Message-Id: <9612112315.AA58904@pdxcs078.intel.com>
To: freebsd-smp@freebsd.org
Subject: some questions concerning TLB shootdowns in FreeBSD
Date: Wed, 11 Dec 1996 15:15:21 -0800
From: Mike Haertel <haertel@ichips.intel.com>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I'm curious how/when people are doing TLB shootdowns.
Obviously when reducing permission or unmapping pages.
How about for manipulations of the dirty/accessed bits?
(Does FreeBSD use these?)

And how is the shootdown implemented?  One simple
method:

	interprocessor interrupt to everyone concerned
	everyone meets at barrier
	manipulate page tables
	everyone flushes appropriate TLB entries and resumes

Or is it done in some less conservative fashion?

Thanks,

	Mike
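
The four steps of the simple method can be modelled in a few lines of C. This
is a single-threaded toy model, not FreeBSD code and not a claim about how
FreeBSD does it: the "CPUs" are structs, the "IPI" is a flag, and the TLB has
one slot per page. All names and sizes (NCPU, NPAGES, cpu_model, translate,
shootdown) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define NCPU   4    /* hypothetical CPU count */
#define NPAGES 16   /* hypothetical size of the toy page table */

struct cpu_model {
    bool     ipi_pending;        /* step 1: interrupt posted by the initiator */
    bool     at_barrier;         /* step 2: CPU is parked until released */
    bool     tlb_valid[NPAGES];  /* one TLB slot per page keeps the model simple */
    uint32_t tlb_pte[NPAGES];
};

uint32_t page_table[NPAGES];
struct cpu_model cpus[NCPU];

/* Translate vpn on one CPU: use the cached entry if present, else walk and cache. */
uint32_t translate(int cpu, uint32_t vpn)
{
    struct cpu_model *c = &cpus[cpu];

    if (!c->tlb_valid[vpn]) {
        c->tlb_pte[vpn] = page_table[vpn];
        c->tlb_valid[vpn] = true;
    }
    return c->tlb_pte[vpn];
}

/* Conservative shootdown; returns how many CPUs were parked at the barrier. */
int shootdown(int initiator, uint32_t vpn, uint32_t new_pte)
{
    int parked = 0;

    for (int i = 0; i < NCPU; i++)        /* 1. IPI to everyone concerned */
        if (i != initiator)
            cpus[i].ipi_pending = true;

    for (int i = 0; i < NCPU; i++) {      /* 2. everyone meets at the barrier */
        if (cpus[i].ipi_pending) {
            cpus[i].ipi_pending = false;
            cpus[i].at_barrier = true;
            parked++;
        }
    }

    page_table[vpn] = new_pte;            /* 3. manipulate the page tables */

    for (int i = 0; i < NCPU; i++) {      /* 4. flush the entry and resume */
        cpus[i].tlb_valid[vpn] = false;
        cpus[i].at_barrier = false;
    }
    return parked;
}
```

In this model, writing page_table[] directly after every CPU has called
translate() leaves the old value visible through the cached entries; going
through shootdown() does not. The cost of the conservative scheme is also
visible: every other CPU is parked for the whole of step 3.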

From owner-freebsd-smp  Wed Dec 11 16:05:45 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id QAA21539
          for smp-outgoing; Wed, 11 Dec 1996 16:05:45 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id QAA21529
          for freebsd-smp; Wed, 11 Dec 1996 16:05:41 -0800 (PST)
Date: Wed, 11 Dec 1996 16:05:41 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612120005.QAA21529@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/isa vector.s
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/11 16:05:40

  Modified:    i386/isa  vector.s
  Log:
  folded (APIC_IO && APIC_LAZY) and non-APIC_IO versions of INTR() into 1 macro.
  
  Suggested by: Peter Wemm
  
  Revision  Changes    Path
  1.32      +47 -81    sys/i386/isa/vector.s

From owner-freebsd-smp  Wed Dec 11 16:52:10 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id QAA24796
          for smp-outgoing; Wed, 11 Dec 1996 16:52:10 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id QAA24789
          for freebsd-smp; Wed, 11 Dec 1996 16:52:08 -0800 (PST)
Date: Wed, 11 Dec 1996 16:52:08 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612120052.QAA24789@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/include smptests.h sys/i386/isa icu.s vector.s
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/11 16:52:07

  Modified:    i386/include  smptests.h
  Log:
  removed APIC_LAZY test.
  the actions of "APIC_LAZY" are now default for APIC_IO.
  
  Revision  Changes    Path
  1.7       +7 -32     sys/i386/include/smptests.h

  Modified:    i386/isa  icu.s vector.s
  Log:
  the actions of "APIC_LAZY" are now default for APIC_IO.
  
  Revision  Changes    Path
  1.20      +8 -12     sys/i386/isa/icu.s
  1.33      +32 -37    sys/i386/isa/vector.s

From owner-freebsd-smp  Thu Dec 12 00:44:24 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id AAA00289
          for smp-outgoing; Thu, 12 Dec 1996 00:44:24 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id AAA00276
          for freebsd-smp; Thu, 12 Dec 1996 00:44:22 -0800 (PST)
Date: Thu, 12 Dec 1996 00:44:22 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612120844.AAA00276@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/i386 mp_machdep.c mpapic.c pmap.c sys/i386/include apic.h mpapic.h smp.h sys/i386/isa icu.h vector.s
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/12 00:44:21

  Modified:    i386/i386  mp_machdep.c mpapic.c pmap.c
               i386/include  apic.h mpapic.h smp.h
               i386/isa  icu.h vector.s
  Log:
  another pass at preparing for multiple IO APICs.
  nothing more can be done till we go to the ">32 INT" model.
  code is bracketed with "MULTIPLE_IOAPICS".
  
  removed the BYTE register access macros for the IO APIC.
  it appears that long accesses work with all 'flavors' of IO APICS.
  
  Revision  Changes    Path
  1.35      +27 -39    sys/i386/i386/mp_machdep.c
  1.25      +57 -52    sys/i386/i386/mpapic.c
  1.32      +4 -10     sys/i386/i386/pmap.c
  1.16      +7 -12     sys/i386/include/apic.h
  1.9       +29 -21    sys/i386/include/mpapic.h
  1.26      +17 -19    sys/i386/include/smp.h
  1.11      +11 -7     sys/i386/isa/icu.h
  1.34      +4 -10     sys/i386/isa/vector.s

From owner-freebsd-smp  Thu Dec 12 01:53:04 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id BAA02731
          for smp-outgoing; Thu, 12 Dec 1996 01:53:04 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id BAA02724
          for freebsd-smp; Thu, 12 Dec 1996 01:53:03 -0800 (PST)
Date: Thu, 12 Dec 1996 01:53:03 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612120953.BAA02724@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/isa isa.c isa_device.h sio.c
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/12 01:53:03

  Modified:    i386/isa  isa.c isa_device.h sio.c
  Log:
  created icu_irq_pending(), a function which examines the 8259 IRQ pending
  bits.  this is needed by some device probes during boot, when the IO APIC
  is being used for actual INTerrupt service.
  
  code sio.c to use icu_irq_pending() during probe.
  
  Revision  Changes    Path
  1.13      +16 -1     sys/i386/isa/isa.c
  1.7       +3 -0      sys/i386/isa/isa_device.h
  1.13      +10 -20    sys/i386/isa/sio.c
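
For readers without the tree at hand, a function that examines the 8259
pending bits could look roughly like the sketch below. It is not the committed
icu_irq_pending(). It assumes the conventional PC command ports (0x20 master,
0xa0 slave) and the 8259A OCW3 value 0x0a, which makes the next read of the
command port return the interrupt request register. The port_ops indirection
and the mock_* helpers are hypothetical and exist only so the sketch can run
outside the kernel.

```c
#include <stdint.h>

#define ICU1_CMD      0x20   /* master 8259 command port */
#define ICU2_CMD      0xa0   /* slave 8259 command port */
#define OCW3_READ_IRR 0x0a   /* OCW3: next read of the command port returns IRR */

/* Port accessors are passed in so the sketch can run against a mock. */
struct port_ops {
    void    (*outb)(uint16_t port, uint8_t val);
    uint8_t (*inb)(uint16_t port);
};

/* Return a 16-bit mask of requested but unserviced IRQs; bit n is IRQ n. */
uint16_t irq_pending_sketch(const struct port_ops *io)
{
    uint8_t irr1, irr2;

    io->outb(ICU1_CMD, OCW3_READ_IRR);
    irr1 = io->inb(ICU1_CMD);
    io->outb(ICU2_CMD, OCW3_READ_IRR);
    irr2 = io->inb(ICU2_CMD);
    return (uint16_t)((irr2 << 8) | irr1);
}

/* Mock pair of 8259s: index 0 is the master, index 1 the slave. */
uint8_t mock_irr[2];
uint8_t mock_last_cmd[2];

void mock_outb(uint16_t port, uint8_t val)
{
    mock_last_cmd[port == ICU2_CMD] = val;
}

uint8_t mock_inb(uint16_t port)
{
    int chip = (port == ICU2_CMD);

    /* The IRR is only returned after it was selected with OCW3. */
    return mock_last_cmd[chip] == OCW3_READ_IRR ? mock_irr[chip] : 0;
}
```

A probe in the style described above would mask interrupts, poke the device,
and then test its own IRQ bit in the returned mask, for example
(mask & (1 << 4)) for a serial port on IRQ 4.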

From owner-freebsd-smp  Thu Dec 12 02:13:04 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id CAA03842
          for smp-outgoing; Thu, 12 Dec 1996 02:13:04 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id CAA03835
          for freebsd-smp; Thu, 12 Dec 1996 02:13:01 -0800 (PST)
Date: Thu, 12 Dec 1996 02:13:01 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612121013.CAA03835@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/i386 mpboot.s
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/12 02:13:01

  Modified:    i386/i386  mpboot.s
  Log:
  removed an old set of unneeded nops.
  one more "FIXME" gone!
  
  Revision  Changes    Path
  1.17      +2 -7      sys/i386/i386/mpboot.s

From owner-freebsd-smp  Thu Dec 12 02:38:36 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id CAA04706
          for smp-outgoing; Thu, 12 Dec 1996 02:38:36 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id CAA04698
          for freebsd-smp; Thu, 12 Dec 1996 02:38:34 -0800 (PST)
Date: Thu, 12 Dec 1996 02:38:34 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612121038.CAA04698@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/i386 swtch.s
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/12 02:38:33

  Modified:    i386/i386  swtch.s
  Log:
  do a proper r/m/w of APIC TPR on way out of cpu_switch().
  
  Revision  Changes    Path
  1.31      +7 -4      sys/i386/i386/swtch.s
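
The change itself is in assembly and is not shown here. As an illustration
only, a read/modify/write of a task priority register in C might look like the
sketch below. It assumes the priority occupies the low 8 bits of the register
and that the remaining bits must be written back unchanged; the function name
and the tpr pointer are hypothetical.

```c
#include <stdint.h>

#define APIC_TPR_PRIO 0x000000ffu   /* ASSUMPTION: priority field, bits 7:0 */

/* Replace only the priority field and keep every other bit as read. */
void tpr_set_priority(volatile uint32_t *tpr, uint8_t prio)
{
    uint32_t v = *tpr;       /* read */

    v &= ~APIC_TPR_PRIO;     /* modify: clear the old priority */
    v |= prio;               /*         and install the new one */
    *tpr = v;                /* write */
}
```

A plain store of the new priority would instead zero whatever the upper bits
held, which is the difference a "proper" r/m/w is meant to avoid.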

From owner-freebsd-smp  Thu Dec 12 07:03:25 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id HAA15825
          for smp-outgoing; Thu, 12 Dec 1996 07:03:25 -0800 (PST)
Received: from spinner.DIALix.COM (root@spinner.DIALix.COM [192.203.228.67])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id HAA15816;
          Thu, 12 Dec 1996 07:03:08 -0800 (PST)
Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1])
          by spinner.DIALix.COM (8.8.4/8.8.4) with ESMTP id XAA03371;
          Thu, 12 Dec 1996 23:02:58 +0800 (WST)
Message-Id: <199612121502.XAA03371@spinner.DIALix.COM>
To: Steve Passe <fsmp@freefall.freebsd.org>
cc: freebsd-smp@freefall.freebsd.org
Subject: Re: cvs commit: sys/i386/isa isa.c isa_device.h sio.c 
In-reply-to: Your message of "Thu, 12 Dec 1996 01:53:03 PST."
             <199612120953.BAA02724@freefall.freebsd.org> 
Date: Thu, 12 Dec 1996 23:02:58 +0800
From: Peter Wemm <peter@spinner.dialix.com>
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Steve Passe wrote:
> fsmp        96/12/12 01:53:03
> 
>   Modified:    i386/isa  isa.c isa_device.h sio.c
>   Log:
>   created icu_irq_pending(), a function which examines the 8259 IRQ pending
>   bits.  this is needed by some device probes during boot, when the IO APIC
>   is being used for actual INTerrupt service.
>   
>   code sio.c to use icu_irq_pending() during probe.

Umm, silly question I guess, but does this code take LOWPRI delivery mode
into account?  If you're looking on the local apic, you won't see the
pending interrupt if it's been sent to a different cpu....  But I guess this
should be fine during boot though.  (no, I've not read the code, I've just
skimmed 3000 email messages and am about to start on a second pass :-] )

Cheers,
-Peter

From owner-freebsd-smp  Thu Dec 12 10:39:09 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id KAA27193
          for smp-outgoing; Thu, 12 Dec 1996 10:39:09 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id KAA27188
          for <freebsd-smp@freefall.freebsd.org>; Thu, 12 Dec 1996 10:39:06 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id LAA02012; Thu, 12 Dec 1996 11:38:46 -0700
Message-Id: <199612121838.LAA02012@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe <smp@csn.net>
To: Peter Wemm <peter@spinner.dialix.com>
cc: freebsd-smp@freefall.freebsd.org
Subject: Re: cvs commit: sys/i386/isa isa.c isa_device.h sio.c 
In-reply-to: Your message of "Thu, 12 Dec 1996 23:02:58 +0800."
             <199612121502.XAA03371@spinner.DIALix.COM> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 12 Dec 1996 11:38:46 -0700
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Hi,
> Steve Passe wrote:
> > fsmp        96/12/12 01:53:03
> > 
> >   Modified:    i386/isa  isa.c isa_device.h sio.c
> >   Log:
> >   created icu_irq_pending(), a function which examines the 8259 IRQ pending
> >   bits.  this is needed by some device probes during boot, when the IO APIC
> >   is being used for actual INTerrupt service.
> >   
> >   code sio.c to use icu_irq_pending() during probe.
> 
> Umm, silly question I guess, but does this code take LOWPRI delivery mode
> into account?  If you're looking on the local apic, you won't see the
> pending interrupt if it's been sent to a different cpu....  But I guess this
> should be fine during boot though.  (no, I've not read the code, I've just

it's just for boot.  the sio probe presumes all INTs are masked, then
tickles the sio in a way that should leave a pending INT.  no harm in
letting it do that, it appears to work fine.  the IO APIC isn't ready for
use at this point, and given the complexity of the APIC exchanges it
wouldn't make sense to use them for this anyway.


--
Steve Passe	| powered by
smp@csn.net	|            FreeBSD


From owner-freebsd-smp  Fri Dec 13 13:04:19 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id NAA12477
          for smp-outgoing; Fri, 13 Dec 1996 13:04:19 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id NAA12468
          for freebsd-smp; Fri, 13 Dec 1996 13:04:17 -0800 (PST)
Date: Fri, 13 Dec 1996 13:04:17 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612132104.NAA12468@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/isa isa.c
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/13 13:04:16

  Modified:    i386/isa  isa.c
  Log:
  removed an unneeded temp, fixing another 'FIXME'.
  
  Revision  Changes    Path
  1.14      +4 -8      sys/i386/isa/isa.c

From owner-freebsd-smp  Fri Dec 13 14:36:06 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id OAA22678
          for smp-outgoing; Fri, 13 Dec 1996 14:36:06 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id OAA22671
          for freebsd-smp; Fri, 13 Dec 1996 14:36:04 -0800 (PST)
Date: Fri, 13 Dec 1996 14:36:04 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612132236.OAA22671@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/include spl.h sys/i386/isa icu.h vector.s
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/13 14:36:03

  Modified:    i386/include  spl.h
  Log:
  use same imask layout for both IPI_INTS and non-IPI_INTS.
  
  Revision  Changes    Path
  1.8       +3 -3      sys/i386/include/spl.h

  Modified:    i386/isa  icu.h vector.s
  Log:
  use same imask layout for both IPI_INTS and non-IPI_INTS.
  
  removed unnecessary "FIXME:" message from icu.h.
  
  Revision  Changes    Path
  1.12      +0 -1      sys/i386/isa/icu.h
  1.35      +33 -35    sys/i386/isa/vector.s

From owner-freebsd-smp  Fri Dec 13 15:02:56 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id PAA23998
          for smp-outgoing; Fri, 13 Dec 1996 15:02:56 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id PAA23990
          for freebsd-smp; Fri, 13 Dec 1996 15:02:54 -0800 (PST)
Date: Fri, 13 Dec 1996 15:02:54 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612132302.PAA23990@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/isa vector.s
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/13 15:02:54

  Modified:    i386/isa  vector.s
  Log:
  fix a bug just introduced in the last commit of vector.s.
  my defines for defaulting NCPU were broken, now fixed & tested.
  
  Revision  Changes    Path
  1.36      +2 -3      sys/i386/isa/vector.s

From owner-freebsd-smp  Fri Dec 13 15:47:55 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id PAA25251
          for smp-outgoing; Fri, 13 Dec 1996 15:47:55 -0800 (PST)
Received: from uruk.org (root@faustus.dev.com [198.145.95.253])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id PAA25243
          for <smp@freebsd.org>; Fri, 13 Dec 1996 15:47:50 -0800 (PST)
Received: from erich by uruk.org with local (Exim 0.53 #1)
	id E0vYiIU-0002rK-00; Fri, 13 Dec 1996 16:49:46 -0800
To: smp@freebsd.org
Subject: Tried SMP kernel from early morning CVS tree
Message-Id: <E0vYiIU-0002rK-00@uruk.org>
From: "Erich Boleyn,,,," <erich@uruk.org>
Date: Fri, 13 Dec 1996 16:49:46 -0800
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk


Hi all.  I tried the SMP kernel from the CVS tree from early this
morning.  It still has the problem on my 4-CPU Pentium Pro test box
where a long compile kills it by getting a kernel page fault in
pmap_enter.

There was one real bug (typo?) in "i386/isa/if_ze.c".  There was,
when using "APIC_IO", an undefined reference to "readIOApic24()" (I
think that's what it was), which was bracketed by "#if defined(APIC_IO)"
preprocessor stuff.  After looking in some other files, they were
using "INTRGET()" in the same way, so I just put it in place, and
everything appears to work fine (though I'm not using that driver).

I always compile the generic kernel + SMP stuff.

Erich Boleyn
<erich@uruk.org>


From owner-freebsd-smp  Fri Dec 13 16:56:05 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id QAA28478
          for smp-outgoing; Fri, 13 Dec 1996 16:56:05 -0800 (PST)
Received: (from fsmp@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id QAA28471
          for freebsd-smp; Fri, 13 Dec 1996 16:56:03 -0800 (PST)
Date: Fri, 13 Dec 1996 16:56:03 -0800 (PST)
From: Steve Passe <fsmp>
Message-Id: <199612140056.QAA28471@freefall.freebsd.org>
To: freebsd-smp
Subject: cvs commit:  sys/i386/isa if_ze.c
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

fsmp        96/12/13 16:56:02

  Modified:    i386/isa  if_ze.c
  Log:
  fixed a bug introduced by new "MULTIPLE_IOAPICS" code.
  
  Submitted by: "Erich Boleyn,,,," <erich@uruk.org>
  
  Revision  Changes    Path
  1.8       +2 -2      sys/i386/isa/if_ze.c

From owner-freebsd-smp  Fri Dec 13 17:08:36 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id RAA28762
          for smp-outgoing; Fri, 13 Dec 1996 17:08:36 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id RAA28755
          for <smp@freebsd.org>; Fri, 13 Dec 1996 17:08:25 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id SAA09847; Fri, 13 Dec 1996 18:08:10 -0700
Message-Id: <199612140108.SAA09847@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe <smp@csn.net>
To: "Erich Boleyn,,,," <erich@uruk.org>
cc: smp@freebsd.org
Subject: Re: Tried SMP kernel from early morning CVS tree 
In-reply-to: Your message of "Fri, 13 Dec 1996 16:49:46 PST."
             <E0vYiIU-0002rK-00@uruk.org> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 13 Dec 1996 18:08:09 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> Hi all.  I tried the SMP kernel from the CVS tree from early this
> morning.  It still has the problem on my 4-CPU Pentium Pro test box
> where a long compile kills it by getting a kernel page fault in
> pmap_enter.

I wish I knew how to help with this, but I have neither a P6 machine nor
skills/knowledge in this area...  I've been taking advantage of the lull to
go thru my code and clean up a lot of little details that have fallen
thru the cracks.

---
> There was one real bug (typo?) in "i386/isa/if_ze.c".  There was,
> when using "APIC_IO", an undefined reference to "readIOApic24()" (I
> think that's what it was), which was bracketed by "#if defined(APIC_IO)"
> preprocessor stuff.  After looking in some other files, they were
> using "INTRGET()" in the same way, so I just put it in place, and
> everything appears to work fine (though I'm not using that driver).

thanx, your fix looks correct, I just committed it on freefall.

--
Steve Passe	| powered by
smp@csn.net	|            FreeBSD


From owner-freebsd-smp  Fri Dec 13 17:46:02 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id RAA00543
          for smp-outgoing; Fri, 13 Dec 1996 17:46:02 -0800 (PST)
Received: from spinner.DIALix.COM (root@spinner.DIALix.COM [192.203.228.67])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id RAA00530
          for <smp@freebsd.org>; Fri, 13 Dec 1996 17:45:53 -0800 (PST)
Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1])
          by spinner.DIALix.COM (8.8.4/8.8.4) with ESMTP id JAA13805;
          Sat, 14 Dec 1996 09:45:17 +0800 (WST)
Message-Id: <199612140145.JAA13805@spinner.DIALix.COM>
To: Steve Passe <smp@csn.net>
cc: "Erich Boleyn,,,," <erich@uruk.org>, smp@freebsd.org
Subject: Re: Tried SMP kernel from early morning CVS tree 
In-reply-to: Your message of "Fri, 13 Dec 1996 18:08:09 MST."
             <199612140108.SAA09847@clem.systemsix.com> 
Date: Sat, 14 Dec 1996 09:45:16 +0800
From: Peter Wemm <peter@spinner.dialix.com>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Steve Passe wrote:
> Hi,
> 
> > Hi all.  I tried the SMP kernel from the CVS tree from early this
> > morning.  It still has the problem on my 4-CPU Pentium Pro test box
> > where a long compile kills it by getting a kernel page fault in
> > pmap_enter.
> 
> I wish I knew how to help with this, but I have neither a P6 machine nor
> skills/knowledge in this area...  I've been taking advantage of the lull to
> go thru my code and clean up a lot of little details that have fallen
> thru the cracks.

Same here, but I've been out of action for different reasons (like:
doing some final work on a new house and preparing to move).

There were some good details posted on this problem a few days ago
from the other person with the P6 system; there is probably a good
clue in there.  My initial reaction to the details was that it
almost looked like both CPUs accessed a shared data structure at
nearly the same time, which should be impossible due to the locking.
I can't imagine why this might be happening yet, but I must re-examine
that part of the code.  An extra local TLB flush might help, but
I'm not 100% sure yet.

On the other hand:
% uname -a
FreeBSD spinner.DIALix.COM 3.0-SMP FreeBSD 3.0-SMP #154: Thu Dec  5 02:26:10 WST 1996     peter@spinner.DIALix.COM:/home/peter/smp/sys/compile/SMP  i386
% uptime
 9:38AM  up 9 days,  7:04, 12 users, load averages: 0.00, 0.01, 0.00

This machine has had the stuffing hammered out of it over the last
week; I'm really happy with the stability of the P5 systems apart
from the floating-point problems.  This kernel was built with
APIC_IO+APIC_LAZY+SMP_INVLTLB, and I've had none of the subtle
corruption problems that I used to have before SMP_INVLTLB.  It's
been easy to forget that it's running SMP most of the time. :-)

Cheers,
-Peter

From owner-freebsd-smp  Fri Dec 13 18:33:56 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id SAA01904
          for smp-outgoing; Fri, 13 Dec 1996 18:33:56 -0800 (PST)
Received: from bluenose.na.tuns.ca (bluenose.na.tuns.ca [134.190.50.156])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id SAA01899
          for <smp@freebsd.org>; Fri, 13 Dec 1996 18:33:53 -0800 (PST)
Received: (from smp@localhost) by bluenose.na.tuns.ca (8.7.6/8.7.3) id WAA19725; Fri, 13 Dec 1996 22:36:14 -0400 (AST)
From: "J.M. Chuang"  <smp@bluenose.na.tuns.ca>
Message-Id: <199612140236.WAA19725@bluenose.na.tuns.ca>
Subject: Re: Tried SMP kernel from early morning CVS tree
To: peter@spinner.dialix.com (Peter Wemm)
Date: Fri, 13 Dec 1996 22:36:14 -0400 (AST)
Cc: smp@freebsd.org
In-Reply-To: <199612140145.JAA13805@spinner.DIALix.COM> from Peter Wemm at "Dec 14, 96 09:45:16 am"
X-Mailer: ELM [version 2.4ME+ PL13 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> > > Hi all.  I tried the SMP kernel from the CVS tree from early this
> > > morning.  It still has the problem on my 4-CPU Pentium Pro test box
> > > where a long compile kills it by getting a kernel page fault in
> > > pmap_enter.
> > 
> > I wish I knew how to help with this, but I have neither a P6 machine or
> > skills/knowledge in this area...  I've been taking advantage of the lull to
> > go thru my code and cleanup a lot of little details that have fallen
> > thru the cracks.
> 
> Same here, but I've been out of action for different reasons (like:
> doing some final work on a new house and preparing to move).
> 
> There were some good details posted on this problem a few days ago
> from the other person with the P6 system, there is probably a good
> clue in there.  My initial reaction to the details was that it
> almost looked like both cpu's accessed a shared data structure at
> nearly the same time, which should be impossible due to the locking.
> I can't imagine why this might be happening yet, but I must re-examine
> that part of the code.  An extra local tlb flush might help, but
> I'm not 100% sure yet.

How does one do an extra local TLB flush?

I found that if a dual P6 is booted from an IDE drive with the current
smp-kernel + SMP_INVLTLB, core dumps and sig11 still show up right after the
second CPU is activated.  This is very similar to the problem of a dual P5
booted from IDE with the current smp-kernel without SMP_INVLTLB.

Could this IDE problem for P6 be related to trap 12?

Jim

From owner-freebsd-smp  Fri Dec 13 22:39:09 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id WAA14052
          for smp-outgoing; Fri, 13 Dec 1996 22:39:09 -0800 (PST)
Received: from uruk.org (root@faustus.dev.com [198.145.95.253])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id WAA14047
          for <smp@freebsd.org>; Fri, 13 Dec 1996 22:39:06 -0800 (PST)
Received: from uruk.org [127.0.0.1] (erich)
	by uruk.org with esmtp (Exim 0.53 #1)
	id E0vYohv-0003dK-00; Fri, 13 Dec 1996 23:40:27 -0800
To: Peter Wemm <peter@spinner.dialix.com>
cc: smp@freebsd.org
Subject: TLB shootdown problems? (was -> Re: Tried SMP kernel from early morning CVS tree )
In-reply-to: Your message of "Sat, 14 Dec 1996 09:45:16 +0800."
             <199612140145.JAA13805@spinner.DIALix.COM> 
Date: Fri, 13 Dec 1996 23:40:27 -0800
From: Erich Boleyn <erich@uruk.org>
Message-Id: <E0vYohv-0003dK-00@uruk.org>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk


Peter Wemm <peter@spinner.DIALix.COM> writes:

> There were some good details posted on this problem a few days ago
> from the other person with the P6 system, there is probably a good
> clue in there.  My initial reaction to the details was that it
> almost looked like both cpu's accessed a shared data structure at
> nearly the same time, which should be impossible due to the locking.
> I can't imagine why this might be happening yet, but I must re-examine
> that part of the code.  An extra local tlb flush might help, but
> I'm not 100% sure yet.

Here's a question (I'm going to look this up myself, but thought it'd
be worthwhile to see if you'd shed light on it before I get to it on
my copious spare time ;-) ...

How exactly are TLB shootdown IPIs implemented?  (or are they any
different from any other IPIs?)

>From what I could see, it looks like the IPI is considered "finished"
(and the function returns) when the APIC status is "delivered".  This
could be a problem, because the interrupt doesn't necessarily happen
on the other CPU at that point (and it certainly isn't completed at
that point).  You really need some other mechanism to tell you that
the operation has completed before you can continue.

This might not be as major a problem on the P5, for implementation reasons
and because of its shorter pipeline.  The P6 has deeper pipelines and is much
faster relative to the external bus clock (which the APICs use).

--
  Erich Stefan Boleyn                 \_ E-mail (preferred):  <erich@uruk.org>
Mad Genius wanna-be, CyberMuffin        \__      (finger me for other stats)
Web:  http://www.uruk.org/~erich/     Motto: "I'll live forever or die trying"

From owner-freebsd-smp  Fri Dec 13 23:06:25 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id XAA14856
          for smp-outgoing; Fri, 13 Dec 1996 23:06:25 -0800 (PST)
Received: from root.com (implode.root.com [198.145.90.17])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id XAA14851
          for <freebsd-smp@freebsd.org>; Fri, 13 Dec 1996 23:06:23 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by root.com (8.7.6/8.6.5) with SMTP id XAA00954; Fri, 13 Dec 1996 23:05:04 -0800 (PST)
Message-Id: <199612140705.XAA00954@root.com>
X-Authentication-Warning: implode.root.com: Host localhost [127.0.0.1] didn't use HELO protocol
To: Mike Haertel <haertel@ichips.intel.com>
cc: freebsd-smp@freebsd.org
Subject: Re: some questions concerning TLB shootdowns in FreeBSD 
In-reply-to: Your message of "Wed, 11 Dec 1996 15:15:21 PST."
             <9612112315.AA58904@pdxcs078.intel.com> 
From: David Greenman <dg@root.com>
Reply-To: dg@root.com
Date: Fri, 13 Dec 1996 23:05:04 -0800
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>I'm curious how/when people are doing TLB shootdowns.
>Obviously when reducing permission or unmapping pages.
>How about for manipulations of the dirty/accessed bits?
>(Does FreeBSD use these?)

   Speaking of the uni-processor case, FreeBSD does the access/modify bit
changes in the pmap_changebit() function which does a TLB flush if anything
is actually changed.
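
A minimal sketch of that "flush only if something changed" pattern, as a
user-space illustration rather than the actual pmap code.  The names
changebit_sketch(), fake_tlb_flush() and tlb_flush_count are hypothetical;
PG_M and PG_A are the i386 PTE modified and accessed bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PG_A 0x020u   /* i386 PTE "accessed" bit */
#define PG_M 0x040u   /* i386 PTE "modified" (dirty) bit */

static unsigned tlb_flush_count;   /* hypothetical stand-in for a real flush */

static void fake_tlb_flush(void)
{
    tlb_flush_count++;
}

/*
 * Set or clear `bit` in each of the `n` PTEs.  The PTEs are updated
 * first and the TLB is flushed afterwards, and only if at least one
 * entry actually changed.
 */
static bool changebit_sketch(uint32_t *ptes, size_t n, uint32_t bit, bool set)
{
    bool changed = false;

    for (size_t i = 0; i < n; i++) {
        uint32_t old = ptes[i];
        uint32_t updated = set ? (old | bit) : (old & ~bit);

        if (updated != old) {
            ptes[i] = updated;
            changed = true;
        }
    }
    if (changed)
        fake_tlb_flush();
    return changed;
}
```

Clearing PG_M on a set of entries that are already clean returns false and
leaves tlb_flush_count untouched, which is the saving being described.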

-DG

David Greenman
Core-team/Principal Architect, The FreeBSD Project

From owner-freebsd-smp  Sat Dec 14 02:46:24 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id CAA21509
          for smp-outgoing; Sat, 14 Dec 1996 02:46:24 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id CAA21504
          for <smp@freebsd.org>; Sat, 14 Dec 1996 02:46:21 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id DAA14527; Sat, 14 Dec 1996 03:44:43 -0700
Message-Id: <199612141044.DAA14527@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe <smp@csn.net>
To: Erich Boleyn <erich@uruk.org>
cc: Peter Wemm <peter@spinner.dialix.com>, haertel@ichips.intel.com,
        smp@freebsd.org
Subject: Re: TLB shootdown problems? (was -> Re: Tried SMP kernel from early 
 morning CVS tree ) 
In-reply-to: Your message of "Fri, 13 Dec 1996 23:40:27 PST."
             <E0vYohv-0003dK-00@uruk.org> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 14 Dec 1996 03:44:43 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> Here's a question (I'm going to look this up myself, but thought it'd
> be worthwhile to see if you'd shed light on it before I get to it on
> my copious spare time ;-) ...
> 
> How exactly are TLB shootdown IPIs implemented?  (or are they any
> different from any other IPIs?)
> 
> >From what I could see, it looks like the IPI is considered "finished"
> (and the function returns) when the APIC status is "delivered".  This
> could be a problem, because the interrupt doesn't necessarily happen
> on the other CPU at that point (and it certainly isn't completed at
> that point).  You really need some other mechanism to tell you that
> the operation has completed before you can continue.

This is an accurate picture of the current situation.  We just send it and
"assume" that things are now 'OK'.  We know this isn't correct, it's just step
one on the way there.  It made a remarkable improvement on the P5 machines.
So I guess the next step is a rendezvous mechanism to control this.
If anyone could suggest an effective algorithm for it I could take a whack
at programming it.

--
Steve Passe	| powered by
smp@csn.net	|            FreeBSD


From owner-freebsd-smp  Sat Dec 14 07:04:04 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id HAA05521
          for smp-outgoing; Sat, 14 Dec 1996 07:04:04 -0800 (PST)
Received: from spinner.DIALix.COM (root@spinner.DIALix.COM [192.203.228.67])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id HAA05483
          for <smp@freebsd.org>; Sat, 14 Dec 1996 07:03:59 -0800 (PST)
Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1])
          by spinner.DIALix.COM (8.8.4/8.8.4) with ESMTP id XAA17454;
          Sat, 14 Dec 1996 23:03:51 +0800 (WST)
Message-Id: <199612141503.XAA17454@spinner.DIALix.COM>
To: smp@freebsd.org
cc: Mike Haertel <haertel@ichips.intel.com>
Subject: Re: some questions concerning TLB shootdowns in FreeBSD 
In-reply-to: Your message of "Fri, 13 Dec 1996 23:05:04 PST."
             <199612140705.XAA00954@root.com> 
Date: Sat, 14 Dec 1996 23:03:51 +0800
From: Peter Wemm <peter@spinner.dialix.com>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

David Greenman wrote:
> >I'm curious how/when people are doing TLB shootdowns.
> >Obviously when reducing permission or unmapping pages.
> >How about for manipulations of the dirty/accessed bits?
> >(Does FreeBSD use these?)
> 
>    Speaking of the uni-processor case, FreeBSD does the access/modify bit
> changes in the pmap_changebit() function which does a TLB flush if anything
> is actually changed.
> 
> -DG
> 
> David Greenman
> Core-team/Principal Architect, The FreeBSD Project

Also, since I wrote the initial TLB flushing on top of Steve's IPI code,
I can freely admit that what I wrote is pretty sub-standard and does
not go far enough.

There are a couple of major shortfalls:
1: It's async.  It does not synchronise the remote processors as it must do,
or they can get out of sync, slave processors can do updates on stale
data, etc.
2: It does too much work.  There are a lot of cases where a global flush
is done for the local user process on the local cpu.   I am not 100% sure
whether this is needed or not.  I can imagine that APTD accesses might
present a problem if we try to avoid global flushes here.
3: We have no way of doing a local-only tlb flush with the trivial hack
that I did to test the theory.
4: the CADDR/APTD hacks are potential problems from many angles.  As long
as we only have one cpu in the kernel "proper" at present, this shouldn't
be too much of an issue yet - but it will be one of the things waiting to
bite us later on.

There was the query about the possibility of speculative execution
on the PPro being the problem that is breaking the kernel. The
scenario sounds plausible, but my initial reaction to that was that
we are doing this from an _interrupt handler_, and I would be very
surprised if speculative execution from the original code thread
isn't wound up before going into the interrupt...  If not, do we
need some strategic nop's?

I'm still digesting it,  I am almost worried that we might (shudder!)
be forced into doing an IPI to stop all the cpu's *before* the
current cpu changes the page tables, then letting them do the tlb
flush and letting them proceed.  If this actually is a real problem
this means a much bigger code impact.

Cheers,
-Peter

From owner-freebsd-smp  Sat Dec 14 09:19:02 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id JAA16823
          for smp-outgoing; Sat, 14 Dec 1996 09:19:02 -0800 (PST)
Received: from uruk.org (root@faustus.dev.com [198.145.95.253])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id JAA16812
          for <smp@freebsd.org>; Sat, 14 Dec 1996 09:18:57 -0800 (PST)
Received: from uruk.org [127.0.0.1] (erich)
	by uruk.org with esmtp (Exim 0.53 #1)
	id E0vYyUV-0004qa-00; Sat, 14 Dec 1996 10:07:15 -0800
To: Steve Passe <smp@csn.net>
cc: peter@spinner.dialix.com, haertel@ichips.intel.com, smp@freebsd.org
Subject: Re: TLB shootdown problems? (was -> Re: Tried SMP kernel from early morning CVS tree ) 
In-reply-to: Your message of "Sat, 14 Dec 1996 03:44:43 MST."
             <199612141044.DAA14527@clem.systemsix.com> 
Date: Sat, 14 Dec 1996 10:07:15 -0800
From: Erich Boleyn <erich@uruk.org>
Message-Id: <E0vYyUV-0004qa-00@uruk.org>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk


Steve Passe <smp@csn.net> writes:

> Hi,
> 
> > Here's a question (I'm going to look this up myself, but thought it'd
> > be worthwhile to see if you'd shed light on it before I get to it on
> > my copious spare time ;-) ...
> > 
> > How exactly are TLB shootdown IPIs implemented?  (or are they any
> > different from any other IPIs?)
> > 
> > >From what I could see, it looks like the IPI is considered "finished"
> > (and the function returns) when the APIC status is "delivered".  This
> > could be a problem, because the interrupt doesn't necessarily happen
> > on the other CPU at that point (and it certainly isn't completed at
> > that point).  You really need some other mechanism to tell you that
> > the operation has completed before you can continue.
> 
> this is an accurate picture of the current situation.  we just send it and
> "assumme" that things are now 'OK'.  We know this isn't correct, its just
> step one on the way there.  It made remarkable improvement on the P5
> machines.  So I guess the next step is a rendezvous mechanism to control
> this.  If anyone could suggest an effective algorithm for it I could take
> whack at programming it.

Yes, that was what I thought.

The easiest (and maybe best performing) thing to do is have the sender spin
waiting on bits being twiddled in global memory, then have the target CPUs'
IPI handlers do such twiddling.

The real question at this point is:  can only one TLB shootdown be
in progress at any one time?  If so, a good example to look at is
Linux-SMP:

Linux-SMP has a bitwise (since SMP-capable x86es have bitwise test and
test-and-set operators) mask "smp_invalidate_needed".  There is one bit
for each CPU.  When an invalidate is needed on a particular CPU, the
corresponding bit is set atomically.  Whenever a TLB invalidate is made
on a particular CPU, the corresponding bit is unset atomically.  There
are ways to play with that so not all CPUs need be sent messages all the
time, plus Linux-SMP does TLB invalidates in its global spinlock, etc.
It also doesn't necessarily need to try to send the "smp_invalidate"
message right after the pmap change, just when it expects to need to
see it locally or globally...  this allows time in which other CPUs could
do invalidates.  This kind of thing would provide a moderate base on which
to make it more fine-grained over time.

A simple version which could use the same mechanism would be to have the
IPI handler do the right thing, but just have the "smp_invalidate"
message set all the "smp_invalidate_needed" bits (except our own!) for
now, to get everything working.  Avoiding setting bits for CPUs which
don't need invalidates could be done later this way without changing
the reception mechanism at all.
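
The simple version above can be sketched roughly as follows.  This is a
single-threaded illustration using C11 atomics, not Linux-SMP or FreeBSD
code: NCPU, tlb_flushes[] and the three function names are assumptions made
for the example, and the actual IPI send is left out.

```c
#include <assert.h>
#include <stdatomic.h>

#define NCPU 4                              /* assumed number of CPUs */

static atomic_uint smp_invalidate_needed;   /* bit n set: CPU n must flush */
static unsigned    tlb_flushes[NCPU];       /* stand-in for real TLB flushes */

/*
 * Sender side: mark every CPU except ourselves as needing a flush.
 * A real kernel would send the "smp_invalidate" IPI right after this.
 */
static void smp_invalidate_request(unsigned self)
{
    unsigned all = (1u << NCPU) - 1u;

    assert(self < NCPU);
    atomic_fetch_or(&smp_invalidate_needed, all & ~(1u << self));
}

/*
 * Target side, called from the IPI handler on `cpu`: flush first,
 * then clear our bit so the sender can see the acknowledgement.
 */
static void smp_invalidate_ipi_handler(unsigned cpu)
{
    assert(cpu < NCPU);
    tlb_flushes[cpu]++;                     /* real code would flush the TLB */
    atomic_fetch_and(&smp_invalidate_needed, ~(1u << cpu));
}

/* Sender side: spin until every target has cleared its bit. */
static void smp_invalidate_wait(void)
{
    while (atomic_load(&smp_invalidate_needed) != 0)
        ;                                   /* busy-wait on global memory */
}
```

With four CPUs, a request from CPU 0 leaves the mask at 0xE, and
smp_invalidate_wait() only returns once CPUs 1, 2 and 3 have each run the
handler.  Restricting the request to a subset of CPUs later changes only
smp_invalidate_request(), not the handler.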

For a kernel architecture which is multi-threaded/re-entrant, then
things get more complicated.  I still have an algorithm in mind, but it's
just a bit long to put here right now (essentially, you have to be able
to guarantee if there are multiple TLB invalidates flying around, that
both the right things happen, and they both terminate reasonably).

--
  Erich Stefan Boleyn                 \_ E-mail (preferred):  <erich@uruk.org>
Mad Genius wanna-be, CyberMuffin        \__      (finger me for other stats)
Web:  http://www.uruk.org/~erich/     Motto: "I'll live forever or die trying"

From owner-freebsd-smp  Sat Dec 14 09:26:12 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id JAA17392
          for smp-outgoing; Sat, 14 Dec 1996 09:26:12 -0800 (PST)
Received: from ormail.intel.com (ormail.intel.com [134.134.248.3])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id JAA17384
          for <smp@freebsd.org>; Sat, 14 Dec 1996 09:26:08 -0800 (PST)
From: haertel@ichips.intel.com
Received: from ichips.intel.com (ichips.intel.com [134.134.50.200]) by ormail.intel.com (8.8.4/8.7.3) with ESMTP id JAA12513; Sat, 14 Dec 1996 09:25:48 -0800 (PST)
Received: from pdxcs078.intel.com by ichips.intel.com (8.7.4/jIII)
	id JAA28180; Sat, 14 Dec 1996 09:23:01 -0800 (PST)
Received: by pdxcs078.intel.com (AIX 3.2/UCB 5.64/SW1.11) 
	id AA57406; Sat, 14 Dec 1996 09:25:51 -0800
Date: Sat, 14 Dec 1996 09:25:51 -0800
Message-Id: <9612141725.AA57406@pdxcs078.intel.com>
To: peter@spinner.dialix.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
Cc: dg@root.com, smp@freebsd.org
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>I'm still digesting it,  I am almost worried that we might (shudder!)
>be forced into doing an IPI to stop all the cpu's *before* the
>current cpu changes the page tables, then letting them do the tlb
>flush and letting them proceed.  If this actually is a real problem
>this means a much bigger code impact.

You must do precisely this.

The x86 architecture includes some complex instructions that
reference the same memory locations more than once--read-modify-write
sequences are the most obvious example.  For various reasons,
there is no guarantee that the TLB entries associated with those
memory locations are locked in the TLB, and so they might be
thrashed out due to other activity while those complex instructions
are executing.  If, in the meantime, some other processor
has manipulated the associated PTE in any way that lowers privilege
or changes the mapping, this processor could get a page fault
in a *non restartable* way, since it would see the mapping and/or
privilege changing under foot, but have already committed to
finishing the instruction (since the privilege checks are
normally only done at the beginning of the instruction).

As for your other question: speculative execution does not
continue past an interrupt.  An interrupt is a totally
serializing event.  However, once you're in the interrupt
handler, speculative execution could go down a different path
than you think of the interrupt as actually taking.  Basically
every time the processor fetches something from the Icache that
it thinks *might* contain a branch, it is an opportunity for the
processor to go off into la-la land, since it will simply ask
the branch predictor what it thinks and go that way.

The effect of this is speculative pollution of the non-renamed
state of the processor like the cache and the TLB entries.  So,
for example, in the uniprocessor case, doing this:

	1.  flush TLB
	2.  manipulate PTE

is not safe, since after (1), the processor may waltz
speculatively off to some code that actually references the PTE
before you manipulate it.  Instead you must always:

	1.  Manipulate PTE
	2.  flush TLB

On multiprocessors, there is the additional concern of corrupting
state which must remain invariant during instruction execution on
other processors.  So then you need the fully bulletproof code:

	1.  IPI to everyone sharing these specific PTE's
	2.  wait at barrier until everyone arrives
	3.  manipulate PTE
	4.  release barrier
	5.  everyone (including us) flushes TLB's

Bleah, I know.
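
The five steps above can be sketched as a user-space simulation, with
pthreads standing in for CPUs and an atomic bitmask standing in for the IPI.
Every name here (ipi_pending, arrived, released, flushed, target_cpu,
update_pte, run_rendezvous_demo) is hypothetical, and the final wait for the
targets to finish flushing is an addition of this sketch so that the
counters can be reused safely.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPU 4                     /* assumed: CPU 0 initiates, 1..3 are targets */

static atomic_uint ipi_pending;    /* stand-in for the IPI: one bit per target */
static atomic_uint arrived;        /* targets parked at the barrier */
static atomic_uint released;       /* generation counter, bumped in step 4 */
static atomic_uint flushed;        /* targets that finished step 5 */
static atomic_uint pte;            /* the simulated page table entry */
static atomic_bool stop;

static void *target_cpu(void *arg)
{
    unsigned bit = 1u << (unsigned)(uintptr_t)arg;

    while (!atomic_load(&stop)) {
        if (!(atomic_load(&ipi_pending) & bit)) {
            sched_yield();                        /* "ordinary work" */
            continue;
        }
        /* "IPI handler": remember the generation, then arrive. */
        unsigned gen = atomic_load(&released);
        atomic_fetch_and(&ipi_pending, ~bit);
        atomic_fetch_add(&arrived, 1);            /* step 2: at the barrier */
        while (atomic_load(&released) == gen)     /* parked until step 4 */
            sched_yield();
        atomic_fetch_add(&flushed, 1);            /* step 5: flush TLB here */
    }
    return NULL;
}

static void update_pte(unsigned new_pte)
{
    unsigned targets = NCPU - 1;

    atomic_store(&arrived, 0);
    atomic_store(&flushed, 0);
    atomic_store(&ipi_pending, ((1u << NCPU) - 1u) & ~1u);  /* step 1 */
    while (atomic_load(&arrived) != targets)                /* step 2 */
        sched_yield();
    assert(atomic_load(&arrived) == targets);   /* nobody is running freely */
    atomic_store(&pte, new_pte);                            /* step 3 */
    atomic_fetch_add(&released, 1);                         /* step 4 */
    /* step 5: our own flush would go here; then wait for the others */
    while (atomic_load(&flushed) != targets)
        sched_yield();
}

static unsigned run_rendezvous_demo(void)
{
    pthread_t t[NCPU];

    for (unsigned i = 1; i < NCPU; i++)
        pthread_create(&t[i], NULL, target_cpu, (void *)(uintptr_t)i);
    update_pte(0x1000u);
    update_pte(0x2000u);
    atomic_store(&stop, true);
    for (unsigned i = 1; i < NCPU; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&pte);
}
```

The point of the ordering is that the store in step 3 happens only while
every target is parked inside its handler, so no target can be halfway
through an instruction that touches the page when the PTE changes.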

From owner-freebsd-smp  Sat Dec 14 09:27:43 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id JAA17548
          for smp-outgoing; Sat, 14 Dec 1996 09:27:43 -0800 (PST)
Received: from uruk.org (root@faustus.dev.com [198.145.95.253])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id JAA17536
          for <smp@freebsd.org>; Sat, 14 Dec 1996 09:27:38 -0800 (PST)
Received: from uruk.org [127.0.0.1] (erich)
	by uruk.org with esmtp (Exim 0.53 #1)
	id E0vYypx-0004tV-00; Sat, 14 Dec 1996 10:29:25 -0800
To: Peter Wemm <peter@spinner.dialix.com>
cc: smp@freebsd.org, haertel@ichips.intel.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD 
In-reply-to: Your message of "Sat, 14 Dec 1996 23:03:51 +0800."
             <199612141503.XAA17454@spinner.DIALix.COM> 
Date: Sat, 14 Dec 1996 10:29:25 -0800
From: Erich Boleyn <erich@uruk.org>
Message-Id: <E0vYypx-0004tV-00@uruk.org>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk


Peter Wemm <peter@spinner.dialix.com> writes:

> 1: It's async.  it does not syncronise the remote processors as it must do,
> or they can get out of sync, slave processors can do updates on stale
> data, etc.

As mentioned in another message, this is bad.

> 2: It does too much work.  There are a lot of cases where a global flush
> is done for the local user process on the local cpu.   I am not 100% sure
> whether this is needed or not.  I can imagine that APTD accesses might
> present a problem if we try to avoid global flushes here.

This is perfectly OK from a functional point of view.  Personally, I
think efficiency is less important than getting it to work at this point.

> There was the query about the possibility of speculative execution
> on the PPro being the problem that is breaking the kernel. The
> scenario sounds plausable, but my initial reaction to that was that
> we are doing this from an _interrupt handler_, and I would be very
> suprised if speculative execution from the original code thread
> isn't wound up before going into the interrupt...  If not, do we
> need some strategic nop's?

No!  Speculative execution which broke interrupt handlers would be very bad,
in a lot of systems.  Perhaps Mike Haertel can comment more clearly, but my
memory claims these kind of actions were serialized.

There are actually some cases which can break, but as far as I know
these are all bus-propagation issues to external devices.

However, I think the IPI can be considered delivered, and that doesn't
guarantee that the CPU has been interrupted (what about interrupts
being masked, for example?).  I think it just says the interrupt was
accepted by the queue on the other APIC.

> I'm still digesting it,  I am almost worried that we might (shudder!)
> be forced into doing an IPI to stop all the cpu's *before* the
> current cpu changes the page tables, then letting them do the tlb
> flush and letting them proceed.  If this actually is a real problem
> this means a much bigger code impact.

I don't think so, but to allay your fears, note that if some page
permissions are changed:

  1)  Increasing permission is OK, because that should simply cause a
	false page-fault.
  2)  Decreasing permissions can cause the situation where thread/process
	A (perhaps a kernel thread) can be trying to deallocate a page
	in thread/process B which is in progress of accessing the data
	in that page (or might be).

#2 might be considered a race condition, but it also looks like a
natural timing problem that you can't get around anyway.

As long as there is some real rendezvous mechanism (such as mentioned in
my last message) for TLB shootdown IPIs to be acknowledged before the
sending CPU continues, you're guaranteeing that the original thread can't
continue until the other CPU's TLBs are really cleared, which is all
that seems important.

--
  Erich Stefan Boleyn                 \_ E-mail (preferred):  <erich@uruk.org>
Mad Genius wanna-be, CyberMuffin        \__      (finger me for other stats)
Web:  http://www.uruk.org/~erich/     Motto: "I'll live forever or die trying"

From owner-freebsd-smp  Sat Dec 14 10:00:18 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id KAA21397
          for smp-outgoing; Sat, 14 Dec 1996 10:00:18 -0800 (PST)
Received: from uruk.org (root@faustus.dev.com [198.145.95.253])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id KAA21390
          for <smp@freebsd.org>; Sat, 14 Dec 1996 10:00:14 -0800 (PST)
Received: from uruk.org [127.0.0.1] (erich)
	by uruk.org with esmtp (Exim 0.53 #1)
	id E0vYzLc-0004xo-00; Sat, 14 Dec 1996 11:02:08 -0800
To: haertel@ichips.intel.com
cc: smp@freebsd.org, dg@root.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD 
In-reply-to: Your message of "Sat, 14 Dec 1996 09:25:51 PST."
             <9612141725.AA57406@pdxcs078.intel.com> 
Date: Sat, 14 Dec 1996 11:02:08 -0800
From: Erich Boleyn <erich@uruk.org>
Message-Id: <E0vYzLc-0004xo-00@uruk.org>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk


haertel@ichips.intel.com (Mike Haertel) writes:

> >I'm still digesting it,  I am almost worried that we might (shudder!)
> >be forced into doing an IPI to stop all the cpu's *before* the
> >current cpu changes the page tables, then letting them do the tlb
> >flush and letting them proceed.  If this actually is a real problem
> >this means a much bigger code impact.
> 
> You must do precisely this.
> 
> The x86 architecture includes some complex instructions that
> reference the same memory locations more than once--read-modify-write
> sequences are the most obvious example.  For various reasons,
> there is no guarantee that the TLB entries associated with those
> memory locations are locked in the TLB, and so they might be
> thrashed out due to other activity while those complex instructions
> are executing.  If, in the meantime, some other processor
> has manipulated the associated PTE in any way that lowers privilege
> or changes the mapping, this processor could get a page fault
> in a *non restartable* way, since it would see the mapping and/or
> privilege changing under foot, but have already committed to
> finishing the instruction (since the privilege checks are
> normally only done at the beginning of the instruction).

Urk!  Thanks for clarifying this.  I'm curious as to why this hasn't
been a problem on Linux-SMP ...

--
  Erich Stefan Boleyn                 \_ E-mail (preferred):  <erich@uruk.org>
Mad Genius wanna-be, CyberMuffin        \__      (finger me for other stats)
Web:  http://www.uruk.org/~erich/     Motto: "I'll live forever or die trying"

From owner-freebsd-smp  Sat Dec 14 10:39:26 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id KAA26358
          for smp-outgoing; Sat, 14 Dec 1996 10:39:26 -0800 (PST)
Received: from dyson.iquest.net (dyson.iquest.net [198.70.144.127])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id KAA26347
          for <smp@freebsd.org>; Sat, 14 Dec 1996 10:39:23 -0800 (PST)
Received: (from root@localhost) by dyson.iquest.net (8.8.2/8.6.9) id NAA00208; Sat, 14 Dec 1996 13:38:47 -0500 (EST)
From: "John S. Dyson" <toor@dyson.iquest.net>
Message-Id: <199612141838.NAA00208@dyson.iquest.net>
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
To: peter@spinner.dialix.com (Peter Wemm)
Date: Sat, 14 Dec 1996 13:38:47 -0500 (EST)
Cc: smp@freebsd.org, haertel@ichips.intel.com
In-Reply-To: <199612141503.XAA17454@spinner.DIALix.COM> from "Peter Wemm" at Dec 14, 96 11:03:51 pm
Reply-To: dyson@freebsd.org
X-Mailer: ELM [version 2.4 PL24 ME8]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> 
> I'm still digesting it,  I am almost worried that we might (shudder!)
> be forced into doing an IPI to stop all the cpu's *before* the
> current cpu changes the page tables, then letting them do the tlb
> flush and letting them proceed.  If this actually is a real problem
> this means a much bigger code impact.
> 
The way I see it, the current pmap code is highly optimized
for single-processor operation.  If I were you, I would just
try to get something working correctly algorithmically -- almost ignoring
performance issues.  Of course, when performance is easy -- go for that
also.

A lot of things, like single-page invalidates inside of loops, look as if
they could be evil for multi-processor applications (imagine an
inter-processor interrupt for every loop iteration!).  I think that we
will have to look at the performance for the SMP direction, and it
might even entail large differences in pmap eventually.  Hopefully,
we will all be able to isolate the differences for the maintenance of
sanity :-).

John

From owner-freebsd-smp  Sat Dec 14 10:57:16 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id KAA28629
          for smp-outgoing; Sat, 14 Dec 1996 10:57:16 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id KAA28621
          for <smp@freebsd.org>; Sat, 14 Dec 1996 10:57:12 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id LAA16724; Sat, 14 Dec 1996 11:55:18 -0700
Message-Id: <199612141855.LAA16724@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe <smp@csn.net>
To: haertel@ichips.intel.com
cc: peter@spinner.dialix.com, dg@root.com, smp@freebsd.org,
        toor@dyson.iquest.net
Subject: Re: some questions concerning TLB shootdowns in FreeBSD 
In-reply-to: Your message of "Sat, 14 Dec 1996 09:25:51 PST."
             <9612141725.AA57406@pdxcs078.intel.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 14 Dec 1996 11:55:18 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

>On multiprocessors, there is the additional concern of corrupting
>state which must remain invariant during instruction execution on
>other processors.  So then you need the fully bulletproof code:
>
>	1.  IPI to everyone sharing these specific PTE's
>	2.  wait at barrier until everyone arrives
>	3.  manipulate PTE
>	4.  release barrier
>	5.  everyone (including us) flushes TLB's

this was my concern but I didn't know how to word it so concisely!  right
now the code looks like:

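/* edited version to show the general idea: */
void
invlpg(u_int addr)
{
    /* invalidate the TLB entry for addr on this CPU */
    __asm __volatile("invlpg (%0)" : : "r" (addr) : "memory");
    /* send the invalidate IPI to all other CPUs; nothing here waits for them */
    allButSelfIPI(ICU_OFFSET+27);
}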

so some routine modifies a PTE, then calls invlpg().  this works for itself,
as it won't try to use the stale page between modifying the PTE and flushing
its TLB.  However the other CPUs are running async, and may access the page
in question between the time the 1st CPU changes the PTE and the time they
receive the IPI.  It seems like the rfork() situation where we seem to be
getting hit is particularly prone to tripping over this.

The above proposed algorithm seems like the only safe method of dealing
with the problem...

--
Steve Passe	| powered by
smp@csn.net	|            FreeBSD


From owner-freebsd-smp  Sat Dec 14 13:25:39 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id NAA14411
          for smp-outgoing; Sat, 14 Dec 1996 13:25:39 -0800 (PST)
Received: from tfs.com (tfs.com [140.145.250.1])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id NAA14404;
          Sat, 14 Dec 1996 13:25:32 -0800 (PST)
Received: from critter.tfs.com by tfs.com (smail3.1.28.1) with SMTP
	id m0vZ1GT-0003wLC; Sat, 14 Dec 96 13:04 PST
Received: from critter.tfs.com (localhost [127.0.0.1]) by critter.tfs.com (8.8.2/8.8.2) with ESMTP id VAA05559; Sat, 14 Dec 1996 21:06:28 +0100 (MET)
To: dyson@freebsd.org
cc: peter@spinner.dialix.com (Peter Wemm), smp@freebsd.org,
        haertel@ichips.intel.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD 
In-reply-to: Your message of "Sat, 14 Dec 1996 13:38:47 EST."
             <199612141838.NAA00208@dyson.iquest.net> 
Date: Sat, 14 Dec 1996 21:06:28 +0100
Message-ID: <5557.850593988@critter.tfs.com>
From: Poul-Henning Kamp <phk@critter.tfs.com>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

In message <199612141838.NAA00208@dyson.iquest.net>, "John S. Dyson" writes:
>The way that I see it, is that the current pmap code is highly optimized
>for single processor operation.  If I was you, I would try to just
>try to get something working correctly algorithmically -- almost ignoring
>performance issues.  Of course, when performance is easy -- go for that
>also.
>
>A lot of things like single page invalidates inside of loops appear that
>they could be evil for multi-processor applications (imagine an inter-
>processor interrupt for every loop!?!?.)  I think that you (we or us),
>will have to look at the performance for the SMP direction, and it
>might even entail large differences in pmap eventually.  Hopefully,
>we will all be able to isolate the differences for the maintenance of
>sanity :-).

The crucial thing, as far as I can see, is to find out >if< we need to
tell the other CPU's about this change to the pagetables.  For a 2cpu
system the penalty of stopping the other CPU is still within bounds
of the reasonable, but stopping three CPUs needlessly is not a good idea.

Is there any cheap way to keep a refcount (or bitmap) per vm-object so
we can see if we need to kick the other CPUs if we fiddle it ?
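
A minimal sketch of the bitmap idea, assuming each object carries one bit
per CPU that may hold a TLB entry for it.  The names here (vm_obj_sketch,
obj_mark_cpu, obj_clear_cpu, obj_shootdown_targets, MAXCPU_SKETCH) are
hypothetical and do not come from the FreeBSD tree, and the sketch assumes
all updates happen under the global kernel lock, so plain (non-atomic)
operations are used.  It only shows the bookkeeping, not how the bits would
be kept accurate when processes migrate between CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define MAXCPU_SKETCH 32	/* hypothetical upper bound on CPUs */

/* Hypothetical stand-in for a vm object; only the field we need. */
struct vm_obj_sketch {
	uint32_t cpu_mask;	/* bit n set: CPU n may cache mappings */
};

/* Record that 'cpu' may now hold TLB entries for this object. */
static void
obj_mark_cpu(struct vm_obj_sketch *o, int cpu)
{
	assert(cpu >= 0 && cpu < MAXCPU_SKETCH);
	o->cpu_mask |= UINT32_C(1) << cpu;
}

/* Record that 'cpu' has flushed and no longer holds entries. */
static void
obj_clear_cpu(struct vm_obj_sketch *o, int cpu)
{
	assert(cpu >= 0 && cpu < MAXCPU_SKETCH);
	o->cpu_mask &= ~(UINT32_C(1) << cpu);
}

/*
 * Return the set of CPUs, other than 'self', that would have to be
 * interrupted before the mappings of this object are changed.
 * A result of 0 means no IPI is needed at all.
 */
static uint32_t
obj_shootdown_targets(const struct vm_obj_sketch *o, int self)
{
	assert(self >= 0 && self < MAXCPU_SKETCH);
	return o->cpu_mask & ~(UINT32_C(1) << self);
}
```

With CPUs 0 and 2 marked, a change made on CPU 0 yields a target mask with
only bit 2 set, so a third or fourth CPU that never touched the object would
be left alone.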

--
Poul-Henning Kamp           | phk@FreeBSD.ORG       FreeBSD Core-team.
http://www.freebsd.org/~phk | phk@login.dknet.dk    Private mailbox.
whois: [PHK]                | phk@tfs.com           TRW Financial Systems, Inc.
Power and ignorance is a disgusting cocktail.

From owner-freebsd-smp  Sat Dec 14 13:35:32 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id NAA15308
          for smp-outgoing; Sat, 14 Dec 1996 13:35:32 -0800 (PST)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id NAA15297;
          Sat, 14 Dec 1996 13:35:25 -0800 (PST)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id OAA22198; Sat, 14 Dec 1996 14:12:47 -0700
From: Terry Lambert <terry@lambert.org>
Message-Id: <199612142112.OAA22198@phaeton.artisoft.com>
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
To: phk@critter.tfs.com (Poul-Henning Kamp)
Date: Sat, 14 Dec 1996 14:12:47 -0700 (MST)
Cc: dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org,
        haertel@ichips.intel.com
In-Reply-To: <5557.850593988@critter.tfs.com> from "Poul-Henning Kamp" at Dec 14, 96 09:06:28 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> The crucial thing, as far as I can see, is to find out >if< we need to
> tell the other CPU's about this change to the pagetables.  For a 2cpu
> system the penalty of stopping the other CPU is still within bounds
> of the reasonable, but stopping three CPUs needlessly is not a good idea.
> 
> Is there any cheap way to keep a refcount (or bitmap) per vm-object so
> we can see if we need to kick the other CPUs if we fiddle it ?

Oh, I like this.

It would make it very easy to have multiple references in the UP case,
as well as the MP case.

This would let us do device/offset as well as vnode/offset based caching
(for instance, hanging all cache buffers for vnodes on a device off the
device vnode).

I've wanted this for some time, since I am determined that vclean must
die...


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

From owner-freebsd-smp  Sat Dec 14 13:43:25 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id NAA16038
          for smp-outgoing; Sat, 14 Dec 1996 13:43:25 -0800 (PST)
Received: from tfs.com (tfs.com [140.145.250.1])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id NAA16024;
          Sat, 14 Dec 1996 13:43:21 -0800 (PST)
Received: from critter.tfs.com by tfs.com (smail3.1.28.1) with SMTP
	id m0vZ1r2-0003vlC; Sat, 14 Dec 96 13:42 PST
Received: from critter.tfs.com (localhost.phk.dk [127.0.0.1]) by critter.tfs.com (8.8.2/8.8.2) with ESMTP id WAA08258; Sat, 14 Dec 1996 22:45:49 +0100 (MET)
To: Terry Lambert <terry@lambert.org>
cc: dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org,
        haertel@ichips.intel.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD 
In-reply-to: Your message of "Sat, 14 Dec 1996 14:12:47 MST."
             <199612142112.OAA22198@phaeton.artisoft.com> 
Date: Sat, 14 Dec 1996 22:45:49 +0100
Message-ID: <8256.850599949@critter.tfs.com>
From: Poul-Henning Kamp <phk@critter.tfs.com>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

In message <199612142112.OAA22198@phaeton.artisoft.com>, Terry Lambert writes:
>> The crucial thing, as far as I can see, is to find out >if< we need to
>> tell the other CPU's about this change to the pagetables.  For a 2cpu
>> system the penalty of stopping the other CPU is still within bounds
>> of the reasonable, but stopping three CPUs needlessly is not a good idea.
>> 
>> Is there any cheap way to keep a refcount (or bitmap) per vm-object so
>> we can see if we need to kick the other CPUs if we fiddle it ?
>
>Oh, I like this.
>
>It would make it very easy to have multiple references in the UP case,
>as well as the MP case.
>
>This would let us do device/offset as well as vnode/offset based caching
>(for instance, hanging all cache buffers for vnodes on a device off the
>device vnode).
>
>I've wanted this for some time, since I am determined that vclean must
>die...
>

Cool.  send patches when done :-)

--
Poul-Henning Kamp           | phk@FreeBSD.ORG       FreeBSD Core-team.
http://www.freebsd.org/~phk | phk@login.dknet.dk    Private mailbox.
whois: [PHK]                | phk@tfs.com           TRW Financial Systems, Inc.
Power and ignorance is a disgusting cocktail.

From owner-freebsd-smp  Sat Dec 14 13:55:16 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id NAA17678
          for smp-outgoing; Sat, 14 Dec 1996 13:55:16 -0800 (PST)
Received: from dyson.iquest.net (dyson.iquest.net [198.70.144.127])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id NAA17668;
          Sat, 14 Dec 1996 13:55:09 -0800 (PST)
Received: (from root@localhost) by dyson.iquest.net (8.8.2/8.6.9) id QAA05435; Sat, 14 Dec 1996 16:54:09 -0500 (EST)
From: "John S. Dyson" <toor@dyson.iquest.net>
Message-Id: <199612142154.QAA05435@dyson.iquest.net>
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
To: terry@lambert.org (Terry Lambert)
Date: Sat, 14 Dec 1996 16:54:09 -0500 (EST)
Cc: phk@critter.tfs.com, dyson@freebsd.org, peter@spinner.dialix.com,
        smp@freebsd.org, haertel@ichips.intel.com
In-Reply-To: <199612142112.OAA22198@phaeton.artisoft.com> from "Terry Lambert" at Dec 14, 96 02:12:47 pm
X-Mailer: ELM [version 2.4 PL24 ME8]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> 
> This would let us do device/offset as well as vnode/offset based caching
> (for instance, hanging all cache buffers for vnodes on a device off the
> device vnode).
> 
> I've wanted this for some time, since I am determined that vclean must
> die...
> 
Slightly off subject, but I plan to sometime carry the vnode/offset
caching to a more generalized scheme that also encompasses device/offset
caching.  Specifically, device/offset is the same as vnode/offset.

This will allow us to cache data without the vnode.  However, we will
continue to have the advantages of the current vnode/offset scheme.

John


From owner-freebsd-smp  Sat Dec 14 13:56:03 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id NAA17809
          for smp-outgoing; Sat, 14 Dec 1996 13:56:03 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id NAA17776
          for <smp@freebsd.org>; Sat, 14 Dec 1996 13:55:55 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id OAA17577; Sat, 14 Dec 1996 14:51:50 -0700
Message-Id: <199612142151.OAA17577@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe <smp@csn.net>
To: Steve Passe <smp@csn.net>
cc: haertel@ichips.intel.com, peter@spinner.dialix.com, dg@root.com,
        smp@freebsd.org, toor@dyson.iquest.net
Subject: Re: some questions concerning TLB shootdowns in FreeBSD 
In-reply-to: Your message of "Sat, 14 Dec 1996 11:55:18 MST."
             <199612141855.LAA16724@clem.systemsix.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 14 Dec 1996 14:51:50 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

Here is a suggested set of code for the TLB sync problem.  There is one
problem with it (that I currently see; surely there are others!): the
target CPUs can't service their IPI while the invoking CPU holds the mp_lock.
So for now let's pretend that we have a separate lock for IPIs called
ipi_lock, which is manipulated via get_ipilock()/rel_ipilock().

---
usage by the invoking CPU:

	startRendezvous();		/* setup a rendezvous */

	/* 
	 * at this point the other CPUs are all spinning on the end lock
	 * so the code can safely muck with PTD/PTE entries...
	 */

	invltlb();			/* CPU flushes local TLB */

	endRendezvous();		/* end the rendezvous */

---
usage by the invoked CPUs, i.e. the routine invoked by the IPI:

void
ipi_invltlb(void)
{
    u_long	temp;

    doRendezvous();			/* declare our arrival and wait */

    __asm __volatile("movl %%cr3, %0; movl %0, %%cr3" : "=r" (temp)
			 : : "memory");
}

----------------------------------- cut -------------------------------------
/* rendezvous.s */

	.text
	.align	4

#define SMP_INVLTLB_IPI		(ICU_OFFSET+27)

/*
 * invoking CPU sets up rendezvous
 */
ENTRY(startRendezvous)
	call	_get_ipilock			/* only one CPU at a time */

	movl	_mp_ncpus, %eax			/* # of CPUs to sync */
	decl	%eax				/* don't count ourself */
	movl	%eax, _rendezvousCount		/* init the downcounter */
	movl	%eax, _rendezvousEnd		/* init the release lock */

	pushl	$SMP_INVLTLB_IPI		/* immediate: the IPI vector */
	call	_allButSelfIPI
	addl	$4, %esp

	call	_rel_ipilock			/* now safe for other CPUs */

1:	cmpl	$0, _rendezvousCount		/* check current value */
	jnz	1b				/* somebody not here yet */

	call	_get_ipilock			/* is this necessary??? */
	ret

/*
 * invoking CPU releases all other CPUs
 */
ENTRY(endRendezvous)
	movl	$0, _rendezvousEnd
	call	_rel_ipilock			/* is this necessary??? */
	ret

/*
 * invoked CPUs enter and wait for end
 */
ENTRY(doRendezvous)
	call	_rel_ipilock			/* allow other CPUs to IPI */

	lock					/* ensure atomic operation */
	decl	_rendezvousCount		/* declare our arrival */

1:	cmpl	$0, _rendezvousEnd		/* test for end */
	jnz	1b				/* not yet, spin */

	call	_get_ipilock			/* safe exit from IPI */
	ret

	.data
	ALIGN_DATA

	.globl _rendezvousCount
_rendezvousCount:
	.long 0

	.globl _rendezvousEnd
_rendezvousEnd:
	.long 0

----------------------------------- cut -------------------------------------
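
The same rendezvous can be approximated in user space, which may make the
ordering easier to follow.  This is only a sketch under stated assumptions:
threads stand in for the invoked CPUs, pthread_create() stands in for the
IPI, the ipi_lock handling is left out, and the names pte, flushes,
stale_seen, target_cpu, run_rendezvous and NCPU_SKETCH are hypothetical.
C11 atomics replace the lock-prefixed decrement.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define NCPU_SKETCH 4			/* hypothetical CPU count */

static atomic_int rendezvous_count;	/* CPUs that have not arrived yet */
static atomic_int rendezvous_end;	/* nonzero while targets must spin */
static atomic_int pte;			/* stand-in for a page table entry */
static atomic_int flushes;		/* simulated TLB flushes performed */
static atomic_int stale_seen;		/* targets that saw the old PTE */

/* Plays the role of the IPI handler on an invoked CPU. */
static void *
target_cpu(void *arg)
{
	(void)arg;
	atomic_fetch_sub(&rendezvous_count, 1);	/* declare our arrival */
	while (atomic_load(&rendezvous_end) != 0)	/* spin until released */
		sched_yield();
	if (atomic_load(&pte) != 2)		/* must see the new PTE */
		atomic_fetch_add(&stale_seen, 1);
	atomic_fetch_add(&flushes, 1);		/* simulated TLB flush */
	return NULL;
}

/* Plays the role of the invoking CPU; returns the number of flushes. */
static int
run_rendezvous(void)
{
	pthread_t t[NCPU_SKETCH - 1];
	int i, rc;

	atomic_store(&pte, 1);
	atomic_store(&flushes, 0);
	atomic_store(&stale_seen, 0);
	atomic_store(&rendezvous_count, NCPU_SKETCH - 1);
	atomic_store(&rendezvous_end, 1);

	for (i = 0; i < NCPU_SKETCH - 1; i++) {	/* "send the IPI" */
		rc = pthread_create(&t[i], NULL, target_cpu, NULL);
		assert(rc == 0);
		(void)rc;
	}
	while (atomic_load(&rendezvous_count) != 0)	/* wait for everyone */
		sched_yield();

	atomic_store(&pte, 2);			/* safe: all others are parked */
	atomic_fetch_add(&flushes, 1);		/* our own simulated flush */
	atomic_store(&rendezvous_end, 0);	/* release the other CPUs */

	for (i = 0; i < NCPU_SKETCH - 1; i++)
		pthread_join(t[i], NULL);
	return atomic_load(&flushes);
}
```

Every participant flushes exactly once, and no target can observe the old
PTE value after it has been released, which is the property the barrier is
meant to provide.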

--
Steve Passe	| powered by
smp@csn.net	|            FreeBSD


From owner-freebsd-smp  Sat Dec 14 14:02:12 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id OAA18389
          for smp-outgoing; Sat, 14 Dec 1996 14:02:12 -0800 (PST)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id OAA18374;
          Sat, 14 Dec 1996 14:02:06 -0800 (PST)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id OAA22308; Sat, 14 Dec 1996 14:38:54 -0700
From: Terry Lambert <terry@lambert.org>
Message-Id: <199612142138.OAA22308@phaeton.artisoft.com>
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
To: toor@dyson.iquest.net (John S. Dyson)
Date: Sat, 14 Dec 1996 14:38:54 -0700 (MST)
Cc: terry@lambert.org, phk@critter.tfs.com, dyson@freebsd.org,
        peter@spinner.dialix.com, smp@freebsd.org, haertel@ichips.intel.com
In-Reply-To: <199612142154.QAA05435@dyson.iquest.net> from "John S. Dyson" at Dec 14, 96 04:54:09 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> > This would let us do device/offset as well as vnode/offset based caching
> > (for instance, hanging all cache buffers for vnodes on a device off the
> > device vnode).
> > 
> > I've wanted this for some time, since I am determined that vclean must
> > die...
>
> Slightly off subject, but I plan to sometime carry the vnode/offset
> caching to a more generalized scheme that also encompasses device/offset
> caching.  Specifically, device/offset is the same as vnode/offset.
> 
> This will allow us to cache data without the vnode.  However, we will
> continue to have the advantages of the current vnode/offset scheme.

This is one of the reasons for murdering vclean: so you can get a cache
hit on perfectly good data which is in memory, but for which the vnode
has been reused, freed, destroyed, or whatever.  Without the vnode, the
perfectly good data can not get a cache hit... it has to be loaded in
from disk again (potentially tromping other perfectly good data that
is also in cache, but is older than the perfectly good data we can no
longer reference -- bletch).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

From owner-freebsd-smp  Sat Dec 14 14:22:32 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id OAA20923
          for smp-outgoing; Sat, 14 Dec 1996 14:22:32 -0800 (PST)
Received: from dyson.iquest.net (dyson.iquest.net [198.70.144.127])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id OAA20914;
          Sat, 14 Dec 1996 14:22:27 -0800 (PST)
Received: (from root@localhost) by dyson.iquest.net (8.8.2/8.6.9) id RAA05499; Sat, 14 Dec 1996 17:22:22 -0500 (EST)
From: "John S. Dyson" <toor@dyson.iquest.net>
Message-Id: <199612142222.RAA05499@dyson.iquest.net>
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
To: phk@critter.tfs.com (Poul-Henning Kamp)
Date: Sat, 14 Dec 1996 17:22:22 -0500 (EST)
Cc: dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org,
        haertel@ichips.intel.com
In-Reply-To: <5557.850593988@critter.tfs.com> from "Poul-Henning Kamp" at Dec 14, 96 09:06:28 pm
X-Mailer: ELM [version 2.4 PL24 ME8]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> 
> The crucial thing, as far as I can see, is to find out >if< we need to
> tell the other CPU's about this change to the pagetables.  For a 2cpu
> system the penalty of stopping the other CPU is still within bounds
> of the reasonable, but stopping three CPUs needlessly is not a good idea.
> 
> Is there any cheap way to keep a refcount (or bitmap) per vm-object so
> we can see if we need to kick the other CPUs if we fiddle it ?
> 
That would be tricky if we can freely reschedule processes on other
cpu's...  It would entail traversing the map for the process when
the process is scheduled.  Normally, there is also no notification
when a page table entry is fetched into the TLB.  Such notification
can be arranged on the advanced X86 processors, but it doesn't
appear to be a guaranteed type thing.

How's about just making the inter-processor interrupt efficient?
We can probably redo some of the vm/pmap interface to have larger
grained pmap update operations also.

I suggest that, in the short term, the code be made algorithmically
correct with the stop-processor suggestion made earlier.  Later on,
we can improve on the algorithmically correct (but slightly slower) code,
and do the things to the vm/pmap interface to make things much more efficient.

John
dyson@freebsd.org


From owner-freebsd-smp  Sat Dec 14 14:27:22 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id OAA21484
          for smp-outgoing; Sat, 14 Dec 1996 14:27:22 -0800 (PST)
Received: from dyson.iquest.net (dyson.iquest.net [198.70.144.127])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id OAA21471;
          Sat, 14 Dec 1996 14:27:17 -0800 (PST)
Received: (from root@localhost) by dyson.iquest.net (8.8.2/8.6.9) id RAA05515; Sat, 14 Dec 1996 17:25:55 -0500 (EST)
From: "John S. Dyson" <toor@dyson.iquest.net>
Message-Id: <199612142225.RAA05515@dyson.iquest.net>
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
To: terry@lambert.org (Terry Lambert)
Date: Sat, 14 Dec 1996 17:25:55 -0500 (EST)
Cc: toor@dyson.iquest.net, terry@lambert.org, phk@critter.tfs.com,
        dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org,
        haertel@ichips.intel.com
In-Reply-To: <199612142138.OAA22308@phaeton.artisoft.com> from "Terry Lambert" at Dec 14, 96 02:38:54 pm
X-Mailer: ELM [version 2.4 PL24 ME8]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> >
> > Slightly off subject, but I plan to sometime carry the vnode/offset
> > caching to a more generalized scheme that also encompasses device/offset
> > caching.  Specifically, device/offset is the same as vnode/offset.
> > 
> > This will allow us to cache data without the vnode.  However, we will
> > continue to have the advantages of the current vnode/offset scheme.
> 
> This is one of the reasons for murdering vclean: so you can get a cache
> hit on perfectly good data which is in memory, but for which the vnode
> has been reused, freed, destroyed, or whatever.  Without the vnode, the
> perfectly good data can not get a cache hit... it has to be loaded in
> from disk again (potentially tromping other perfectly good data that
> is also in cache, but is older than the perfectly good data we can no
> longer reference -- bletch).
> 
> 
The ONLY reason that it hasn't been done, is (my) time limitations.  Other
things scream louder -- and the "nice" things get left by the wayside.
For example, today I am working on the merge of the Lite/2 stuff (finally).
After the merge, and the commits, I expect that there will be at least
a few days of instability, and guess what I get to do (answer: read
frantic requests for help, look at core dumps, and generally feel bad
about messing up the tree.) :-).

John
dyson@freebsd.org


From owner-freebsd-smp  Sat Dec 14 15:08:48 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id PAA26319
          for smp-outgoing; Sat, 14 Dec 1996 15:08:48 -0800 (PST)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id PAA26289;
          Sat, 14 Dec 1996 15:08:41 -0800 (PST)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id PAA22458; Sat, 14 Dec 1996 15:45:28 -0700
From: Terry Lambert <terry@lambert.org>
Message-Id: <199612142245.PAA22458@phaeton.artisoft.com>
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
To: toor@dyson.iquest.net (John S. Dyson)
Date: Sat, 14 Dec 1996 15:45:28 -0700 (MST)
Cc: phk@critter.tfs.com, dyson@freebsd.org, peter@spinner.dialix.com,
        smp@freebsd.org, haertel@ichips.intel.com
In-Reply-To: <199612142222.RAA05499@dyson.iquest.net> from "John S. Dyson" at Dec 14, 96 05:22:22 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> > Is there any cheap way to keep a refcount (or bitmap) per vm-object so
> > we can see if we need to kick the other CPUs if we fiddle it ?
>
> That would be tricky if we can freely reschedule processes on other
> cpu's...  It would entail traversing the map for the process when
> the process is scheduled.  Normally, there is also no notification
> when a page table entry is fetched into the TLB.  Such notification
> can be arranged on the advanced X86 processors, but it doesn't
> appear to be a guaranteed type thing.

You could simplify this a lot by preferential scheduling.

You could also keep a bitmap of the virtual address space and examine only
those areas where a bitmap collides...


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

From owner-freebsd-smp  Sat Dec 14 17:22:09 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id RAA10071
          for smp-outgoing; Sat, 14 Dec 1996 17:22:09 -0800 (PST)
Received: from spinner.DIALix.COM (root@spinner.DIALix.COM [192.203.228.67])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id RAA10063;
          Sat, 14 Dec 1996 17:22:02 -0800 (PST)
Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1])
          by spinner.DIALix.COM (8.8.4/8.8.4) with ESMTP id JAA12763;
          Sun, 15 Dec 1996 09:21:56 +0800 (WST)
Message-Id: <199612150121.JAA12763@spinner.DIALix.COM>
To: dyson@freebsd.org
cc: smp@freebsd.org, haertel@ichips.intel.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD 
In-reply-to: Your message of "Sat, 14 Dec 1996 13:38:47 EST."
             <199612141838.NAA00208@dyson.iquest.net> 
Date: Sun, 15 Dec 1996 09:21:55 +0800
From: Peter Wemm <peter@spinner.dialix.com>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

"John S. Dyson" wrote:
> > 
> > I'm still digesting it,  I am almost worried that we might (shudder!)
> > be forced into doing an IPI to stop all the cpu's *before* the
> > current cpu changes the page tables, then letting them do the tlb
> > flush and letting them proceed.  If this actually is a real problem
> > this means a much bigger code impact.
> > 
> The way that I see it, is that the current pmap code is highly optimized
> for single processor operation.  If I was you, I would try to just
> try to get something working correctly algorithmically -- almost ignoring
> performance issues.  Of course, when performance is easy -- go for that
> also.
> 
> A lot of things like single page invalidates inside of loops appear that
> they could be evil for multi-processor applications (imagine an inter-
> processor interrupt for every loop!?!?.)  I think that you (we or us),
> will have to look at the performance for the SMP direction, and it
> might even entail large differences in pmap eventually.  Hopefully,
> we will all be able to isolate the differences for the maintenance of
> sanity :-).
> 
> John

Originally, I wondered if the CMAP/CADDR and APTD stuff might need
to be per-cpu but couldn't think of a good reason given our presently
99.8% non-reentrant kernel (the IPI code is reentrant).  Perhaps
this is one of them...  I don't recall how much code walks through
the page tables and how much uses CADDR/APTD.

When dealing with the user space of the currently active process
context, remote TLB locking/flushing is not needed as long as other
cpu's cannot get to the same space via their APTD (which should be
valid as long as we have a global lock) for the high level stuff.

However, the shared address space code that I was working on in
-current (for kernel assisted threading in the smp kernel) means
that a single vmspace/pmap/etc can be shared among multiple processes
and this changes the above picture since two cpu's can be using
the user mode parts of the same page tables at once, one executing
in user mode, one in the kernel.

Cheers,
-Peter

From owner-freebsd-smp  Sat Dec 14 17:36:10 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id RAA11229
          for smp-outgoing; Sat, 14 Dec 1996 17:36:10 -0800 (PST)
Received: from spinner.DIALix.COM (root@spinner.DIALix.COM [192.203.228.67])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id RAA11198;
          Sat, 14 Dec 1996 17:35:56 -0800 (PST)
Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1])
          by spinner.DIALix.COM (8.8.4/8.8.4) with ESMTP id JAA13241;
          Sun, 15 Dec 1996 09:35:31 +0800 (WST)
Message-Id: <199612150135.JAA13241@spinner.DIALix.COM>
To: Poul-Henning Kamp <phk@critter.tfs.com>
cc: dyson@freebsd.org, smp@freebsd.org, haertel@ichips.intel.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD 
In-reply-to: Your message of "Sat, 14 Dec 1996 21:06:28 +0100."
             <5557.850593988@critter.tfs.com> 
Date: Sun, 15 Dec 1996 09:35:30 +0800
From: Peter Wemm <peter@spinner.dialix.com>
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Poul-Henning Kamp wrote:
> In message <199612141838.NAA00208@dyson.iquest.net>, "John S. Dyson" writes:
> >The way that I see it, is that the current pmap code is highly optimized
> >for single processor operation.  If I was you, I would try to just
> >try to get something working correctly algorithmically -- almost ignoring
> >performance issues.  Of course, when performance is easy -- go for that
> >also.
> >
> >A lot of things like single page invalidates inside of loops appear that
> >they could be evil for multi-processor applications (imagine an inter-
> >processor interrupt for every loop!?!?.)  I think that you (we or us),
> >will have to look at the performance for the SMP direction, and it
> >might even entail large differences in pmap eventually.  Hopefully,
> >we will all be able to isolate the differences for the maintenance of
> >sanity :-).
> 
> The crucial thing, as far as I can see, is to find out >if< we need to
> tell the other CPU's about this change to the pagetables.  For a 2cpu
> system the penalty of stopping the other CPU is still within bounds
> of the reasonable, but stopping three CPUs needlessly is not a good idea.

Yes.  Also, there seem to be cases where the cpu needs to invalidate
on entry to the kernel, but does not need to be kicked via an IPI,
e.g. if we change the kernel page tables, other cpu's running user
code at the time do not need to flush until they actually try to
enter the kernel.

We should replace the existing simplistic code with a group of bitmaps that
are accessed via atomic bit-set/clear and bit-test-and-set/clear so that
we can synchronise deferred TLB flushes and callins for common PTEs.
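
A minimal sketch of the deferred-flush bitmap, assuming one pending bit per
CPU that is set by whoever changes the kernel page tables and consumed by
each CPU when it next enters the kernel.  The names tlb_flush_pending,
tlb_defer_flush and tlb_check_and_clear are hypothetical, and C11 atomics
stand in for whatever locked bit instructions the kernel would really use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit n set: CPU n must flush its TLB before using kernel mappings. */
static _Atomic uint32_t tlb_flush_pending;

/*
 * Atomic bit-set: mark every CPU in 'cpus' as needing a flush.
 * Returns the previous mask, so the caller can tell which CPUs
 * already had a flush outstanding.
 */
static uint32_t
tlb_defer_flush(uint32_t cpus)
{
	return atomic_fetch_or(&tlb_flush_pending, cpus);
}

/*
 * Atomic bit-test-and-clear, meant to run on kernel entry.
 * Returns true if 'cpu' had a flush pending (and must flush now).
 */
static bool
tlb_check_and_clear(int cpu)
{
	uint32_t bit = UINT32_C(1) << cpu;
	uint32_t old = atomic_fetch_and(&tlb_flush_pending, ~bit);

	return (old & bit) != 0;
}
```

A CPU running user code is not interrupted; it pays for the flush only when
tlb_check_and_clear() reports a pending bit on its way into the kernel, and
a second check finds the bit already cleared.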

Cheers,
-Peter

From owner-freebsd-smp  Sat Dec 14 17:36:44 1996
Return-Path: <owner-smp>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id RAA11264
          for smp-outgoing; Sat, 14 Dec 1996 17:36:44 -0800 (PST)
Received: from avatar.avatar.com (avatar.avatar.com [199.33.206.17])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id RAA11259
          for <freebsd-smp@freebsd.org>; Sat, 14 Dec 1996 17:36:41 -0800 (PST)
Received: from avatar.avatar.com (kory@avatar.avatar.com [199.33.206.17]) by avatar.avatar.com (8.7.4/8.6.9) with SMTP id RAA23607 for <freebsd-smp@freebsd.org>; Sat, 14 Dec 1996 17:36:09 -0800 (PST)
Date: Sat, 14 Dec 1996 17:36:07 -0800 (PST)
From: Kory Hamzeh <kory@avatar.com>
To: freebsd-smp@freebsd.org
Subject: SMP Status
Message-ID: <Pine.BSI.3.91.961214173434.23604A-100000@avatar.avatar.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk


Is there anywhere on the FreeBSD web site where I can get a status of the
SMP project? I am putting together a fairly high-end Pentium Pro machine
right now and I would like to purchase a motherboard that would be
compatible with the FreeBSD SMP support.

Thanks,
Kory