From owner-freebsd-smp Tue Dec 10 19:49:54 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id TAA05901 for smp-outgoing; Tue, 10 Dec 1996 19:49:54 -0800 (PST) Received: from pat.idt.unit.no (0@pat.idt.unit.no [129.241.103.5]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id TAA05890 for ; Tue, 10 Dec 1996 19:49:50 -0800 (PST) Received: from idt.unit.no (tegge@ikke.idt.unit.no [129.241.111.65]) by pat.idt.unit.no (8.8.4/8.8.4) with ESMTP id EAA00310; Wed, 11 Dec 1996 04:49:01 +0100 (MET) Message-Id: <199612110349.EAA00310@pat.idt.unit.no> To: peter@spinner.dialix.com Cc: smp@bluenose.na.tuns.ca, smp@freebsd.org Subject: Re: More info about fatal trap 12 In-Reply-To: Your message of "Sat, 07 Dec 1996 02:15:12 +0800" References: <199612061815.CAA19205@spinner.DIALix.COM> Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii X-Mailer: Mew version 1.06 on Emacs 19.33.1 Date: Wed, 11 Dec 1996 04:49:01 +0100 From: Tor Egge Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > Tor Egge wrote: > > A closer examination of the kernel dump shows that the first page fault > > is from the user process /bin/sh. The call stack is > [..] > > The first access to the stack by the child process failed when trying > > to save the return value from fork. > > > > The parent process was running on CPU #1, and the child process > > was running on CPU #0. > > > > - Tor Egge > > Hmm!! The plot thickens! I noticed the failing pmap_enter was at > 0xefbfd000 which is the first stack page already, but I wasn't sure > if it was the initial creation, or if the stack had been paged out > and was failing on pagein. 
I applied the following diff to pmap.c Index: pmap.c =================================================================== RCS file: /export/akg1/smp-cvs/sys/i386/i386/pmap.c,v retrieving revision 1.31 diff -c -r1.31 pmap.c *** pmap.c 1996/12/03 05:51:12 1.31 --- pmap.c 1996/12/11 00:48:46 *************** *** 1982,1987 **** --- 1982,1991 ---- vm_offset_t opa; vm_offset_t origpte, newpte; vm_page_t mpte; + volatile u_long old_cr3; + volatile u_long old_frame; + volatile u_long old_PTDpde; + volatile int old_cpunum; if (pmap == NULL) return; *************** *** 2011,2016 **** --- 2015,2024 ---- pmap->pm_pdir[PTDPTDI], va); } + old_cr3 = rcr3(); + old_frame = pmap->pm_pdir[PTDPTDI]; + old_PTDpde = PTDpde; + old_cpunum = cpunumber(); origpte = *(vm_offset_t *)pte; pa &= PG_FRAME; opa = origpte & PG_FRAME; ------------ Afterwards, when looking at the kernel stack trace: ---- #0 boot (howto=256) at ../../kern/kern_shutdown.c:264 #1 0xe0112d69 in panic (fmt=0xe01bcd7f "page fault") at ../../kern/kern_shutdown.c:392 #2 0xe01bda65 in trap_fatal (frame=0xdfbffe4c) at ../../i386/i386/trap.c:747 #3 0xe01bd498 in trap_pfault (frame=0xdfbffe4c, usermode=0) at ../../i386/i386/trap.c:654 #4 0xe01bd0cb in trap (frame={tf_es = -453967856, tf_ds = 16, tf_edi = -533289196, tf_esi = -541077504, tf_ebp = -541065552, tf_isp = -541065612, tf_ebx = 86614016, tf_edx = -4194304, tf_ecx = -528396, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -535058445, tf_cs = 8, tf_eflags = 66050, tf_esp = -533683197, tf_ss = -453959040}) at ../../i386/i386/trap.c:313 #5 0xe01ba7f3 in pmap_enter (pmap=0xe4ee0f64, va=3753889792, pa=86614016, prot=7 '\a', wired=0) at ../../i386/i386/pmap.c:2022 #6 0xe01a41b3 in vm_fault (map=0xe4ee0f00, vaddr=3753889792, fault_type=3 '\003', change_wiring=0) at ../../vm/vm_fault.c:773 #7 0xe01bd3f0 in trap_pfault (frame=0xdfbfffbc, usermode=1) at ../../i386/i386/trap.c:634 #8 0xe01bcf73 in trap (frame={tf_es = 39, tf_ds = 39, tf_edi = 352256, tf_esi = 331156, tf_ebp = 
-541075036, tf_isp = -541065244, tf_ebx = 2, tf_edx = 1, tf_ecx = -541075000, tf_eax = 0, tf_trapno = 12, tf_err = 7, tf_eip = 45296, tf_cs = 31, tf_eflags = 66050, tf_esp = -541075060, tf_ss = 39}) at ../../i386/i386/trap.c:241 #9 0xb0f0 in ?? () #10 0x63ab in ?? () #11 0x5ef0 in ?? () #12 0x7d01 in ?? () #13 0x7984 in ?? () #14 0x7754 in ?? () #15 0x60eb in ?? () #16 0x58e1 in ?? () #17 0xc11f in ?? () #18 0xc02e in ?? () #19 0x107e in ?? () (kgdb) up 5 #5 0xe01ba7f3 in pmap_enter (pmap=0xe4ee0f64, va=3753889792, pa=86614016, prot=7 '\a', wired=0) at ../../i386/i386/pmap.c:2022 (kgdb) info locals va = 3753889792 pa = 86614016 prot = 7 '\a' pte = (unsigned int *) 0xfff7eff4 opa = 0 origpte = 3761678100 newpte = 0 mpte = (struct vm_page *) 0xe035f7c8 old_cr3 = 85966848 old_frame = 0 old_PTDpde = 85966883 old_cpunum = 0 (kgdb) print/x pmap->pm_pdir[0x37f] $20 = 0x51fc023 ---- This indicates that cr3 was correct, PTDpde was correct, but pmap->pm_pdir[PTDPTDI] evaluated to 0. This triggered the use of the alternate page table memory area. Later on, during the post mortem investigation, pmap->pm_pdir[PTDPTDI] evaluates to the correct value. - Tor Egge From owner-freebsd-smp Tue Dec 10 21:23:33 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id VAA14969 for smp-outgoing; Tue, 10 Dec 1996 21:23:33 -0800 (PST) Received: from dyson.iquest.net (dyson.iquest.net [198.70.144.127]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id VAA14957 for ; Tue, 10 Dec 1996 21:23:29 -0800 (PST) Received: (from root@localhost) by dyson.iquest.net (8.8.2/8.6.9) id AAA00192; Wed, 11 Dec 1996 00:22:24 -0500 (EST) From: "John S. 
Dyson" Message-Id: <199612110522.AAA00192@dyson.iquest.net> Subject: Re: More info about fatal trap 12 To: Tor.Egge@idt.ntnu.no (Tor Egge) Date: Wed, 11 Dec 1996 00:22:24 -0500 (EST) Cc: peter@spinner.dialix.com, smp@bluenose.na.tuns.ca, smp@FreeBSD.org In-Reply-To: <199612110349.EAA00310@pat.idt.unit.no> from "Tor Egge" at Dec 11, 96 04:49:01 am Reply-To: dyson@FreeBSD.org X-Mailer: ELM [version 2.4 PL24 ME8] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@FreeBSD.org X-Loop: FreeBSD.org Precedence: bulk > > This indicates that cr3 was correct, PTDpde was correct, but > pmap->pm_pdir[PTDPTDI] evaluated to 0. This triggered the use of the > alternate page table memory area. > > Later on, during the post mortem investigation, pmap->pm_pdir[PTDPTDI] > evaluates to the correct value. > I am not watching things extremely closely on this front, but it smells like a missing pmap_update (or a defective one.) -- just a shot in the dark... Still lusting after my 2nd CPU :-). John From owner-freebsd-smp Wed Dec 11 13:39:16 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id NAA11345 for smp-outgoing; Wed, 11 Dec 1996 13:39:16 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id NAA11338 for freebsd-smp; Wed, 11 Dec 1996 13:39:14 -0800 (PST) Date: Wed, 11 Dec 1996 13:39:14 -0800 (PST) From: Steve Passe Message-Id: <199612112139.NAA11338@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/i386 mp_machdep.c sys/i386/include mpapic.h smptests.h sys/i386/isa icu.h vector.s Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/11 13:39:13 Modified: i386/i386 mp_machdep.c Log: fixed minor bug where SMP_INVLTLB needs icu.h, which was missing when "IMEN_NOT_MERGED" was defined. 
Revision Changes Path 1.34 +5 -1 sys/i386/i386/mp_machdep.c Modified: i386/include mpapic.h smptests.h Log: made "IMEN_NOT_MERGED" official version of kernel. this is just to provide continuity in CVS tree. IMEN_NOT_MERGED is no longer used, and will be removed from source as soon as I am sure the imen/IOApicMask merge is not the cause of our remaining "fatal trap12" problem. Revision Changes Path 1.8 +2 -1 sys/i386/include/mpapic.h 1.6 +9 -1 sys/i386/include/smptests.h Modified: i386/isa icu.h vector.s Log: made "IMEN_NOT_MERGED" official version of kernel. this is just to provide continuity in CVS tree. IMEN_NOT_MERGED is no longer used, and will be removed from source as soon as I am sure the imen/IOApicMask merge is not the cause of our remaining "fatal trap12" problem. icu.h and vector.s didn't have conditional compilation of the pre-merged code, add to keep the CVS tree consistent (see above comments). fix a bug in vector.s where APIC_IO without IPI_INTS had a bogus trailing ',' in a variable list. this caused an extra 0 to be added, spamming the swi* masks. 
Revision Changes Path 1.10 +21 -1 sys/i386/isa/icu.h 1.31 +27 -3 sys/i386/isa/vector.s From owner-freebsd-smp Wed Dec 11 15:15:30 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id PAA17184 for smp-outgoing; Wed, 11 Dec 1996 15:15:30 -0800 (PST) Received: from ormail.intel.com (ormail.intel.com [134.134.248.3]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id PAA17178 for ; Wed, 11 Dec 1996 15:15:27 -0800 (PST) Received: from ichips.intel.com (ichips.intel.com [134.134.50.200]) by ormail.intel.com (8.8.4/8.7.3) with ESMTP id PAA23537 for ; Wed, 11 Dec 1996 15:15:18 -0800 (PST) Received: from pdxcs078.intel.com by ichips.intel.com (8.7.4/jIII) id PAA22149; Wed, 11 Dec 1996 15:12:54 -0800 (PST) Received: by pdxcs078.intel.com (AIX 3.2/UCB 5.64/SW1.11) id AA58904; Wed, 11 Dec 1996 15:15:21 -0800 Message-Id: <9612112315.AA58904@pdxcs078.intel.com> To: freebsd-smp@freebsd.org Subject: some questions concerning TLB shootdowns in FreeBSD Date: Wed, 11 Dec 1996 15:15:21 -0800 From: Mike Haertel Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk I'm curious how/when people are doing TLB shootdowns. Obviously when reducing permission or unmapping pages. How about for manipulations of the dirty/accessed bits? (Does FreeBSD use these?) And how is the shootdown implemented? One simple method: interprocessor interrupt to everyone concerned everyone meets at barrier manipulate page tables everyone flushes appropriate TLB entries and resumes Or is it done in some less conservative fashion? 
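The conservative four-step method listed above can be modelled in user space. What follows is only a minimal sketch under stated assumptions, not FreeBSD's shootdown code: POSIX threads stand in for CPUs, an atomic flag stands in for the interprocessor interrupt, and a single integer stands in for a page table entry with a per-CPU cached copy playing the role of the TLB. Every name in it (NCPU, cpu_main, shootdown, shootdown_demo) is hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define NCPU 4                          /* hypothetical CPU count */

static atomic_int pte;                  /* the single simulated page table entry */
static atomic_int tlb[NCPU];            /* per-CPU cached copy of pte */
static atomic_int tlb_valid[NCPU];      /* 1 if tlb[i] holds a translation */
static atomic_int ipi_pending[NCPU];    /* stands in for a delivered IPI */
static atomic_int at_barrier;           /* responders parked at the barrier */
static atomic_int release_flag;         /* set once the PTE has been changed */
static atomic_int flushed;              /* responders that dropped their entry */
static atomic_int stop;                 /* tells the responder threads to exit */

/* Responder: uses its "TLB" until an "IPI" arrives, then follows the protocol. */
static void *cpu_main(void *arg)
{
    int me = (int)(intptr_t)arg;

    while (!atomic_load(&stop)) {
        if (atomic_exchange(&ipi_pending[me], 0)) {
            atomic_fetch_add(&at_barrier, 1);           /* step 2: meet at barrier */
            while (!atomic_load(&release_flag))
                sched_yield();
            atomic_store(&tlb_valid[me], 0);            /* step 4: flush the entry */
            atomic_fetch_add(&flushed, 1);              /* ...and resume */
        } else if (!atomic_load(&tlb_valid[me])) {
            atomic_store(&tlb[me], atomic_load(&pte));  /* TLB miss: reload */
            atomic_store(&tlb_valid[me], 1);
        } else {
            sched_yield();
        }
    }
    return NULL;
}

/* Initiator (CPU 0): runs the four steps and waits for every acknowledgement. */
static void shootdown(int new_pte)
{
    int i;

    atomic_store(&at_barrier, 0);
    atomic_store(&release_flag, 0);
    atomic_store(&flushed, 0);
    for (i = 1; i < NCPU; i++)                          /* step 1: IPI to everyone */
        atomic_store(&ipi_pending[i], 1);
    while (atomic_load(&at_barrier) != NCPU - 1)        /* step 2: wait at barrier */
        sched_yield();
    atomic_store(&pte, new_pte);                        /* step 3: change the PTE */
    atomic_store(&tlb_valid[0], 0);                     /* step 4: local flush */
    atomic_store(&release_flag, 1);
    while (atomic_load(&flushed) != NCPU - 1)           /* wait until all flushed */
        sched_yield();
}

/* Runs one shootdown; returns how many CPUs still see the old entry afterwards. */
int shootdown_demo(int old_pte, int new_pte)
{
    pthread_t th[NCPU];
    int i, stale = 0;

    assert(old_pte != new_pte);
    atomic_store(&stop, 0);
    atomic_store(&pte, old_pte);
    for (i = 1; i < NCPU; i++) {
        atomic_store(&tlb_valid[i], 0);
        atomic_store(&ipi_pending[i], 0);
        pthread_create(&th[i], NULL, cpu_main, (void *)(intptr_t)i);
    }
    for (i = 1; i < NCPU; i++)                          /* let each CPU cache old_pte */
        while (!atomic_load(&tlb_valid[i]))
            sched_yield();
    shootdown(new_pte);
    for (i = 1; i < NCPU; i++) {                        /* wait for each reload */
        while (!atomic_load(&tlb_valid[i]))
            sched_yield();
        if (atomic_load(&tlb[i]) != new_pte)
            stale++;
    }
    atomic_store(&stop, 1);
    for (i = 1; i < NCPU; i++)
        pthread_join(th[i], NULL);
    return stale;
}
```

Called from a small main() and built with the compiler's thread support enabled, shootdown_demo(0x1000, 0x2000) is expected to return 0. The point of the final wait in shootdown() is that the initiator does not continue until every other CPU has reported its flush, rather than stopping once the request has merely been sent.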
Thanks, Mike From owner-freebsd-smp Wed Dec 11 16:05:45 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id QAA21539 for smp-outgoing; Wed, 11 Dec 1996 16:05:45 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id QAA21529 for freebsd-smp; Wed, 11 Dec 1996 16:05:41 -0800 (PST) Date: Wed, 11 Dec 1996 16:05:41 -0800 (PST) From: Steve Passe Message-Id: <199612120005.QAA21529@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/isa vector.s Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/11 16:05:40 Modified: i386/isa vector.s Log: folded (APIC_IO && APIC_LAZY) and non-APIC_IO versions of INTR() into 1 macro. Suggested by: Peter Wemm Revision Changes Path 1.32 +47 -81 sys/i386/isa/vector.s From owner-freebsd-smp Wed Dec 11 16:52:10 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id QAA24796 for smp-outgoing; Wed, 11 Dec 1996 16:52:10 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id QAA24789 for freebsd-smp; Wed, 11 Dec 1996 16:52:08 -0800 (PST) Date: Wed, 11 Dec 1996 16:52:08 -0800 (PST) From: Steve Passe Message-Id: <199612120052.QAA24789@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/include smptests.h sys/i386/isa icu.s vector.s Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/11 16:52:07 Modified: i386/include smptests.h Log: removed APIC_LAZY test. the actions of "APIC_LAZY" are now default for APIC_IO. Revision Changes Path 1.7 +7 -32 sys/i386/include/smptests.h Modified: i386/isa icu.s vector.s Log: the actions of "APIC_LAZY" are now default for APIC_IO. 
Revision Changes Path 1.20 +8 -12 sys/i386/isa/icu.s 1.33 +32 -37 sys/i386/isa/vector.s From owner-freebsd-smp Thu Dec 12 00:44:24 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id AAA00289 for smp-outgoing; Thu, 12 Dec 1996 00:44:24 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id AAA00276 for freebsd-smp; Thu, 12 Dec 1996 00:44:22 -0800 (PST) Date: Thu, 12 Dec 1996 00:44:22 -0800 (PST) From: Steve Passe Message-Id: <199612120844.AAA00276@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/i386 mp_machdep.c mpapic.c pmap.c sys/i386/include apic.h mpapic.h smp.h sys/i386/isa icu.h vector.s Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/12 00:44:21 Modified: i386/i386 mp_machdep.c mpapic.c pmap.c i386/include apic.h mpapic.h smp.h i386/isa icu.h vector.s Log: another pass at preparing for multiple IO APICs. nothing more can be done till we go to the ">32 INT" model. code is bracketed with "MULTIPLE_IOAPICS". removed the BYTE register access macros for the IO APIC. it appears that long accesses work with all 'flavors' of IO APICS. 
Revision Changes Path 1.35 +27 -39 sys/i386/i386/mp_machdep.c 1.25 +57 -52 sys/i386/i386/mpapic.c 1.32 +4 -10 sys/i386/i386/pmap.c 1.16 +7 -12 sys/i386/include/apic.h 1.9 +29 -21 sys/i386/include/mpapic.h 1.26 +17 -19 sys/i386/include/smp.h 1.11 +11 -7 sys/i386/isa/icu.h 1.34 +4 -10 sys/i386/isa/vector.s From owner-freebsd-smp Thu Dec 12 01:53:04 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id BAA02731 for smp-outgoing; Thu, 12 Dec 1996 01:53:04 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id BAA02724 for freebsd-smp; Thu, 12 Dec 1996 01:53:03 -0800 (PST) Date: Thu, 12 Dec 1996 01:53:03 -0800 (PST) From: Steve Passe Message-Id: <199612120953.BAA02724@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/isa isa.c isa_device.h sio.c Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/12 01:53:03 Modified: i386/isa isa.c isa_device.h sio.c Log: created icu_irq_pending(), a function which examines the 8259 IRQ pending bits. this is needed by some device probes during boot, when the IO APIC is being used for actual INTerrupt service. code sio.c to use icu_irq_pending() during probe. 
Revision Changes Path 1.13 +16 -1 sys/i386/isa/isa.c 1.7 +3 -0 sys/i386/isa/isa_device.h 1.13 +10 -20 sys/i386/isa/sio.c From owner-freebsd-smp Thu Dec 12 02:13:04 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id CAA03842 for smp-outgoing; Thu, 12 Dec 1996 02:13:04 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id CAA03835 for freebsd-smp; Thu, 12 Dec 1996 02:13:01 -0800 (PST) Date: Thu, 12 Dec 1996 02:13:01 -0800 (PST) From: Steve Passe Message-Id: <199612121013.CAA03835@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/i386 mpboot.s Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/12 02:13:01 Modified: i386/i386 mpboot.s Log: removed an old set of unneeded nops. one more "FIXME" gone! Revision Changes Path 1.17 +2 -7 sys/i386/i386/mpboot.s From owner-freebsd-smp Thu Dec 12 02:38:36 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id CAA04706 for smp-outgoing; Thu, 12 Dec 1996 02:38:36 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id CAA04698 for freebsd-smp; Thu, 12 Dec 1996 02:38:34 -0800 (PST) Date: Thu, 12 Dec 1996 02:38:34 -0800 (PST) From: Steve Passe Message-Id: <199612121038.CAA04698@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/i386 swtch.s Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/12 02:38:33 Modified: i386/i386 swtch.s Log: do a proper r/m/w of APIC TPR on way out of cpu_switch(). 
Revision Changes Path 1.31 +7 -4 sys/i386/i386/swtch.s From owner-freebsd-smp Thu Dec 12 07:03:25 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id HAA15825 for smp-outgoing; Thu, 12 Dec 1996 07:03:25 -0800 (PST) Received: from spinner.DIALix.COM (root@spinner.DIALix.COM [192.203.228.67]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id HAA15816; Thu, 12 Dec 1996 07:03:08 -0800 (PST) Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1]) by spinner.DIALix.COM (8.8.4/8.8.4) with ESMTP id XAA03371; Thu, 12 Dec 1996 23:02:58 +0800 (WST) Message-Id: <199612121502.XAA03371@spinner.DIALix.COM> To: Steve Passe cc: freebsd-smp@freefall.freebsd.org Subject: Re: cvs commit: sys/i386/isa isa.c isa_device.h sio.c In-reply-to: Your message of "Thu, 12 Dec 1996 01:53:03 PST." <199612120953.BAA02724@freefall.freebsd.org> Date: Thu, 12 Dec 1996 23:02:58 +0800 From: Peter Wemm Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk Steve Passe wrote: > fsmp 96/12/12 01:53:03 > > Modified: i386/isa isa.c isa_device.h sio.c > Log: > created icu_irq_pending(), a function which examines the 8259 IRQ pending > bits. this is needed by some device probes during boot, when the IO APIC > is being used for actual INTerrupt service. > > code sio.c to use icu_irq_pending() during probe. Umm, silly question I guess, but does this code take LOWPRI delivery mode into account? If you're looking on the local apic, you won't see the pending interrupt if it's been sent to a different cpu.... But I guess this should be fine during boot though. 
(no, I've not read the code, I've just skimmed 3000 email messages and am about to start on a second pass :-] ) Cheers, -Peter From owner-freebsd-smp Thu Dec 12 10:39:09 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id KAA27193 for smp-outgoing; Thu, 12 Dec 1996 10:39:09 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id KAA27188 for ; Thu, 12 Dec 1996 10:39:06 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id LAA02012; Thu, 12 Dec 1996 11:38:46 -0700 Message-Id: <199612121838.LAA02012@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: Peter Wemm cc: freebsd-smp@freefall.freebsd.org Subject: Re: cvs commit: sys/i386/isa isa.c isa_device.h sio.c In-reply-to: Your message of "Thu, 12 Dec 1996 23:02:58 +0800." <199612121502.XAA03371@spinner.DIALix.COM> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 12 Dec 1996 11:38:46 -0700 Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk Hi, > Steve Passe wrote: > > fsmp 96/12/12 01:53:03 > > > > Modified: i386/isa isa.c isa_device.h sio.c > > Log: > > created icu_irq_pending(), a function which examines the 8259 IRQ pending > > bits. this is needed by some device probes during boot, when the IO APIC > > is being used for actual INTerrupt service. > > > > code sio.c to use icu_irq_pending() during probe. > > Umm, silly question I guess, but does this code take LOWPRI delivery mode > into account? If you're looking on the local apic, you won't see the > pending interrupt if it's been sent to a different cpu.... But I guess this > should be fine during boot though. (no, I've not read the code, I've just its just for boot. 
the sio is presuming all INTs are masked, then tickles the sio in a way that it expects to see a pending INT. no harm in letting it do that, it appears to work fine. the IO APIC isn't ready for use at this point, and because of all the complexity of the APIC exchanges it wouldn't make sense to use them for this anyways. -- Steve Passe | powered by smp@csn.net | FreeBSD From owner-freebsd-smp Fri Dec 13 13:04:19 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id NAA12477 for smp-outgoing; Fri, 13 Dec 1996 13:04:19 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id NAA12468 for freebsd-smp; Fri, 13 Dec 1996 13:04:17 -0800 (PST) Date: Fri, 13 Dec 1996 13:04:17 -0800 (PST) From: Steve Passe Message-Id: <199612132104.NAA12468@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/isa isa.c Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/13 13:04:16 Modified: i386/isa isa.c Log: removed an unneeded temp, fixing another 'FIXME'. Revision Changes Path 1.14 +4 -8 sys/i386/isa/isa.c From owner-freebsd-smp Fri Dec 13 14:36:06 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id OAA22678 for smp-outgoing; Fri, 13 Dec 1996 14:36:06 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id OAA22671 for freebsd-smp; Fri, 13 Dec 1996 14:36:04 -0800 (PST) Date: Fri, 13 Dec 1996 14:36:04 -0800 (PST) From: Steve Passe Message-Id: <199612132236.OAA22671@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/include spl.h sys/i386/isa icu.h vector.s Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/13 14:36:03 Modified: i386/include spl.h Log: use same imask layout for both IPI_INTS and non-IPI_INTS. Revision Changes Path 1.8 +3 -3 sys/i386/include/spl.h Modified: i386/isa icu.h vector.s Log: use same imask layout for both IPI_INTS and non-IPI_INTS. 
removed unnecessary "FIXME:" message from icu.h. Revision Changes Path 1.12 +0 -1 sys/i386/isa/icu.h 1.35 +33 -35 sys/i386/isa/vector.s From owner-freebsd-smp Fri Dec 13 15:02:56 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id PAA23998 for smp-outgoing; Fri, 13 Dec 1996 15:02:56 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id PAA23990 for freebsd-smp; Fri, 13 Dec 1996 15:02:54 -0800 (PST) Date: Fri, 13 Dec 1996 15:02:54 -0800 (PST) From: Steve Passe Message-Id: <199612132302.PAA23990@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/isa vector.s Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/13 15:02:54 Modified: i386/isa vector.s Log: fix a bug just introduced in the last commit of vector.s. my defines for defaulting NCPU was broken, now fixed & tested. Revision Changes Path 1.36 +2 -3 sys/i386/isa/vector.s From owner-freebsd-smp Fri Dec 13 15:47:55 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id PAA25251 for smp-outgoing; Fri, 13 Dec 1996 15:47:55 -0800 (PST) Received: from uruk.org (root@faustus.dev.com [198.145.95.253]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id PAA25243 for ; Fri, 13 Dec 1996 15:47:50 -0800 (PST) Received: from erich by uruk.org with local (Exim 0.53 #1) id E0vYiIU-0002rK-00; Fri, 13 Dec 1996 16:49:46 -0800 To: smp@freebsd.org Subject: Tried SMP kernel from early morning CVS tree Message-Id: From: "Erich Boleyn,,,," Date: Fri, 13 Dec 1996 16:49:46 -0800 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi all. I tried the SMP kernel from the CVS tree from early this morning. It still has the problem on my 4-CPU Pentium Pro test box where a long compile kills it by getting a kernel page fault in pmap_enter. There was one real bug (typo?) in "i386/isa/if_ze.c". 
There was, when using "APIC_IO", an undefined reference to "readIOApic24()" (I think that's what it was), which was bracketed by "#if defined(APIC_IO)" preprocessor stuff. After looking in some other files, they were using "INTRGET()" in the same way, so I just put it in place, and everything appears to work fine (though I'm not using that driver). I always compile the generic kernel + SMP stuff. Erich Boleyn From owner-freebsd-smp Fri Dec 13 16:56:05 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id QAA28478 for smp-outgoing; Fri, 13 Dec 1996 16:56:05 -0800 (PST) Received: (from fsmp@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id QAA28471 for freebsd-smp; Fri, 13 Dec 1996 16:56:03 -0800 (PST) Date: Fri, 13 Dec 1996 16:56:03 -0800 (PST) From: Steve Passe Message-Id: <199612140056.QAA28471@freefall.freebsd.org> To: freebsd-smp Subject: cvs commit: sys/i386/isa if_ze.c Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fsmp 96/12/13 16:56:02 Modified: i386/isa if_ze.c Log: fixed a bug introduced by new "MULTIPLE_IOAPICS" code. 
Submitted by: "Erich Boleyn,,,," Revision Changes Path 1.8 +2 -2 sys/i386/isa/if_ze.c From owner-freebsd-smp Fri Dec 13 17:08:36 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id RAA28762 for smp-outgoing; Fri, 13 Dec 1996 17:08:36 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id RAA28755 for ; Fri, 13 Dec 1996 17:08:25 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id SAA09847; Fri, 13 Dec 1996 18:08:10 -0700 Message-Id: <199612140108.SAA09847@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: "Erich Boleyn,,,," cc: smp@freebsd.org Subject: Re: Tried SMP kernel from early morning CVS tree In-reply-to: Your message of "Fri, 13 Dec 1996 16:49:46 PST." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 13 Dec 1996 18:08:09 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > Hi all. I tried the SMP kernel from the CVS tree from early this > morning. It still has the problem on my 4-CPU Pentium Pro test box > where a long compile kills it by getting a kernel page fault in > pmap_enter. I wish I knew how to help with this, but I have neither a P6 machine or skills/knowledge in this area... I've been taking advantage of the lull to go thru my code and cleanup a lot of little details that have fallen thru the cracks. --- > There was one real bug (typo?) in "i386/isa/if_ze.c". There was, > when using "APIC_IO", an undefined reference to "readIOApic24()" (I > think that's what it was), which was bracketed by "#if defined(APIC_IO)" > preprocessor stuff. 
After looking in some other files, they were > using "INTRGET()" in the same way, so I just put it in place, and > everything appears to work fine (though I'm not using that driver). thanx, your fix looks correct, I just committed it on freefall. -- Steve Passe | powered by smp@csn.net | FreeBSD From owner-freebsd-smp Fri Dec 13 17:46:02 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id RAA00543 for smp-outgoing; Fri, 13 Dec 1996 17:46:02 -0800 (PST) Received: from spinner.DIALix.COM (root@spinner.DIALix.COM [192.203.228.67]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id RAA00530 for ; Fri, 13 Dec 1996 17:45:53 -0800 (PST) Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1]) by spinner.DIALix.COM (8.8.4/8.8.4) with ESMTP id JAA13805; Sat, 14 Dec 1996 09:45:17 +0800 (WST) Message-Id: <199612140145.JAA13805@spinner.DIALix.COM> To: Steve Passe cc: "Erich Boleyn,,,," , smp@freebsd.org Subject: Re: Tried SMP kernel from early morning CVS tree In-reply-to: Your message of "Fri, 13 Dec 1996 18:08:09 MST." <199612140108.SAA09847@clem.systemsix.com> Date: Sat, 14 Dec 1996 09:45:16 +0800 From: Peter Wemm Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Steve Passe wrote: > Hi, > > > Hi all. I tried the SMP kernel from the CVS tree from early this > > morning. It still has the problem on my 4-CPU Pentium Pro test box > > where a long compile kills it by getting a kernel page fault in > > pmap_enter. > > I wish I knew how to help with this, but I have neither a P6 machine or > skills/knowledge in this area... I've been taking advantage of the lull to > go thru my code and cleanup a lot of little details that have fallen > thru the cracks. Same here, but I've been out of action for different reasons (like: doing some final work on a new house and preparing to move). 
There were some good details posted on this problem a few days ago from the other person with the P6 system, there is probably a good clue in there. My initial reaction to the details was that it almost looked like both cpu's accessed a shared data structure at nearly the same time, which should be impossible due to the locking. I can't imagine why this might be happening yet, but I must re-examine that part of the code. An extra local tlb flush might help, but I'm not 100% sure yet. On the other hand: % uname -a FreeBSD spinner.DIALix.COM 3.0-SMP FreeBSD 3.0-SMP #154: Thu Dec 5 02:26:10 WST 1996 peter@spinner.DIALix.COM:/home/peter/smp/sys/compile/SMP i386 % uptime 9:38AM up 9 days, 7:04, 12 users, load averages: 0.00, 0.01, 0.00 This machine has had the stuffing hammered out of it over the last week, I'm really happy with the stability of the P5 systems apart from the Floating point problems. This kernel was built with APIC_IO+APIC_LAZY+SMP_INVLTLB, and I've had none of the subtle corruption problems that I used to have before SMP_INVLTLB. It's been easy to forget that it's running SMP most of the time. :-) Cheers, -Peter From owner-freebsd-smp Fri Dec 13 18:33:56 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id SAA01904 for smp-outgoing; Fri, 13 Dec 1996 18:33:56 -0800 (PST) Received: from bluenose.na.tuns.ca (bluenose.na.tuns.ca [134.190.50.156]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id SAA01899 for ; Fri, 13 Dec 1996 18:33:53 -0800 (PST) Received: (from smp@localhost) by bluenose.na.tuns.ca (8.7.6/8.7.3) id WAA19725; Fri, 13 Dec 1996 22:36:14 -0400 (AST) From: "J.M. 
Chuang" Message-Id: <199612140236.WAA19725@bluenose.na.tuns.ca> Subject: Re: Tried SMP kernel from early morning CVS tree To: peter@spinner.dialix.com (Peter Wemm) Date: Fri, 13 Dec 1996 22:36:14 -0400 (AST) Cc: smp@freebsd.org In-Reply-To: <199612140145.JAA13805@spinner.DIALix.COM> from Peter Wemm at "Dec 14, 96 09:45:16 am" X-Mailer: ELM [version 2.4ME+ PL13 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > > > Hi all. I tried the SMP kernel from the CVS tree from early this > > > morning. It still has the problem on my 4-CPU Pentium Pro test box > > > where a long compile kills it by getting a kernel page fault in > > > pmap_enter. > > > > I wish I knew how to help with this, but I have neither a P6 machine or > > skills/knowledge in this area... I've been taking advantage of the lull to > > go thru my code and cleanup a lot of little details that have fallen > > thru the cracks. > > Same here, but I've been out of action for different reasons (like: > doing some final work on a new house and preparing to move). > > There were some good details posted on this problem a few days ago > from the other person with the P6 system, there is probably a good > clue in there. My initial reaction to the details was that it > almost looked like both cpu's accessed a shared data structure at > nearly the same time, which should be impossible due to the locking. > I can't imagine why this might be happening yet, but I must re-examine > that part of the code. An extra local tlb flush might help, but > I'm not 100% sure yet. How to do an extra local tlb flush? I found that if Dual P6 is booted from IDE drive with current smp-kernel+ SMP_INVLTLB, coredump and sig11 still show up right after the second CPU activated which is very similar to the problem of dual P5 booted from IDE with current smp-kernel without SMP_INVLTLB. 
Could this IDE problem for P6 be related to trap 12? Jim From owner-freebsd-smp Fri Dec 13 22:39:09 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id WAA14052 for smp-outgoing; Fri, 13 Dec 1996 22:39:09 -0800 (PST) Received: from uruk.org (root@faustus.dev.com [198.145.95.253]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id WAA14047 for ; Fri, 13 Dec 1996 22:39:06 -0800 (PST) Received: from uruk.org [127.0.0.1] (erich) by uruk.org with esmtp (Exim 0.53 #1) id E0vYohv-0003dK-00; Fri, 13 Dec 1996 23:40:27 -0800 To: Peter Wemm cc: smp@freebsd.org Subject: TLB shootdown problems? (was -> Re: Tried SMP kernel from early morning CVS tree ) In-reply-to: Your message of "Sat, 14 Dec 1996 09:45:16 +0800." <199612140145.JAA13805@spinner.DIALix.COM> Date: Fri, 13 Dec 1996 23:40:27 -0800 From: Erich Boleyn Message-Id: Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Peter Wemm writes: > There were some good details posted on this problem a few days ago > from the other person with the P6 system, there is probably a good > clue in there. My initial reaction to the details was that it > almost looked like both cpu's accessed a shared data structure at > nearly the same time, which should be impossible due to the locking. > I can't imagine why this might be happening yet, but I must re-examine > that part of the code. An extra local tlb flush might help, but > I'm not 100% sure yet. Here's a question (I'm going to look this up myself, but thought it'd be worthwhile to see if you'd shed light on it before I get to it on my copious spare time ;-) ... How exactly are TLB shootdown IPIs implemented? (or are they any different from any other IPIs?) From what I could see, it looks like the IPI is considered "finished" (and the function returns) when the APIC status is "delivered". 
This could be a problem, because the interrupt doesn't necessarily happen on the other CPU at that point (and it certainly isn't completed at that point). You really need some other mechanism to tell you that the operation has completed before you can continue. This might not be as major a problem on the P5 for implementation and shorter pipeline reasons, and the P6 also has deeper pipelines and is much faster relative to the external bus clock (which the APICs use). -- Erich Stefan Boleyn \_ E-mail (preferred): Mad Genius wanna-be, CyberMuffin \__ (finger me for other stats) Web: http://www.uruk.org/~erich/ Motto: "I'll live forever or die trying" From owner-freebsd-smp Fri Dec 13 23:06:25 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id XAA14856 for smp-outgoing; Fri, 13 Dec 1996 23:06:25 -0800 (PST) Received: from root.com (implode.root.com [198.145.90.17]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id XAA14851 for ; Fri, 13 Dec 1996 23:06:23 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by root.com (8.7.6/8.6.5) with SMTP id XAA00954; Fri, 13 Dec 1996 23:05:04 -0800 (PST) Message-Id: <199612140705.XAA00954@root.com> X-Authentication-Warning: implode.root.com: Host localhost [127.0.0.1] didn't use HELO protocol To: Mike Haertel cc: freebsd-smp@freebsd.org Subject: Re: some questions concerning TLB shootdowns in FreeBSD In-reply-to: Your message of "Wed, 11 Dec 1996 15:15:21 PST." <9612112315.AA58904@pdxcs078.intel.com> From: David Greenman Reply-To: dg@root.com Date: Fri, 13 Dec 1996 23:05:04 -0800 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >I'm curious how/when people are doing TLB shootdowns. >Obviously when reducing permission or unmapping pages. >How about for manipulations of the dirty/accessed bits? >(Does FreeBSD use these?) 
Speaking of the uni-processor case, FreeBSD does the access/modify bit changes in the pmap_changebit() function which does a TLB flush if anything is actually changed. -DG David Greenman Core-team/Principal Architect, The FreeBSD Project From owner-freebsd-smp Sat Dec 14 02:46:24 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id CAA21509 for smp-outgoing; Sat, 14 Dec 1996 02:46:24 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id CAA21504 for ; Sat, 14 Dec 1996 02:46:21 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id DAA14527; Sat, 14 Dec 1996 03:44:43 -0700 Message-Id: <199612141044.DAA14527@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: Erich Boleyn cc: Peter Wemm , haertel@ichips.intel.com, smp@freebsd.org Subject: Re: TLB shootdown problems? (was -> Re: Tried SMP kernel from early morning CVS tree ) In-reply-to: Your message of "Fri, 13 Dec 1996 23:40:27 PST." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Sat, 14 Dec 1996 03:44:43 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > Here's a question (I'm going to look this up myself, but thought it'd > be worthwhile to see if you'd shed light on it before I get to it on > my copious spare time ;-) ... > > How exactly are TLB shootdown IPIs implemented? (or are they any > different from any other IPIs?) > > >From what I could see, it looks like the IPI is considered "finished" > (and the function returns) when the APIC status is "delivered". This > could be a problem, because the interrupt doesn't necessarily happen > on the other CPU at that point (and it certainly isn't completed at > that point). 
You really need some other mechanism to tell you that
> the operation has completed before you can continue.

this is an accurate picture of the current situation. we just send it and
"assume" that things are now 'OK'. We know this isn't correct, it's just
step one on the way there. It made remarkable improvement on the P5
machines. So I guess the next step is a rendezvous mechanism to control
this. If anyone could suggest an effective algorithm for it I could take
a whack at programming it.

--
Steve Passe | powered by
smp@csn.net | FreeBSD

From owner-freebsd-smp Sat Dec 14 07:04:04 1996
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id HAA05521 for smp-outgoing; Sat, 14 Dec 1996 07:04:04 -0800 (PST)
Received: from spinner.DIALix.COM (root@spinner.DIALix.COM [192.203.228.67]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id HAA05483 for ; Sat, 14 Dec 1996 07:03:59 -0800 (PST)
Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1]) by spinner.DIALix.COM (8.8.4/8.8.4) with ESMTP id XAA17454; Sat, 14 Dec 1996 23:03:51 +0800 (WST)
Message-Id: <199612141503.XAA17454@spinner.DIALix.COM>
To: smp@freebsd.org
cc: Mike Haertel
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
In-reply-to: Your message of "Fri, 13 Dec 1996 23:05:04 PST." <199612140705.XAA00954@root.com>
Date: Sat, 14 Dec 1996 23:03:51 +0800
From: Peter Wemm
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

David Greenman wrote:
> >I'm curious how/when people are doing TLB shootdowns.
> >Obviously when reducing permission or unmapping pages.
> >How about for manipulations of the dirty/accessed bits?
> >(Does FreeBSD use these?)
>
> Speaking of the uni-processor case, FreeBSD does the access/modify bit
> changes in the pmap_changebit() function which does a TLB flush if anything
> is actually changed.
>
> -DG
>
> David Greenman
> Core-team/Principal Architect, The FreeBSD Project

Also, since I wrote the initial TLB flushing on top of Steve's IPI code, I
can freely admit that what I wrote is pretty sub-standard and does not go
far enough. There are a couple of major shortfalls:

1: It's async. It does not synchronise the remote processors as it must do,
   or they can get out of sync, slave processors can do updates on stale
   data, etc.

2: It does too much work. There are a lot of cases where a global flush
   is done for the local user process on the local cpu. I am not 100% sure
   whether this is needed or not. I can imagine that APTD accesses might
   present a problem if we try to avoid global flushes here.

3: We have no way of doing a local-only tlb flush with the trivial hack
   that I did to test the theory.

4: The CADDR/APTD hacks are potential problems from many angles. As long
   as we only have one cpu in the kernel "proper" at present, this shouldn't
   be too much of an issue yet - but it will be one of the things waiting
   to bite us later on.

There was the query about the possibility of speculative execution on the
PPro being the problem that is breaking the kernel. The scenario sounds
plausible, but my initial reaction to that was that we are doing this from
an _interrupt handler_, and I would be very surprised if speculative
execution from the original code thread isn't wound up before going into
the interrupt... If not, do we need some strategic nop's?

I'm still digesting it, I am almost worried that we might (shudder!) be
forced into doing an IPI to stop all the cpu's *before* the current cpu
changes the page tables, then letting them do the tlb flush and letting
them proceed. If this actually is a real problem this means a much bigger
code impact.
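[Editor's sketch, not part of the original message. The "stop everyone
first, then change the page tables, then let them flush" sequence worried
about above could look roughly like the following. It is a minimal model
using C11 atomics rather than kernel code; struct rendezvous, the rv_*
functions and the demo_* stubs are hypothetical names invented for this
illustration, and sending the IPI itself is left to the caller.]

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared state for one shootdown; not a FreeBSD structure. */
struct rendezvous {
    atomic_uint arrived;  /* targets that have entered their IPI handler */
    atomic_uint go;       /* set by the initiator once the PTE is changed */
    atomic_uint flushed;  /* CPUs that have flushed their TLB */
};

static void rv_init(struct rendezvous *rv)
{
    atomic_init(&rv->arrived, 0);
    atomic_init(&rv->go, 0);
    atomic_init(&rv->flushed, 0);
}

/* Target side, first half: announce arrival at the barrier. */
static void rv_target_arrive(struct rendezvous *rv)
{
    atomic_fetch_add(&rv->arrived, 1);
}

/* Target side, second half: spin until released, then flush. */
static void rv_target_finish(struct rendezvous *rv, void (*flush_tlb)(void))
{
    assert(flush_tlb);
    while (atomic_load(&rv->go) == 0)
        ;  /* spin; a real handler would do this inside the IPI handler */
    flush_tlb();
    atomic_fetch_add(&rv->flushed, 1);
}

/*
 * Initiator side. The caller is assumed to have sent the IPI already.
 * Wait until all targets are parked, change the PTE, release them,
 * then flush the local TLB as well.
 */
static void rv_initiate(struct rendezvous *rv, unsigned ntargets,
                        void (*change_pte)(void), void (*flush_tlb)(void))
{
    assert(change_pte && flush_tlb);
    while (atomic_load(&rv->arrived) < ntargets)
        ;  /* barrier: nobody is touching the mapping past this point */
    change_pte();
    atomic_store(&rv->go, 1);
    flush_tlb();
    atomic_fetch_add(&rv->flushed, 1);
}

/* Demo stubs that only count calls; stand-ins for the real operations. */
static unsigned pte_changes, tlb_flushes;
static void demo_change_pte(void) { pte_changes++; }
static void demo_flush_tlb(void) { tlb_flushes++; }
```

[In this model the PTE change can only happen once every target has
arrived, and no target flushes before the change is visible. Whether the
real cost of parking every CPU is acceptable is exactly the open question
in the message above.]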
Cheers, -Peter From owner-freebsd-smp Sat Dec 14 09:19:02 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id JAA16823 for smp-outgoing; Sat, 14 Dec 1996 09:19:02 -0800 (PST) Received: from uruk.org (root@faustus.dev.com [198.145.95.253]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id JAA16812 for ; Sat, 14 Dec 1996 09:18:57 -0800 (PST) Received: from uruk.org [127.0.0.1] (erich) by uruk.org with esmtp (Exim 0.53 #1) id E0vYyUV-0004qa-00; Sat, 14 Dec 1996 10:07:15 -0800 To: Steve Passe cc: peter@spinner.dialix.com, haertel@ichips.intel.com, smp@freebsd.org Subject: Re: TLB shootdown problems? (was -> Re: Tried SMP kernel from early morning CVS tree ) In-reply-to: Your message of "Sat, 14 Dec 1996 03:44:43 MST." <199612141044.DAA14527@clem.systemsix.com> Date: Sat, 14 Dec 1996 10:07:15 -0800 From: Erich Boleyn Message-Id: Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Steve Passe writes: > Hi, > > > Here's a question (I'm going to look this up myself, but thought it'd > > be worthwhile to see if you'd shed light on it before I get to it on > > my copious spare time ;-) ... > > > > How exactly are TLB shootdown IPIs implemented? (or are they any > > different from any other IPIs?) > > > > >From what I could see, it looks like the IPI is considered "finished" > > (and the function returns) when the APIC status is "delivered". This > > could be a problem, because the interrupt doesn't necessarily happen > > on the other CPU at that point (and it certainly isn't completed at > > that point). You really need some other mechanism to tell you that > > the operation has completed before you can continue. > > this is an accurate picture of the current situation. we just send it and > "assumme" that things are now 'OK'. We know this isn't correct, its just > step one on the way there. It made remarkable improvement on the P5 > machines. So I guess the next step is a rendezvous mechanism to control > this. 
If anyone could suggest an effective algorithm for it I could take a
> whack at programming it.

Yes, that was what I thought.

The easiest (and maybe best performing) thing to do is have the sender
spin waiting on bits being twiddled in global memory, then have the
target CPUs' IPI handlers do such twiddling.

The real question at this point is: Can only one TLB shootdown be in
progress at any one time? If so, a good example to look at is Linux-SMP:

Linux-SMP has a bitwise (since SMP-capable x86es have bitwise test and
test-and-set operators) mask "smp_invalidate_needed". There is one bit
for each CPU. When an invalidate is needed on a particular CPU, the
corresponding bit is set atomically. Whenever a TLB invalidate is made
on a particular CPU, the corresponding bit is unset atomically.

There are ways to play with that so not all CPUs need be sent messages
all the time, plus Linux-SMP does TLB invalidates in its global spinlock,
etc. It also doesn't necessarily need to try to send the "smp_invalidate"
message right after the pmap change, just when it expects to need to see
it locally or globally... this allows time in which other CPUs could do
invalidates.

This kind of thing would provide a moderate base on which to make it more
fine-grained over time. A simple version which could use the same
mechanism would be to have the IPI handler do the right thing, but just
have the "smp_invalidate" message set all the "smp_invalidate_needed" bits
(except our own!) for now, to get everything working. Avoiding setting
bits for CPUs which don't need invalidates could be done later this way
without changing the reception mechanism at all.

For a kernel architecture which is multi-threaded/re-entrant, things get
more complicated.
I still have an algorithm in mind, but it's just a bit long to put here right now (essentially, you have to be able to guarantee if there are multiple TLB invalidates flying around, that both the right things happen, and they both terminate reasonably). -- Erich Stefan Boleyn \_ E-mail (preferred): Mad Genius wanna-be, CyberMuffin \__ (finger me for other stats) Web: http://www.uruk.org/~erich/ Motto: "I'll live forever or die trying" From owner-freebsd-smp Sat Dec 14 09:26:12 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id JAA17392 for smp-outgoing; Sat, 14 Dec 1996 09:26:12 -0800 (PST) Received: from ormail.intel.com (ormail.intel.com [134.134.248.3]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id JAA17384 for ; Sat, 14 Dec 1996 09:26:08 -0800 (PST) From: haertel@ichips.intel.com Received: from ichips.intel.com (ichips.intel.com [134.134.50.200]) by ormail.intel.com (8.8.4/8.7.3) with ESMTP id JAA12513; Sat, 14 Dec 1996 09:25:48 -0800 (PST) Received: from pdxcs078.intel.com by ichips.intel.com (8.7.4/jIII) id JAA28180; Sat, 14 Dec 1996 09:23:01 -0800 (PST) Received: by pdxcs078.intel.com (AIX 3.2/UCB 5.64/SW1.11) id AA57406; Sat, 14 Dec 1996 09:25:51 -0800 Date: Sat, 14 Dec 1996 09:25:51 -0800 Message-Id: <9612141725.AA57406@pdxcs078.intel.com> To: peter@spinner.dialix.com Subject: Re: some questions concerning TLB shootdowns in FreeBSD Cc: dg@root.com, smp@freebsd.org Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >I'm still digesting it, I am almost worried that we might (shudder!) >be forced into doing an IPI to stop all the cpu's *before* the >current cpu changes the page tables, then letting them do the tlb >flush and letting them proceed. If this actually is a real problem >this means a much bigger code impact. You must do precisely this. 
The x86 architecture includes some complex instructions that
reference the same memory locations more than once--read-modify-write
sequences are the most obvious example. For various reasons,
there is no guarantee that the TLB entries associated with those
memory locations are locked in the TLB, and so they might be
thrashed out due to other activity while those complex instructions
are executing. If, in the meantime, some other processor
has manipulated the associated PTE in any way that lowers privilege
or changes the mapping, this processor could get a page fault
in a *non restartable* way, since it would see the mapping and/or
privilege changing under foot, but have already committed to
finishing the instruction (since the privilege checks are
normally only done at the beginning of the instruction).

As for your other question: speculative execution does not continue
past an interrupt. An interrupt is a totally serializing event.
However, once you're in the interrupt handler, speculative execution
could go down a different path than you think of the interrupt as
actually taking. Basically every time the processor fetches something
from the Icache that it thinks *might* contain a branch, it is an
opportunity for the processor to go off into la-la land, since it
will simply ask the branch predictor what it thinks and go that way.

The effect of this is speculative pollution of the non-renamed
state of the processor like the cache and the TLB entries.
So, for example, in the uniprocessor case, doing this:

    1. flush TLB
    2. manipulate PTE

is not safe, since after (1), the processor may waltz speculatively
off to some code that actually references the PTE before you
manipulate it. Instead you must always:

    1. Manipulate PTE
    2. flush TLB

On multiprocessors, there is the additional concern of corrupting
state which must remain invariant during instruction execution on
other processors. So then you need the fully bulletproof code:

    1. IPI to everyone sharing these specific PTE's
    2. wait at barrier until everyone arrives
    3. manipulate PTE
    4. release barrier
    5. everyone (including us) flushes TLB's

Bleah, I know.

From owner-freebsd-smp Sat Dec 14 09:27:43 1996
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id JAA17548 for smp-outgoing; Sat, 14 Dec 1996 09:27:43 -0800 (PST)
Received: from uruk.org (root@faustus.dev.com [198.145.95.253]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id JAA17536 for ; Sat, 14 Dec 1996 09:27:38 -0800 (PST)
Received: from uruk.org [127.0.0.1] (erich) by uruk.org with esmtp (Exim 0.53 #1) id E0vYypx-0004tV-00; Sat, 14 Dec 1996 10:29:25 -0800
To: Peter Wemm
cc: smp@freebsd.org, haertel@ichips.intel.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
In-reply-to: Your message of "Sat, 14 Dec 1996 23:03:51 +0800." <199612141503.XAA17454@spinner.DIALix.COM>
Date: Sat, 14 Dec 1996 10:29:25 -0800
From: Erich Boleyn
Message-Id:
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Peter Wemm writes:

> 1: It's async. it does not synchronise the remote processors as it must do,
> or they can get out of sync, slave processors can do updates on stale
> data, etc.

As mentioned in another message, this is bad.

> 2: It does too much work. There are a lot of cases where a global flush
> is done for the local user process on the local cpu. I am not 100% sure
> whether this is needed or not. I can imagine that APTD accesses might
> present a problem if we try to avoid global flushes here.

This is perfectly OK from a functional point of view. Personally, I think
efficiency is less important than getting it to work at this point.

> There was the query about the possibility of speculative execution
> on the PPro being the problem that is breaking the kernel.
The
> scenario sounds plausible, but my initial reaction to that was that
> we are doing this from an _interrupt handler_, and I would be very
> surprised if speculative execution from the original code thread
> isn't wound up before going into the interrupt... If not, do we
> need some strategic nop's?

No! Speculative execution which broke interrupt handlers would be very
bad, in a lot of systems. Perhaps Mike Haertel can comment more clearly,
but my memory claims these kinds of actions were serialized. There are
actually some cases which can break, but as far as I know these are all
bus-propagation issues to external devices.

However, I think the IPI can be considered delivered, and that doesn't
guarantee that the CPU has been interrupted (what about interrupts being
masked, for example?). I think it just says the interrupt was accepted by
the queue on the other APIC.

> I'm still digesting it, I am almost worried that we might (shudder!)
> be forced into doing an IPI to stop all the cpu's *before* the
> current cpu changes the page tables, then letting them do the tlb
> flush and letting them proceed. If this actually is a real problem
> this means a much bigger code impact.

I don't think so, but to allay your fears, note that if some page
permissions are changed:

  1) Increasing permission is OK, because that should simply cause a
     false page-fault.

  2) Decreasing permissions can cause the situation where thread/process A
     (perhaps a kernel thread) can be trying to deallocate a page in
     thread/process B which is in progress of accessing the data in that
     page (or might be).

#2 might be considered a race condition, but it also looks like a natural
timing problem that you can't get around anyway.
As long as there is some real rendezvous mechanism (such as mentioned in
my last message) for TLB shootdown IPIs to be acknowledged before the
sending CPU continues, you're guaranteeing that the original thread can't
continue until the other CPU's TLBs are really cleared, which is all that
seems important.

--
Erich Stefan Boleyn \_ E-mail (preferred):
Mad Genius wanna-be, CyberMuffin \__ (finger me for other stats)
Web: http://www.uruk.org/~erich/ Motto: "I'll live forever or die trying"

From owner-freebsd-smp Sat Dec 14 10:00:18 1996
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id KAA21397 for smp-outgoing; Sat, 14 Dec 1996 10:00:18 -0800 (PST)
Received: from uruk.org (root@faustus.dev.com [198.145.95.253]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id KAA21390 for ; Sat, 14 Dec 1996 10:00:14 -0800 (PST)
Received: from uruk.org [127.0.0.1] (erich) by uruk.org with esmtp (Exim 0.53 #1) id E0vYzLc-0004xo-00; Sat, 14 Dec 1996 11:02:08 -0800
To: haertel@ichips.intel.com
cc: smp@freebsd.org, dg@root.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
In-reply-to: Your message of "Sat, 14 Dec 1996 09:25:51 PST." <9612141725.AA57406@pdxcs078.intel.com>
Date: Sat, 14 Dec 1996 11:02:08 -0800
From: Erich Boleyn
Message-Id:
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

haertel@ichips.intel.com (Mike Haertel) writes:

> >I'm still digesting it, I am almost worried that we might (shudder!)
> >be forced into doing an IPI to stop all the cpu's *before* the
> >current cpu changes the page tables, then letting them do the tlb
> >flush and letting them proceed. If this actually is a real problem
> >this means a much bigger code impact.
>
> You must do precisely this.
>
> The x86 architecture includes some complex instructions that
> reference the same memory locations more than once--read-modify-write
> sequences are the most obvious example.
For various reasons, > there is no guarantee that the TLB entries associated with those > memory locations are locked in the TLB, and so they might be > thrashed out due to other activity while those complex instructions > are executing. If, in the meantime, some other processor > has manipulated the associated PTE in any way that lowers privilege > or changes the mapping, this processor could get a page fault > in a *non restartable* way, since it would see the mapping and/or > privilege changing under foot, but have already committed to > finishing the instruction (since the privilege checks are > normally only done at the beginning of the instruction). Urk! Thanks for clarifying this. I'm curious as to why this hasn't been a problem on Linux-SMP ... -- Erich Stefan Boleyn \_ E-mail (preferred): Mad Genius wanna-be, CyberMuffin \__ (finger me for other stats) Web: http://www.uruk.org/~erich/ Motto: "I'll live forever or die trying" From owner-freebsd-smp Sat Dec 14 10:39:26 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id KAA26358 for smp-outgoing; Sat, 14 Dec 1996 10:39:26 -0800 (PST) Received: from dyson.iquest.net (dyson.iquest.net [198.70.144.127]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id KAA26347 for ; Sat, 14 Dec 1996 10:39:23 -0800 (PST) Received: (from root@localhost) by dyson.iquest.net (8.8.2/8.6.9) id NAA00208; Sat, 14 Dec 1996 13:38:47 -0500 (EST) From: "John S. 
Dyson"
Message-Id: <199612141838.NAA00208@dyson.iquest.net>
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
To: peter@spinner.dialix.com (Peter Wemm)
Date: Sat, 14 Dec 1996 13:38:47 -0500 (EST)
Cc: smp@freebsd.org, haertel@ichips.intel.com
In-Reply-To: <199612141503.XAA17454@spinner.DIALix.COM> from "Peter Wemm" at Dec 14, 96 11:03:51 pm
Reply-To: dyson@freebsd.org
X-Mailer: ELM [version 2.4 PL24 ME8]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>
> I'm still digesting it, I am almost worried that we might (shudder!)
> be forced into doing an IPI to stop all the cpu's *before* the
> current cpu changes the page tables, then letting them do the tlb
> flush and letting them proceed. If this actually is a real problem
> this means a much bigger code impact.
>
The way that I see it, is that the current pmap code is highly optimized
for single processor operation. If I were you, I would just try to get
something working correctly algorithmically -- almost ignoring
performance issues. Of course, when performance is easy -- go for that
also.

A lot of things like single page invalidates inside of loops appear that
they could be evil for multi-processor applications (imagine an
inter-processor interrupt for every loop!?!?.) I think that you (we or us),
will have to look at the performance for the SMP direction, and it
might even entail large differences in pmap eventually. Hopefully,
we will all be able to isolate the differences for the maintenance of
sanity :-).
John

From owner-freebsd-smp Sat Dec 14 10:57:16 1996
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id KAA28629 for smp-outgoing; Sat, 14 Dec 1996 10:57:16 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id KAA28621 for ; Sat, 14 Dec 1996 10:57:12 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id LAA16724; Sat, 14 Dec 1996 11:55:18 -0700
Message-Id: <199612141855.LAA16724@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: haertel@ichips.intel.com
cc: peter@spinner.dialix.com, dg@root.com, smp@freebsd.org, toor@dyson.iquest.net
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
In-reply-to: Your message of "Sat, 14 Dec 1996 09:25:51 PST." <9612141725.AA57406@pdxcs078.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 14 Dec 1996 11:55:18 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

>On multiprocessors, there is the additional concern of corrupting
>state which must remain invariant during instruction execution on
>other processors. So then you need the fully bulletproof code:
>
>    1. IPI to everyone sharing these specific PTE's
>    2. wait at barrier until everyone arrives
>    3. manipulate PTE
>    4. release barrier
>    5. everyone (including us) flushes TLB's

this was my concern but I didn't know how to word it so concisely!

right now the code looks like:

    /* edited version to show the general idea: */
    invlpg(u_int addr)
    {
        __asm __volatile("invlpg (%0)" : : "r" (addr) : "memory");
        allButSelfIPI(ICU_OFFSET+27);
    }

so some routine modifies a PTE, then calls invlpg(). this works for itself,
as it won't try to use the stale page between modifying the PTE and
flushing its TLB.
However the other CPUs are running async, and may access the page in
question between the time the 1st CPU changes the PTE and the time they
receive the IPI. It seems like the rfork() situation where we seem to be
getting hit is particularly prone to tripping over this.

The above proposed algorithm seems like the only safe method of dealing
with the problem...

--
Steve Passe | powered by
smp@csn.net | FreeBSD

From owner-freebsd-smp Sat Dec 14 13:25:39 1996
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id NAA14411 for smp-outgoing; Sat, 14 Dec 1996 13:25:39 -0800 (PST)
Received: from tfs.com (tfs.com [140.145.250.1]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id NAA14404; Sat, 14 Dec 1996 13:25:32 -0800 (PST)
Received: from critter.tfs.com by tfs.com (smail3.1.28.1) with SMTP id m0vZ1GT-0003wLC; Sat, 14 Dec 96 13:04 PST
Received: from critter.tfs.com (localhost [127.0.0.1]) by critter.tfs.com (8.8.2/8.8.2) with ESMTP id VAA05559; Sat, 14 Dec 1996 21:06:28 +0100 (MET)
To: dyson@freebsd.org
cc: peter@spinner.dialix.com (Peter Wemm), smp@freebsd.org, haertel@ichips.intel.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
In-reply-to: Your message of "Sat, 14 Dec 1996 13:38:47 EST." <199612141838.NAA00208@dyson.iquest.net>
Date: Sat, 14 Dec 1996 21:06:28 +0100
Message-ID: <5557.850593988@critter.tfs.com>
From: Poul-Henning Kamp
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

In message <199612141838.NAA00208@dyson.iquest.net>, "John S. Dyson" writes:

>The way that I see it, is that the current pmap code is highly optimized
>for single processor operation. If I were you, I would just try to get
>something working correctly algorithmically -- almost ignoring
>performance issues. Of course, when performance is easy -- go for that
>also.
> >Alot of things like single page invalidates inside of loops appear that >they could be evil for multi-processor applications (imagine an inter- >processor interrupt for every loop!?!?.) I think that you (we or us), >will have to look at the performance for the SMP direction, and it >might even entail large differences in pmap eventually. Hopefully, >we will all be able to isolate the differences for the maintenance of >sanity :-). The crucial thing, as far as I can see, is to find out >if< we need to tell the other CPU's about this change to the pagetables. For a 2cpu system the penalty of stopping the other CPU is still within bounds of the reasonable, but stopping three CPUs needlessly is not a good idea. Is there any cheap way to keep a refcount (or bitmap) per vm-object so we can see if we need to kick the other CPUs if we fiddle it ? -- Poul-Henning Kamp | phk@FreeBSD.ORG FreeBSD Core-team. http://www.freebsd.org/~phk | phk@login.dknet.dk Private mailbox. whois: [PHK] | phk@tfs.com TRW Financial Systems, Inc. Power and ignorance is a disgusting cocktail. 
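[Editor's sketch, not part of the original thread. The bitmap idea asked
about above could be modelled as follows: keep one bit per CPU that may
hold TLB entries for a given address space, and only interrupt CPUs whose
bit is set. The names struct as_cpus, as_cpu_enter, as_cpu_leave and
shootdown_targets are hypothetical, the 32-CPU limit is an assumption, and
a real kernel would need atomic updates of the mask.]

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 32  /* assumption: one bit per CPU in a 32-bit mask */

/* Hypothetical per-address-space record; not a FreeBSD structure. */
struct as_cpus {
    uint32_t active;  /* bit n set: CPU n may hold TLB entries for it */
};

/* Record that 'cpu' has started running with this address space. */
static void as_cpu_enter(struct as_cpus *as, unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    as->active |= (uint32_t)1 << cpu;  /* would be an atomic OR in a kernel */
}

/* Record that 'cpu' has switched away and flushed its own TLB. */
static void as_cpu_leave(struct as_cpus *as, unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    as->active &= ~((uint32_t)1 << cpu);
}

/*
 * Mask of CPUs other than 'self' that would need a shootdown IPI.
 * A result of 0 means the page tables can be changed with only a
 * local flush and no other CPU has to be stopped.
 */
static uint32_t shootdown_targets(const struct as_cpus *as, unsigned self)
{
    assert(self < MAX_CPUS);
    return as->active & ~((uint32_t)1 << self);
}
```

[With a mask like this, a 4-CPU box changing a mapping that only the local
CPU uses would send no IPIs at all, which is the saving the question above
is after. Keeping the mask accurate across context switches is the part
this sketch does not address.]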
From owner-freebsd-smp Sat Dec 14 13:35:32 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id NAA15308 for smp-outgoing; Sat, 14 Dec 1996 13:35:32 -0800 (PST) Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id NAA15297; Sat, 14 Dec 1996 13:35:25 -0800 (PST) Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id OAA22198; Sat, 14 Dec 1996 14:12:47 -0700 From: Terry Lambert Message-Id: <199612142112.OAA22198@phaeton.artisoft.com> Subject: Re: some questions concerning TLB shootdowns in FreeBSD To: phk@critter.tfs.com (Poul-Henning Kamp) Date: Sat, 14 Dec 1996 14:12:47 -0700 (MST) Cc: dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org, haertel@ichips.intel.com In-Reply-To: <5557.850593988@critter.tfs.com> from "Poul-Henning Kamp" at Dec 14, 96 09:06:28 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > The crucial thing, as far as I can see, is to find out >if< we need to > tell the other CPU's about this change to the pagetables. For a 2cpu > system the penalty of stopping the other CPU is still within bounds > of the reasonable, but stopping three CPUs needlessly is not a good idea. > > Is there any cheap way to keep a refcount (or bitmap) per vm-object so > we can see if we need to kick the other CPUs if we fiddle it ? Oh, I like this. It would make it very easy to have multiple references in the UP case, as well as the MP case. This would let us do device/offset as well as vnode/offset based caching (for instance, hanging all cache buffers for vnodes on a device off the device vnode). I've wanted this for some time, since I am determined that vclean must die... 
Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. From owner-freebsd-smp Sat Dec 14 13:43:25 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id NAA16038 for smp-outgoing; Sat, 14 Dec 1996 13:43:25 -0800 (PST) Received: from tfs.com (tfs.com [140.145.250.1]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id NAA16024; Sat, 14 Dec 1996 13:43:21 -0800 (PST) Received: from critter.tfs.com by tfs.com (smail3.1.28.1) with SMTP id m0vZ1r2-0003vlC; Sat, 14 Dec 96 13:42 PST Received: from critter.tfs.com (localhost.phk.dk [127.0.0.1]) by critter.tfs.com (8.8.2/8.8.2) with ESMTP id WAA08258; Sat, 14 Dec 1996 22:45:49 +0100 (MET) To: Terry Lambert cc: dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org, haertel@ichips.intel.com Subject: Re: some questions concerning TLB shootdowns in FreeBSD In-reply-to: Your message of "Sat, 14 Dec 1996 14:12:47 MST." <199612142112.OAA22198@phaeton.artisoft.com> Date: Sat, 14 Dec 1996 22:45:49 +0100 Message-ID: <8256.850599949@critter.tfs.com> From: Poul-Henning Kamp Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk In message <199612142112.OAA22198@phaeton.artisoft.com>, Terry Lambert writes: >> The crucial thing, as far as I can see, is to find out >if< we need to >> tell the other CPU's about this change to the pagetables. For a 2cpu >> system the penalty of stopping the other CPU is still within bounds >> of the reasonable, but stopping three CPUs needlessly is not a good idea. >> >> Is there any cheap way to keep a refcount (or bitmap) per vm-object so >> we can see if we need to kick the other CPUs if we fiddle it ? > >Oh, I like this. > >It would make it very easy to have multiple references in the UP case, >as well as the MP case. > >This would let us do device/offset as well as vnode/offset based caching >(for instance, hanging all cache buffers for vnodes on a device off the >device vnode). 
> >I've wanted this for some time, since I am determined that vclean must >die... > Cool. send patches when done :-) -- Poul-Henning Kamp | phk@FreeBSD.ORG FreeBSD Core-team. http://www.freebsd.org/~phk | phk@login.dknet.dk Private mailbox. whois: [PHK] | phk@tfs.com TRW Financial Systems, Inc. Power and ignorance is a disgusting cocktail. From owner-freebsd-smp Sat Dec 14 13:55:16 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id NAA17678 for smp-outgoing; Sat, 14 Dec 1996 13:55:16 -0800 (PST) Received: from dyson.iquest.net (dyson.iquest.net [198.70.144.127]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id NAA17668; Sat, 14 Dec 1996 13:55:09 -0800 (PST) Received: (from root@localhost) by dyson.iquest.net (8.8.2/8.6.9) id QAA05435; Sat, 14 Dec 1996 16:54:09 -0500 (EST) From: "John S. Dyson" Message-Id: <199612142154.QAA05435@dyson.iquest.net> Subject: Re: some questions concerning TLB shootdowns in FreeBSD To: terry@lambert.org (Terry Lambert) Date: Sat, 14 Dec 1996 16:54:09 -0500 (EST) Cc: phk@critter.tfs.com, dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org, haertel@ichips.intel.com In-Reply-To: <199612142112.OAA22198@phaeton.artisoft.com> from "Terry Lambert" at Dec 14, 96 02:12:47 pm X-Mailer: ELM [version 2.4 PL24 ME8] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > > This would let us do device/offset as well as vnode/offset based caching > (for instance, hanging all cache buffers for vnodes on a device off the > device vnode). > > I've wanted this for some time, since I am determined that vclean must > die... > Slightly off subject, but I plan to sometime carry the vnode/offset caching to a more generalized scheme that also encompasses device/offset caching. Specifically, device/offset is the same as vnode/offset. This will allow us to cache data without the vnode. 
However, we will continue to have the advantages of the current vnode/offset scheme. John
From owner-freebsd-smp Sat Dec 14 13:56:03 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id NAA17809 for smp-outgoing; Sat, 14 Dec 1996 13:56:03 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id NAA17776 for ; Sat, 14 Dec 1996 13:55:55 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id OAA17577; Sat, 14 Dec 1996 14:51:50 -0700 Message-Id: <199612142151.OAA17577@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: Steve Passe cc: haertel@ichips.intel.com, peter@spinner.dialix.com, dg@root.com, smp@freebsd.org, toor@dyson.iquest.net Subject: Re: some questions concerning TLB shootdowns in FreeBSD In-reply-to: Your message of "Sat, 14 Dec 1996 11:55:18 MST." <199612141855.LAA16724@clem.systemsix.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Sat, 14 Dec 1996 14:51:50 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk

Hi,

so here is a suggested set of code for the TLB sync problem. There is one
problem with it (that I currently see, surely others also!) in that the
target CPUs can't service their IPI as the invoking CPU holds the mp_lock.
So for now let's pretend that we have a separate lock for IPIs called
ipi_lock, which is manipulated via get_ipilock()/rel_ipilock().

---
usage by the invoking CPU:

	startRendezvous();	/* setup a rendezvous */

	/*
	 * at this point the other CPUs are all spinning on the end lock
	 * so the code can safely muck with PTD/PTE entries...
	 */

	invltlb();		/* CPU flushes local TLB */
	endRendezvous();	/* end the rendezvous */

---
usage by the invoked CPUs, ie the routine invoked by the IPI:

void
ipi_invltlb(void)
{
	u_long temp;

	doRendezvous();		/* declare our arrival and wait */

	/* reload %cr3 with its own value to flush this CPU's TLB */
	__asm __volatile("movl %%cr3, %0; movl %0, %%cr3"
			 : "=r" (temp) : : "memory");
}

----------------------------------- cut -------------------------------------
/* rendezvous.s */

	.text
	.align 4

#define SMP_INVLTLB_IPI	(ICU_OFFSET+27)

/*
 * invoking CPU sets up rendezvous
 */
ENTRY(startRendezvous)
	call	_get_ipilock		/* only one CPU at a time */
	movl	_mp_ncpus, %eax		/* # of CPUs to sync */
	decl	%eax			/* don't count ourself */
	movl	%eax, _rendezvousCount	/* init the downcounter */
	movl	%eax, _rendezvousEnd	/* init the release lock */
	pushl	$SMP_INVLTLB_IPI	/* IPI vector, passed as an immediate */
	call	_allButSelfIPI
	addl	$4, %esp
	call	_rel_ipilock		/* now safe for other CPUs */
1:
	cmpl	$0, _rendezvousCount	/* check current value */
	jnz	1b			/* somebody not here yet */
	call	_get_ipilock		/* is this necessary??? */
	ret

/*
 * invoking CPU releases all other CPUs
 */
ENTRY(endRendezvous)
	movl	$0, _rendezvousEnd
	call	_rel_ipilock		/* is this necessary??? */
	ret

/*
 * invoked CPUs enter and wait for end
 */
ENTRY(doRendezvous)
	call	_rel_ipilock		/* allow other CPUs to IPI */
	lock				/* ensure atomic operation */
	decl	_rendezvousCount	/* declare our arrival */
1:
	cmpl	$0, _rendezvousEnd	/* test for end */
	jnz	1b			/* not yet, spin */
	call	_get_ipilock		/* safe exit from IPI */
	ret

	.data
	ALIGN_DATA

	.globl _rendezvousCount
_rendezvousCount:
	.long	0

	.globl _rendezvousEnd
_rendezvousEnd:
	.long	0
----------------------------------- cut -------------------------------------

--
Steve Passe	| powered by
smp@csn.net	| FreeBSD

From owner-freebsd-smp Sat Dec 14 14:02:12 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id OAA18389 for smp-outgoing; Sat, 14 Dec 1996 14:02:12 -0800 (PST) Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id OAA18374; Sat, 14 Dec 1996 14:02:06 -0800 (PST) Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id OAA22308; Sat, 14 Dec 1996 14:38:54 -0700 From: Terry Lambert Message-Id: <199612142138.OAA22308@phaeton.artisoft.com> Subject: Re: some questions concerning TLB shootdowns in FreeBSD To: toor@dyson.iquest.net (John S. Dyson) Date: Sat, 14 Dec 1996 14:38:54 -0700 (MST) Cc: terry@lambert.org, phk@critter.tfs.com, dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org, haertel@ichips.intel.com In-Reply-To: <199612142154.QAA05435@dyson.iquest.net> from "John S. Dyson" at Dec 14, 96 04:54:09 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > > This would let us do device/offset as well as vnode/offset based caching > > (for instance, hanging all cache buffers for vnodes on a device off the > > device vnode). > > > > I've wanted this for some time, since I am determined that vclean must > > die...
> > Slightly off subject, but I plan to sometime carry the vnode/offset > caching to a more generalized scheme that also encompasses device/offset > caching. Specifically, device/offset is the same as vnode/offset. > > This will allow us to cache data without the vnode. However, we will > continue to have the advantages of the current vnode/offset scheme. This is one of the reasons for murdering vclean: so you can get a cache hit on perfectly good data which is in memory, but for which the vnode has been reused, freed, destroyed, or whatever. Without the vnode, the perfectly good data can not get a cache hit... it has to be loaded in from disk again (potentially tromping other perfectly good data that is also in cache, but is older than the perfectly good data we can no longer reference -- bletch). Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. From owner-freebsd-smp Sat Dec 14 14:22:32 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id OAA20923 for smp-outgoing; Sat, 14 Dec 1996 14:22:32 -0800 (PST) Received: from dyson.iquest.net (dyson.iquest.net [198.70.144.127]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id OAA20914; Sat, 14 Dec 1996 14:22:27 -0800 (PST) Received: (from root@localhost) by dyson.iquest.net (8.8.2/8.6.9) id RAA05499; Sat, 14 Dec 1996 17:22:22 -0500 (EST) From: "John S. 
Dyson" Message-Id: <199612142222.RAA05499@dyson.iquest.net> Subject: Re: some questions concerning TLB shootdowns in FreeBSD To: phk@critter.tfs.com (Poul-Henning Kamp) Date: Sat, 14 Dec 1996 17:22:22 -0500 (EST) Cc: dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org, haertel@ichips.intel.com In-Reply-To: <5557.850593988@critter.tfs.com> from "Poul-Henning Kamp" at Dec 14, 96 09:06:28 pm X-Mailer: ELM [version 2.4 PL24 ME8] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > > The crucial thing, as far as I can see, is to find out >if< we need to > tell the other CPU's about this change to the pagetables. For a 2cpu > system the penalty of stopping the other CPU is still within bounds > of the reasonable, but stopping three CPUs needlessly is not a good idea. > > Is there any cheap way to keep a refcount (or bitmap) per vm-object so > we can see if we need to kick the other CPUs if we fiddle it ? > That would be tricky if we can freely reschedule processes on other cpu's... It would entail traversing the map for the process when the process is scheduled. Normally, there is also no notification when a page table entry is fetched into the TLB. Such notification can be arranged on the advanced X86 processors, but it doesn't appear to be a guaranteed type thing. How about just making the inter-processor interrupt efficient? We can probably also redo some of the vm/pmap interface to have larger grained pmap update operations. I suggest that, in the short term, the code be made algorithmically correct with the stop-processor suggestion made earlier. Later on, we can improve on the algorithmically correct (but slightly slower) code, and do the things to the vm/pmap interface to make things much more efficient.
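As an illustration of the "larger grained pmap update operations" idea above, here is a minimal C sketch. It is not taken from the FreeBSD sources; every name in it (inval_batch, batch_add, batch_flush, remove_range, shootdown_all_cpus) is hypothetical, and the "IPI" is only a counter so that the cost can be observed. The point is that a loop touching many pages records them and pays for one broadcast at the end, instead of one inter-processor interrupt per page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical stand-in for an IPI broadcast; here it only counts calls. */
static unsigned ipi_broadcasts;

static void shootdown_all_cpus(void)
{
    ipi_broadcasts++;
}

/* Accumulates the range of virtual addresses touched by one pmap update. */
struct inval_batch {
    uintptr_t start;  /* lowest page touched (inclusive) */
    uintptr_t end;    /* one past the highest page touched */
    int dirty;        /* non-zero once any page has been recorded */
};

static void batch_init(struct inval_batch *b)
{
    assert(b != NULL);
    b->start = 0;
    b->end = 0;
    b->dirty = 0;
}

/* Record one changed page instead of invalidating it immediately. */
static void batch_add(struct inval_batch *b, uintptr_t va)
{
    uintptr_t page = va & ~(uintptr_t)(PAGE_SIZE - 1);

    if (!b->dirty) {
        b->start = page;
        b->end = page + PAGE_SIZE;
        b->dirty = 1;
        return;
    }
    if (page < b->start)
        b->start = page;
    if (page + PAGE_SIZE > b->end)
        b->end = page + PAGE_SIZE;
}

/* One broadcast covers the whole batch; an empty batch sends nothing. */
static void batch_flush(struct inval_batch *b)
{
    if (b->dirty)
        shootdown_all_cpus();
    batch_init(b);
}

/* Update n consecutive pages starting at va, paying for one shootdown. */
static void remove_range(uintptr_t va, size_t n)
{
    struct inval_batch b;

    batch_init(&b);
    for (size_t i = 0; i < n; i++)
        batch_add(&b, va + i * PAGE_SIZE);
    batch_flush(&b);
}
```

With this shape, remove_range() over 100 pages bumps the counter once rather than 100 times; a real implementation would still have to decide whether a ranged or a full flush is cheaper on the receiving CPUs.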
John dyson@freebsd.org From owner-freebsd-smp Sat Dec 14 14:27:22 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id OAA21484 for smp-outgoing; Sat, 14 Dec 1996 14:27:22 -0800 (PST) Received: from dyson.iquest.net (dyson.iquest.net [198.70.144.127]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id OAA21471; Sat, 14 Dec 1996 14:27:17 -0800 (PST) Received: (from root@localhost) by dyson.iquest.net (8.8.2/8.6.9) id RAA05515; Sat, 14 Dec 1996 17:25:55 -0500 (EST) From: "John S. Dyson" Message-Id: <199612142225.RAA05515@dyson.iquest.net> Subject: Re: some questions concerning TLB shootdowns in FreeBSD To: terry@lambert.org (Terry Lambert) Date: Sat, 14 Dec 1996 17:25:55 -0500 (EST) Cc: toor@dyson.iquest.net, terry@lambert.org, phk@critter.tfs.com, dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org, haertel@ichips.intel.com In-Reply-To: <199612142138.OAA22308@phaeton.artisoft.com> from "Terry Lambert" at Dec 14, 96 02:38:54 pm X-Mailer: ELM [version 2.4 PL24 ME8] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > > > > Slightly off subject, but I plan to sometime carry the vnode/offset > > caching to a more generalized scheme that also encompasses device/offset > > caching. Specifically, device/offset is the same as vnode/offset. > > > > This will allow us to cache data without the vnode. However, we will > > continue to have the advantages of the current vnode/offset scheme. > > This is one of the reasons for murdering vclean: so you can get a cache > hit on perfectly good data which is in memory, but for which the vnode > has been reused, freed, destroyed, or whatever. Without the vnode, the > perfectly good data can not get a cache hit... 
it has to be loaded in > from disk again (potentially tromping other perfectly good data that > is also in cache, but is older than the perfectly good data we can no > longer reference -- bletch). > > The ONLY reason that it hasn't been done, is (my) time limitations. Other things scream louder -- and the "nice" things get left by the wayside. For example, today I am working on the merge of the Lite/2 stuff (finally). After the merge, and the commits, I expect that there will be at least a few days of instability, and guess what I get to do (answer: read frantic requests for help, look at core dumps, and generally feel bad about messing up the tree.) :-). John dyson@freebsd.org From owner-freebsd-smp Sat Dec 14 15:08:48 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id PAA26319 for smp-outgoing; Sat, 14 Dec 1996 15:08:48 -0800 (PST) Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id PAA26289; Sat, 14 Dec 1996 15:08:41 -0800 (PST) Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id PAA22458; Sat, 14 Dec 1996 15:45:28 -0700 From: Terry Lambert Message-Id: <199612142245.PAA22458@phaeton.artisoft.com> Subject: Re: some questions concerning TLB shootdowns in FreeBSD To: toor@dyson.iquest.net (John S. Dyson) Date: Sat, 14 Dec 1996 15:45:28 -0700 (MST) Cc: phk@critter.tfs.com, dyson@freebsd.org, peter@spinner.dialix.com, smp@freebsd.org, haertel@ichips.intel.com In-Reply-To: <199612142222.RAA05499@dyson.iquest.net> from "John S. Dyson" at Dec 14, 96 05:22:22 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > > Is there any cheap way to keep a refcount (or bitmap) per vm-object so > > we can see if we need to kick the other CPUs if we fiddle it ? 
> > That would be tricky if we can freely reschedule processes on other > cpu's... It would entail traversing the map for the process when > the process is scheduled. Normally, there is also no notification > when a page table entry is fetched into the TLB. Such notification > can be arranged on the advanced X86 processors, but it doesn't > appear to be a guaranteed type thing. You could simplify this a lot by preferential scheduling. You could also keep a bitmap of the virtual address space and examine only those areas where a bitmap collides... Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. From owner-freebsd-smp Sat Dec 14 17:22:09 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id RAA10071 for smp-outgoing; Sat, 14 Dec 1996 17:22:09 -0800 (PST) Received: from spinner.DIALix.COM (root@spinner.DIALix.COM [192.203.228.67]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id RAA10063; Sat, 14 Dec 1996 17:22:02 -0800 (PST) Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1]) by spinner.DIALix.COM (8.8.4/8.8.4) with ESMTP id JAA12763; Sun, 15 Dec 1996 09:21:56 +0800 (WST) Message-Id: <199612150121.JAA12763@spinner.DIALix.COM> To: dyson@freebsd.org cc: smp@freebsd.org, haertel@ichips.intel.com Subject: Re: some questions concerning TLB shootdowns in FreeBSD In-reply-to: Your message of "Sat, 14 Dec 1996 13:38:47 EST." <199612141838.NAA00208@dyson.iquest.net> Date: Sun, 15 Dec 1996 09:21:55 +0800 From: Peter Wemm Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk "John S. Dyson" wrote: > > > > I'm still digesting it, I am almost worried that we might (shudder!) > > be forced into doing an IPI to stop all the cpu's *before* the > > current cpu changes the page tables, then letting them do the tlb > > flush and letting them proceed. If this actually is a real problem > > this means a much bigger code impact. 
> > > The way that I see it, is that the current pmap code is highly optimized > for single processor operation. If I was you, I would try to just > try to get something working correctly algorithmically -- almost ignoring > performance issues. Of course, when performance is easy -- go for that > also. > > Alot of things like single page invalidates inside of loops appear that > they could be evil for multi-processor applications (imagine an inter- > processor interrupt for every loop!?!?.) I think that you (we or us), > will have to look at the performance for the SMP direction, and it > might even entail large differences in pmap eventually. Hopefully, > we will all be able to isolate the differences for the maintenance of > sanity :-). > > John Originally, I wondered if the CMAP/CADDR and APTD stuff might need to be per-cpu but couldn't think of a good reason given our presently 99.8% non-reentrant kernel (the IPI code is reentrant). Perhaps this is one of them... I don't recall how much code walks through the page tables and how much uses CADDR/APTD. When dealing with the user space of the currently active process context, remote TLB locking/flushing is not needed as long as other cpu's cannot get to the same space via their APTD (which should be valid as long as we have a global lock) for the high level stuff. However, the shared address space code that I was working on in -current (for kernel assisted threading in the smp kernel) means that a single vmspace/pmap/etc can be shared among multiple processes and this changes the above picture since two cpu's can be using the user mode parts of the same page tables at once, one executing in user mode, one in the kernel.
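To make the shared-address-space case concrete, here is a minimal C11 sketch of the per-pmap bitmap idea raised earlier in the thread. It is not FreeBSD code: struct pmap_sketch and the three functions are hypothetical names, and the sketch assumes each CPU sets its bit when it loads the pmap and clears it when it switches away. It deliberately ignores kernel mappings, which every CPU shares, and any lazy context switching.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_CPUS 32

/* Hypothetical per-address-space record of which CPUs have it loaded. */
struct pmap_sketch {
    atomic_uint_fast32_t active;  /* bit n set: CPU n is running on this pmap */
};

/* Called when a CPU switches to a process that uses this pmap. */
static void pmap_activate(struct pmap_sketch *pm, unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    atomic_fetch_or(&pm->active, (uint_fast32_t)1 << cpu);
}

/* Called when a CPU switches away from this pmap. */
static void pmap_deactivate(struct pmap_sketch *pm, unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    atomic_fetch_and(&pm->active, ~((uint_fast32_t)1 << cpu));
}

/*
 * Bitmap of the CPUs, other than the caller, that would need an IPI
 * after the caller changes a user mapping in this pmap.
 */
static uint_fast32_t pmap_shootdown_targets(struct pmap_sketch *pm,
                                            unsigned self)
{
    assert(self < MAX_CPUS);
    return atomic_load(&pm->active) & ~((uint_fast32_t)1 << self);
}
```

When only the modifying CPU has the pmap loaded, the result is 0 and no other CPU has to be stopped; once a second CPU shares the same page tables, its bit shows up in the result.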
Cheers, -Peter From owner-freebsd-smp Sat Dec 14 17:36:10 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id RAA11229 for smp-outgoing; Sat, 14 Dec 1996 17:36:10 -0800 (PST) Received: from spinner.DIALix.COM (root@spinner.DIALix.COM [192.203.228.67]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id RAA11198; Sat, 14 Dec 1996 17:35:56 -0800 (PST) Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1]) by spinner.DIALix.COM (8.8.4/8.8.4) with ESMTP id JAA13241; Sun, 15 Dec 1996 09:35:31 +0800 (WST) Message-Id: <199612150135.JAA13241@spinner.DIALix.COM> To: Poul-Henning Kamp cc: dyson@freebsd.org, smp@freebsd.org, haertel@ichips.intel.com Subject: Re: some questions concerning TLB shootdowns in FreeBSD In-reply-to: Your message of "Sat, 14 Dec 1996 21:06:28 +0100." <5557.850593988@critter.tfs.com> Date: Sun, 15 Dec 1996 09:35:30 +0800 From: Peter Wemm Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Poul-Henning Kamp wrote: > In message <199612141838.NAA00208@dyson.iquest.net>, "John S. Dyson" writes: > >The way that I see it, is that the current pmap code is highly optimized > >for single processor operation. If I was you, I would try to just > >try to get something working correctly algorithmically -- almost ignoring > >performance issues. Of course, when performance is easy -- go for that > >also. > > > >Alot of things like single page invalidates inside of loops appear that > >they could be evil for multi-processor applications (imagine an inter- > >processor interrupt for every loop!?!?.) I think that you (we or us), > >will have to look at the performance for the SMP direction, and it > >might even entail large differences in pmap eventually. Hopefully, > >we will all be able to isolate the differences for the maintenance of > >sanity :-). > > The crucial thing, as far as I can see, is to find out >if< we need to > tell the other CPU's about this change to the pagetables. 
For a 2cpu > system the penalty of stopping the other CPU is still within bounds > of the reasonable, but stopping three CPUs needlessly is not a good idea. Yes.. Also, there seem to be cases where the cpu needs to invalidate on entry to the kernel, but does not need to be kicked via an IPI. eg: if we change the kernel page tables, other cpu's running user code at the time do not need to flush until they actually try to enter the kernel. We should replace the existing simplistic code with a group of bitmaps that are accessed via atomic bit-set/clear and bit-test-and-set/clear so that we can synchronise deferred TLB flushes and callins for common PTE's. Cheers, -Peter From owner-freebsd-smp Sat Dec 14 17:36:44 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id RAA11264 for smp-outgoing; Sat, 14 Dec 1996 17:36:44 -0800 (PST) Received: from avatar.avatar.com (avatar.avatar.com [199.33.206.17]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id RAA11259 for ; Sat, 14 Dec 1996 17:36:41 -0800 (PST) Received: from avatar.avatar.com (kory@avatar.avatar.com [199.33.206.17]) by avatar.avatar.com (8.7.4/8.6.9) with SMTP id RAA23607 for ; Sat, 14 Dec 1996 17:36:09 -0800 (PST) Date: Sat, 14 Dec 1996 17:36:07 -0800 (PST) From: Kory Hamzeh To: freebsd-smp@freebsd.org Subject: SMP Status Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Is there anywhere on the freebsd web site where I can get a status of the SMP project? I am putting together a fairly high end Pentium Pro machine right now and I would like to purchase a motherboard that would be compatible with the freebsd SMP support. Thanks, Kory
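Returning to Peter Wemm's suggestion in the message above, that CPUs running user code can defer a kernel TLB flush until they next enter the kernel, here is a minimal C11 sketch of such a pending-flush bitmap. It is an illustration under stated assumptions, not FreeBSD code: tlb_flush_pending, kernel_pte_changed, kernel_entry and local_tlb_flush are hypothetical names, the flush itself is replaced by a counter, and the caller is assumed to supply a snapshot of which CPUs are in user mode. The sketch ignores the race in which a CPU enters the kernel between that snapshot and its pending bit being set.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_CPUS 32

/* Bit n set: CPU n must flush its TLB the next time it enters the kernel. */
static atomic_uint_fast32_t tlb_flush_pending;

/* Stand-in for a real local flush; counts flushes per CPU. */
static unsigned local_flushes[MAX_CPUS];

static void local_tlb_flush(unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    local_flushes[cpu]++;
}

/*
 * Called by the CPU that changed kernel page tables.  CPUs currently in
 * user mode get a deferred flush; the return value is the bitmap of CPUs
 * that are in the kernel right now and still need an immediate IPI.
 */
static uint_fast32_t kernel_pte_changed(unsigned self, uint_fast32_t all_cpus,
                                        uint_fast32_t in_user)
{
    assert(self < MAX_CPUS);
    uint_fast32_t others = all_cpus & ~((uint_fast32_t)1 << self);

    atomic_fetch_or(&tlb_flush_pending, others & in_user);  /* atomic bit-set */
    return others & ~in_user;
}

/* Called on every kernel entry: atomic test-and-clear of our own bit. */
static void kernel_entry(unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    uint_fast32_t bit = (uint_fast32_t)1 << cpu;
    uint_fast32_t old = atomic_fetch_and(&tlb_flush_pending, ~bit);

    if (old & bit)
        local_tlb_flush(cpu);
}
```

On a four-CPU system where CPU 0 changes a kernel mapping while CPUs 2 and 3 are in user mode, only CPU 1 is returned as needing an IPI; CPUs 2 and 3 each flush once, on their next kernel entry.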