From owner-freebsd-smp Tue Apr 1 14:37:09 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id OAA08263 for smp-outgoing; Tue, 1 Apr 1997 14:37:09 -0800 (PST) Received: from cobber.cord.edu (cobber.cord.edu [138.129.1.32]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id OAA08255 for ; Tue, 1 Apr 1997 14:37:06 -0800 (PST) Received: by cobber.cord.edu (4.1/SMI-4.1) id AA14356; Tue, 1 Apr 97 16:30:40 CST Date: Tue, 1 Apr 97 16:30:40 CST From: mestery@cobber.cord.edu (Kyle Mestery) Message-Id: <9704012230.AA14356@cobber.cord.edu> To: freebsd-smp@freebsd.org Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk subscribe mestery@cobber.cord.edu From owner-freebsd-smp Tue Apr 1 15:37:32 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id PAA13807 for smp-outgoing; Tue, 1 Apr 1997 15:37:32 -0800 (PST) Received: from mitra.pgt.mpt.gov.br (mitra.pgt.mpt.gov.br [200.130.0.1]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id PAA13799 for ; Tue, 1 Apr 1997 15:37:27 -0800 (PST) Received: from support.pgt.mpt.gov.br (support.pgt.mpt.gov.br [200.130.0.2]) by mitra.pgt.mpt.gov.br (8.7.6/8.7.3) with SMTP id UAA06213 for ; Tue, 1 Apr 1997 20:43:16 -0300 (EST) Received: by support.pgt.mpt.gov.br with Microsoft Mail id <01BC3EDC.C3C499A0@support.pgt.mpt.gov.br>; Tue, 1 Apr 1997 20:39:15 -0300 Message-ID: <01BC3EDC.C3C499A0@support.pgt.mpt.gov.br> From: Lucas Cotta To: "'freebsd-smp@freebsd.org'" Date: Tue, 1 Apr 1997 20:39:03 -0300 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk subscribe From owner-freebsd-smp Wed Apr 2 19:02:01 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id TAA21437 for smp-outgoing; Wed, 2 Apr 1997 19:02:01 -0800 (PST) Received: from news.quick.net (donegan@news.quick.net [207.212.170.1]) by freefall.freebsd.org 
(8.8.5/8.8.5) with ESMTP id TAA21425 for ; Wed, 2 Apr 1997 19:01:56 -0800 (PST)
Received: (from donegan@localhost) by news.quick.net (8.8.5/8.6.9) id TAA15298; Wed, 2 Apr 1997 19:01:44 -0800 (PST)
Date: Wed, 2 Apr 1997 19:01:42 -0800 (PST)
From: "Steven P. Donegan"
To: smp@freebsd.org
Subject: Bugs?
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Since I applied the last patch the system has been rock solid. It's used as my primary X terminal, multiple compile engine, and flogged daily. What 'bugs' still remain that need attention? I'd love to contribute a fix to something no matter how minor :-)

Cheers...

BTW - just to stress the thingie - am compiling in bpfilter and will monitor the local (very busy) net promiscuously just to hammer things... Have already hammered/anvil'd via ping -f -s 1500 with local unloaded hosts, can't make the system drop a packet :-( Really :-)

Cheers...

BTW - are there any user-land tools to monitor the performance of the 'cluster' ie like monitor cluster under VMS (gag) - something that shows the multi-cpu environment? I don't seem to have a top-type thingie or anything more sophisticated.

Compile is done - Bonzai! :-)

Steven P.
Donegan donegan@quick.net

From owner-freebsd-smp Thu Apr 3 21:14:59 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id VAA16610 for smp-outgoing; Thu, 3 Apr 1997 21:14:59 -0800 (PST)
Received: from lamb.sas.com (daemon@lamb.sas.com [192.35.83.8]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id VAA16605 for ; Thu, 3 Apr 1997 21:14:57 -0800 (PST)
Received: from mozart by lamb.sas.com (5.65c/SAS/Gateway/01-23-95) id AA07466; Fri, 4 Apr 1997 00:14:55 -0500
Received: from iluvatar.unx.sas.com by mozart (5.65c/SAS/Domains/5-6-90) id AA26651; Fri, 4 Apr 1997 00:14:48 -0500
Received: by iluvatar.unx.sas.com (5.65c/SAS/Generic 9.01/3-26-93) id AA07258; Fri, 4 Apr 1997 00:14:47 -0500
From: "John W. DeBoskey"
Message-Id: <199704040514.AA07258@iluvatar.unx.sas.com>
Subject: kernel for Dell 6100 4way
To: freebsd-smp@freebsd.org
Date: Fri, 4 Apr 1997 00:14:47 -0500 (EST)
X-Mailer: ELM [version 2.4 PL23]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hello,

Could someone make a NCPU=4 kernel available for ftp? I'm having trouble grabbing the correct code (via cvsup) due to the firewall that I must work behind (socks trouble). The machine is a Dell 6100 with 4 200MHz PPro processors.

Thanks,
John

ps1: Could someone who uses socks send me a sample of their config file? I'd appreciate it.

ps2: Dell ships a Symbios-based RAID subsystem on these machines. Does fbsd have any support for this in any form?

--
jwd@unx.sas.com (w) John W.
De Boskey (919) 677-8000 x6915

From owner-freebsd-smp Fri Apr 4 03:04:23 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id DAA01690 for smp-outgoing; Fri, 4 Apr 1997 03:04:23 -0800 (PST)
Received: from caleche.kecl.ntt.co.jp (elysium.kecl.ntt.co.jp [129.60.192.193]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id DAA01681 for ; Fri, 4 Apr 1997 03:04:10 -0800 (PST)
Received: from localhost by caleche.kecl.ntt.co.jp (8.8.5/kecl2.0/r8v7-M2-nishio) with ESMTP id UAA12786; Fri, 4 Apr 1997 20:03:23 +0900 (JST)
To: freebsd-smp@freebsd.org
Subject: APIC_IO problem on Tyan S1668
Mime-Version: 1.0
Content-Type: Multipart/Mixed; boundary="--Next_Part(Fri_Apr__4_19:25:45_1997)--"
Content-Transfer-Encoding: 7bit
X-Dispatcher: impost version 0.95+ (Nov. 26, 1996)
Lines: 461
X-Mailer: Mew version 1.54 on Emacs 19.34.1, Mule 2.3
Message-Id: <19970404200322X.nishio@elysium.kecl.ntt.co.jp>
Date: Fri, 04 Apr 1997 20:03:22 +0900
From: NISHIO Shuichi
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

----Next_Part(Fri_Apr__4_19:25:45_1997)--
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Hello,

Today I tried installing the SMP kernel on my machine, which has a Tyan S1668 ATX motherboard and two 200MHz Pentium Pro processors. The kernel built without APIC_IO and SMP_INVLTLB seems to be working fine, with both 1 and 2 CPUs (it's been up for about hours, and I'm writing this mail on it).

However, when I tried the kernel with APIC_IO and SMP_INVLTLB, the following problems occurred:

(a) one (out of two) ethernet interface won't work

on booting, the kernel says
> de0: transmission timeout
and this interface (10BaseT) won't even respond to pings, although ifconfig says it's up. The other interface, de1 (100BaseT), seemed to be working.

(b) system freezes on NFS

I tried using NFS on the working interface de1 (100BaseTX), but while copying files, the system suddenly froze, and I had to do a hardware reset.
What I did:
(1) Installed 3.0-970209-SNAP (worked fine)
(2) cvsup-ed the SMP kernel source
(3) did "cvs update -Pd -D '02/10/97 00:00:00 GMT'"
(4) applied the recent patch to exception.s
    (from <199703281714.KAA25923@Ilsa.StevesCafe.com>)
(5) compiled the kernel, with options from mptable output

So, my questions are:
(a) Am I missing something in creating the kernel?
(b) Do I need to recompile everything with the SMP kernel headers?
    At least, dmesg didn't work, with the message
    kvm_read: Bad address

Attached below are
(1) extracts from /var/log/messages, while booting the kernel
    without APIC_IO (dmesg didn't work)
(2) difference in messages for kernels with and without APIC_IO
(3) output of "mptable -verbose -dmesg"
(4) kernel configuration file I used

My machine contains
Tyan S1668 ATX Motherboard
200MHz Pentium Pro (256K cache) x 2
256MB memory
Adaptec AHA-2940 x1
Adaptec AHA-2940U x1
Adaptec AHA-2940UW x1
DEC Fast EtherWORKS PCI 10/100 x 2
(one connected to 10BaseT Hub, another to 100BaseT Hub)
ISA video card
SCSI HDD x 6
DAT drive x 1

I am not using the X Window System on this machine, nor am I using a mouse.

Thank you for any help,

Nishio Shuichi

----Next_Part(Fri_Apr__4_19:25:45_1997)--
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Copyright (c) 1992-1996 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
FreeBSD 3.0-SMP #0: Fri Apr 4 12:41:24 JST 1997
root@elysium.kecl.ntt.co.jp:/mnt2/cvs/sys-MP/compile/SMP
FreeBSD/SMP: Multiprocessor motherboard
cpu0 (BSP): apic id: 1, version: 0x00040011
cpu1 (AP): apic id: 0, version: 0x00040011
Warning: APIC I/O disabled
Calibrating clock(s) relative to mc146818A clock ...
i8254 clock: 1193054 Hz CPU: Pentium Pro (686-class CPU) Origin = "GenuineIntel" Id = 0x619 Stepping=9 Features=0xfbff,MTRR,PGE,MCA,CMOV> real memory = 268435456 (262144K bytes) avail memory = 261296128 (255172K bytes) Probing for devices on PCI bus 0: chip0 rev 2 on pci0:0 chip1 rev 1 on pci0:7:0 chip2 rev 0 on pci0:7:1 de0 rev 32 int a irq 15 on pci0:10 de0: DE500-AA 21140A [10-100Mb/s] pass 2.0 de0: address 00:00:f8:03:ef:59 de1 rev 32 int a irq 9 on pci0:11 de1: DE500-AA 21140A [10-100Mb/s] pass 2.0 de1: address 00:00:f8:03:e9:ec ahc0 rev 0 int a irq 10 on pci0:12 ahc0: aic7870 Single Channel, SCSI Id=7, 16 SCBs ahc0 waiting for scsi devices to settle (ahc0:0:0): "SEAGATE ST31230N 0250" type 0 fixed SCSI 2 sd0(ahc0:0:0): Direct-Access 1010MB (2069860 512 byte sectors) (ahc0:1:0): "SEAGATE ST31230N 0250" type 0 fixed SCSI 2 sd1(ahc0:1:0): Direct-Access 1010MB (2069860 512 byte sectors) (ahc0:2:0): "SEAGATE ST410800N 0019" type 0 fixed SCSI 2 sd2(ahc0:2:0): Direct-Access 8669MB (17755614 512 byte sectors) (ahc0:4:0): "HP C1533A 9503" type 1 removable SCSI 2 st0(ahc0:4:0): Sequential-Access density code 0x24, drive empty ahc1 rev 0 int a irq 11 on pci0:13 ahc1: aic7880 Single Channel, SCSI Id=7, 16 SCBs ahc1 waiting for scsi devices to settle (ahc1:5:0): "Quantum XP34300 81HB" type 0 fixed SCSI 2 sd3(ahc1:5:0): Direct-Access 4101MB (8399520 512 byte sectors) (ahc1:6:0): "Quantum XP34300 81HB" type 0 fixed SCSI 2 sd4(ahc1:6:0): Direct-Access 4101MB (8399520 512 byte sectors) ahc2 rev 0 int a irq 15 on pci0:14 ahc2: aic7880 Wide Channel, SCSI Id=7, 16 SCBs ahc2 waiting for scsi devices to settle (ahc2:4:0): "QUANTUM XP39100S LXY4" type 0 fixed SCSI 2 sd5(ahc2:4:0): Direct-Access 8682MB (17781520 512 byte sectors) Probing for devices on the ISA bus: sc0 at 0x60-0x6f irq 1 on motherboard sc0: VGA color <16 virtual consoles, flags=0x0> sio0 at 0x3f8-0x3ff irq 4 on isa sio0: type 16550A sio1 at 0x2f8-0x2ff irq 3 on isa sio1: type 16550A fdc0 at 0x3f0-0x3f7 irq 6 drq 2 
on isa fdc0: NEC 72065B fd0: 1.44MB 3.5in npx0 on motherboard npx0: INT 16 interface ccd0-3: Concatenated disk drivers SMP: All idle procs online. de1: link up: enabling 100baseTX port de0: link up: enabling 10baseT port SMP: Starting 1st AP! SMP: AP CPU #1 LAUNCHED!! Starting Scheduling... SMP: TADA! CPU #1 made it into the scheduler!. SMP: All 2 CPU's are online! ----Next_Part(Fri_Apr__4_19:25:45_1997)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit 5,6c5,6 < FreeBSD 3.0-SMP #0: Fri Apr 4 12:41:24 JST 1997 < root@elysium.kecl.ntt.co.jp:/mnt2/cvs/sys-MP/compile/SMP --- > FreeBSD 3.0-SMP #0: Fri Apr 4 13:35:28 JST 1997 > nishio@elysium.kecl.ntt.co.jp:/mnt2/cvs/sys-MP/compile/SMP-APIC 10,11c10,11 < Warning: APIC I/O disabled < Calibrating clock(s) relative to mc146818A clock ... i8254 clock: 1193054 Hz --- > io0 (APIC): apic id: 2, version: 0x00170011 > Calibrating clock(s) relative to mc146818A clock ... i8254 clock: 1193049 Hz 16c16 < avail memory = 261296128 (255172K bytes) --- > avail memory = 261283840 (255160K bytes) 24c24,25 < de1 rev 32 int a irq 9 on pci0:11 --- > de1 rev 32 int a irq 19 on pci0:11 > Freeing (NOT implimented) irq 9 for ISA cards. 27c28,29 < ahc0 rev 0 int a irq 10 on pci0:12 --- > ahc0 rev 0 int a irq 18 on pci0:12 > Freeing (NOT implimented) irq 10 for ISA cards. 38c40,41 < ahc1 rev 0 int a irq 11 on pci0:13 --- > ahc1 rev 0 int a irq 17 on pci0:13 > Freeing (NOT implimented) irq 11 for ISA cards. 45c48,50 < ahc2 rev 0 int a irq 15 on pci0:14 --- > ahc2 rev 0 int a irq 16 on pci0:14 > Freeing (NOT implimented) irq 15 for ISA cards. 
> pcibus_ihandler_attach: counting pci irq16's as clk0 irqs 61a67 > Enabled INTs: 1, 2, 3, 4, 6, 8, 15, 16, 17, 18, 19, imen: 0x00f07ea1 64d69 < de1: link up: enabling 100baseTX port 65a71,72 > de1: link up: enabling 100baseTX port > de0: transmission timeout ----Next_Part(Fri_Apr__4_19:25:45_1997)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Description: "mptable output" =============================================================================== MPTable, version 2.0.6 looking for EBDA pointer @ 0x040e, NOT found searching CMOS 'top of mem' @ 0x0009fc00 (639K) searching BIOS @ 0x000f0000 MP FPS found in BIOS @ physical addr: 0x000f0920 ------------------------------------------------------------------------------- MP Floating Pointer Structure: location: BIOS physical address: 0x000f0920 signature: '_MP_' length: 16 bytes version: 1.1 checksum: 0x57 mode: Virtual Wire ------------------------------------------------------------------------------- MP Config Table Header: physical address: 0x000f0934 signature: 'PCMP' base table length: 292 version: 1.1 checksum: 0xab OEM ID: 'OEM00000' Product ID: 'PROD00000000' OEM table pointer: 0x00000000 OEM table size: 0 entry count: 28 local APIC address: 0xfee00000 extended table length: 0 extended table checksum: 0 ------------------------------------------------------------------------------- MP Config Base Table Entries: -- Processors: APIC ID Version State Family Model Step Flags 1 0x11 BSP, usable 6 1 9 0xfbff 0 0x11 AP, usable 6 1 9 0xfbff -- Bus: Bus ID Type 0 ISA 1 PCI -- I/O APICs: APIC ID Version State Address 2 0x11 usable 0xfec00000 -- I/O Ints: Type Polarity Trigger Bus ID IRQ APIC ID INT# ExtINT conforms conforms 0 0 2 0 INT conforms conforms 0 1 2 1 INT conforms conforms 0 0 2 2 INT conforms conforms 0 3 2 3 INT conforms conforms 0 4 2 4 INT conforms conforms 0 5 2 5 INT conforms conforms 0 6 2 6 INT conforms conforms 0 7 2 7 INT conforms conforms 0 8 2 8 INT conforms 
conforms 0 9 2 9 INT conforms conforms 0 10 2 10 INT conforms conforms 0 11 2 11 INT conforms conforms 0 12 2 12 INT conforms conforms 0 13 2 13 INT conforms conforms 0 14 2 14 INT conforms conforms 0 15 2 15 INT active-lo level 1 14:A 2 16 INT active-lo level 1 13:A 2 17 INT active-lo level 1 12:A 2 18 INT active-lo level 1 11:A 2 19 SMI conforms conforms 0 0 2 23 -- Local Ints: Type Polarity Trigger Bus ID IRQ APIC ID INT# ExtINT active-hi edge 0 0 255 0 NMI active-hi edge 0 0 255 1 ------------------------------------------------------------------------------- # SMP kernel config file options: options SMP # Symmetric MultiProcessor Kernel options APIC_IO # Symmetric (APIC) I/O options NCPU=2 # number of CPUs options NBUS=2 # number of busses options NAPIC=1 # number of IO APICs options NINTR=24 # number of INTs options SMP_INVLTLB # #options SMP_PRIVPAGES # BROKEN, DO NOT use! #options SMP_AUTOSTART # BROKEN, DO NOT use! #options SERIAL_DEBUG # com port debug output ------------------------------------------------------------------------------- dmesg output: dmesg: kvm_read: kvm_read: Bad address =============================================================================== ----Next_Part(Fri_Apr__4_19:25:45_1997)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Description: "kernel configuration (without APIC_IO)" # # GENERIC -- Generic machine with WD/AHx/NCR/BTx family disks # # For more information read the handbook part System Administration -> # Configuring the FreeBSD Kernel -> The Configuration File. # The handbook is available in /usr/share/doc/handbook or online as # latest version from the FreeBSD World Wide Web server # # # An exhaustive list of options and more detailed explanations of the # device lines is present in the ./LINT configuration file. If you are # in doubt as to the purpose or necessity of a line, check first in LINT. 
# # $Id: GENERIC,v 1.82 1996/12/21 02:09:04 se Exp $ machine "i386" #cpu "I386_CPU" #cpu "I486_CPU" #cpu "I586_CPU" cpu "I686_CPU" ident SMP maxusers 10 #options MATH_EMULATE #Support for x87 emulation options INET #InterNETworking options FFS #Berkeley Fast Filesystem options NFS #Network Filesystem options MSDOSFS #MSDOS Filesystem options "CD9660" #ISO 9660 Filesystem options PROCFS #Process filesystem options "COMPAT_43" #Compatible with BSD 4.3 [KEEP THIS!] options SCSI_DELAY=15 #Be pessimistic about Joe SCSI device #options BOUNCE_BUFFERS #include support for DMA bounce buffers options UCONSOLE #Allow users to grab the console options FAILSAFE #Be conservative #options USERCONFIG #boot -c editor #options VISUAL_USERCONFIG #visual boot -c editor config kernel root on sd1 controller isa0 controller eisa0 controller pci0 controller fdc0 at isa? port "IO_FD1" bio irq 6 drq 2 vector fdintr disk fd0 at fdc0 drive 0 # A single entry for any of these controllers (ncr, ahb, ahc, amd) is # sufficient for any number of installed devices. controller ahc0 controller scbus0 device sd0 #device od0 #See LINT for possible `od' options. device st0 device cd0 #Only need one of these, the code dynamically grows # syscons is the default console driver, resembling an SCO console device sc0 at isa? port "IO_KBD" tty irq 1 vector scintr # Enable this and PCVT_FREEBSD for pcvt vt220 compatible console driver #device vt0 at isa? port "IO_KBD" tty irq 1 vector pcrint #options PCVT_FREEBSD=210 # pcvt running on FreeBSD >= 2.0.5 #options XSERVER # include code for XFree86 #options FAT_CURSOR # start with block cursor # If you have a ThinkPAD, uncomment this along with the rest of the PCVT lines #options PCVT_SCANSET=2 # IBM keyboards are non-std # Mandatory, don't remove device npx0 at isa? port "IO_NPX" irq 13 vector npxintr device sio0 at isa? port "IO_COM1" tty irq 4 vector siointr device sio1 at isa? 
port "IO_COM2" tty irq 3 vector siointr # Order is important here due to intrusive probes, do *not* alphabetize # this list of network interfaces until the probes have been fixed. # Right now it appears that the ie0 must be probed before ep0. See # revision 1.20 of this file. device de0 pseudo-device loop pseudo-device ether pseudo-device log pseudo-device pty 16 pseudo-device gzip # Exec gzipped a.out's pseudo-device bpfilter 32 #Berkeley packet filter pseudo-device snp 3 #Snoop device - to look at pty/vty/etc.. pseudo-device ccd 4 #Concatenated disk driver # KTRACE enables the system-call tracing facility ktrace(2). # This adds 4 KB bloat to your kernel, and slightly increases # the costs of each syscall. options KTRACE #kernel tracing options "MAXMEM=(256*1024)" # SMP kernel config file options: options SMP # Symmetric MultiProcessor Kernel #options APIC_IO # Symmetric (APIC) I/O options NCPU=2 # number of CPUs options NBUS=2 # number of busses options NAPIC=1 # number of IO APICs options NINTR=24 # number of INTs #options SMP_INVLTLB # #options SMP_PRIVPAGES # BROKEN, DO NOT use! #options SMP_AUTOSTART # BROKEN, DO NOT use! 
#options SERIAL_DEBUG # com port debug output ----Next_Part(Fri_Apr__4_19:25:45_1997)---- From owner-freebsd-smp Fri Apr 4 04:35:16 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id EAA05644 for smp-outgoing; Fri, 4 Apr 1997 04:35:16 -0800 (PST) Received: from samthedog.datacard.com ([205.215.203.135]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id EAA05637 for ; Fri, 4 Apr 1997 04:35:12 -0800 (PST) Received: from localhost (dave@localhost) by samthedog.datacard.com (8.8.5/8.7.3) with SMTP id MAA00961; Fri, 4 Apr 1997 12:28:54 GMT X-Authentication-Warning: samthedog.datacard.com: dave owned process doing -bs Date: Fri, 4 Apr 1997 06:27:59 -0600 (CST) From: dave adkins X-Sender: adkin003@samthedog.datacard.com To: NISHIO Shuichi cc: freebsd-smp@freebsd.org Subject: Re: APIC_IO problem on Tyan S1668 In-Reply-To: <19970404200322X.nishio@elysium.kecl.ntt.co.jp> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk On Fri, 4 Apr 1997, NISHIO Shuichi wrote: > > > de0: transmission timeout > > and this interface (10BaseT) won't even respond to pings, although > ifconfig says it's up. The other interface, de1(100BaseT), seemed to > be working. > > > (b) system freezes on NFS > > I tried using NFS on the alive interface de1 (100BaseTX), but while > copying files, the system suddenly freezed, and I had to do a hardware > reset. > > > What I did is: > (1) Installed 3.0-970209-SNAP (worked fine) > (2) cvsup-ed the SMP kernel source > (3) did "cvs update -Pd -D '02/10/97 00:00:00 GMT" > (4) applied the recent patch to exception.s > (from <199703281714.KAA25923@Ilsa.StevesCafe.com>) > (5) compiled the kernel, with options from mptable output > > > So, my question is: > (a) Am I missing something in creating the kernel? > (b) Do I need to recompile everything with the SMP kernel headers? 
> At least, dmesg didn't work, with the message
> kvm_read: Bad address
>
>
> Attached below is
> (1) extracts from /var/log/messages, while booting the kernel
> without APIC_IO (dmesg didn't work)
> (2) difference in messages for kernels with and without APIC_IO
> (3) output of "mptable -verbose -dmesg"
> (4) kernel configuration file I used
>
>
> My machine contains
> Tyan S1668 ATX Motherboard
> 200MHz Pentium Pro (256K cache) x 2
> 256MB memory
> Adaptec AHA-2940 x1
> Adaptec AHA-2940U x1
> Adaptec AHA-2940UW x1
> DEC Fast EtherWORKS PCI 10/100 x 2
> (one connected to 10BaseT Hub, another to 100BaseT Hub)
> ISA video card
> SCSI HDD x 6
> DAT drive x 1
>
> I am not using X-window on this, nor am I using any mouse.
>
>
> Thank you for any help,
>
>
> Nishio Shuichi
>
>

I had similar problems with the AT version of the board until I upgraded the de and the aic drivers to current. The de driver in the SNAP has problems with 21140 cards such as the SMC933, which uses a 21140-AC. I experienced a series of AHA-2940 lockups under heavy load, both uniprocessor and multiprocessor; running multiprocessor, the lockups happened quite a bit faster than uniprocessor.

It looks like you don't have a 21140-AC, but the media select might be better in the newer driver. I do recommend upgrading the aic driver; it really improved the stability. Check the commits on the aic driver.
dave adkins

From owner-freebsd-smp Fri Apr 4 09:12:20 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA20153 for smp-outgoing; Fri, 4 Apr 1997 09:12:20 -0800 (PST)
Received: from cs.utah.edu (cs.utah.edu [128.110.4.21]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA20148 for ; Fri, 4 Apr 1997 09:12:18 -0800 (PST)
Received: from fast.cs.utah.edu by cs.utah.edu (8.8.4/utah-2.21-cs) id KAA01013; Fri, 4 Apr 1997 10:11:23 -0700 (MST)
Received: by fast.cs.utah.edu (8.6.10/utah-2.15-leaf) id KAA03841; Fri, 4 Apr 1997 10:11:22 -0700
Date: Fri, 4 Apr 1997 10:11:22 -0700
From: vanmaren@fast.cs.utah.edu (Kevin Van Maren)
Message-Id: <199704041711.KAA03841@fast.cs.utah.edu>
To: freebsd-smp@freebsd.org, nishio@caleche.kecl.ntt.co.jp
Subject: Re: APIC_IO problem on Tyan S1668
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>de0 rev 32 int a irq 15 on pci0:10
...
  INT   conforms   conforms   0   15     2   15
  INT   active-lo  level      1   14:A   2   16
  INT   active-lo  level      1   13:A   2   17
  INT   active-lo  level      1   12:A   2   18
  INT   active-lo  level      1   11:A   2   19
  SMI   conforms   conforms   0   0      2   23
=======

Notice that slot 10 isn't in the table.
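
One way to double-check this observation is to compare the PCI rows of the I/O Ints table with the device numbers the kernel probed. What follows is only a minimal Python sketch under stated assumptions: the rows and the probe results are copied by hand from the report earlier in this thread, and the names `parse_pci_rows` and `unrouted` are hypothetical helpers, not part of mptable or of any FreeBSD tool.

```python
# PCI rows (bus ID 1) copied from the "I/O Ints" section of the mptable
# output in the original report. Columns: type, polarity, trigger, bus ID,
# IRQ (PCI device:pin), APIC ID, APIC INT#.
PCI_INT_ROWS = """\
INT active-lo level 1 14:A 2 16
INT active-lo level 1 13:A 2 17
INT active-lo level 1 12:A 2 18
INT active-lo level 1 11:A 2 19
"""

# PCI device numbers taken from the boot messages (pci0:10 .. pci0:14).
# This mapping is typed in by hand for illustration.
PROBED_SLOTS = {"de0": 10, "de1": 11, "ahc0": 12, "ahc1": 13, "ahc2": 14}


def parse_pci_rows(text):
    """Return {pci_device_number: apic_int_pin} for each 7-field row."""
    routes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # skip anything that is not a complete table row
        device, _pin = fields[4].split(":")  # e.g. "14:A" -> "14", "A"
        routes[int(device)] = int(fields[6])
    return routes


def unrouted(probed, routes):
    """Names of probed devices whose PCI device number has no table row."""
    return sorted(name for name, slot in probed.items() if slot not in routes)


routes = parse_pci_rows(PCI_INT_ROWS)
missing = unrouted(PROBED_SLOTS, routes)
print(routes)
print(missing)  # -> ['de0']
```

With the data from the report, only de0 (pci0:10) comes out without an entry, which agrees with the note above.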
Kevin

From owner-freebsd-smp Fri Apr 4 11:10:14 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id LAA27222 for smp-outgoing; Fri, 4 Apr 1997 11:10:14 -0800 (PST)
Received: from Ilsa.StevesCafe.com (sc-gw.StevesCafe.com [205.168.119.191]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id LAA27212 for ; Fri, 4 Apr 1997 11:10:08 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by Ilsa.StevesCafe.com (8.7.5/8.6.12) with SMTP id MAA16658; Fri, 4 Apr 1997 12:09:55 -0700 (MST)
Message-Id: <199704041909.MAA16658@Ilsa.StevesCafe.com>
X-Authentication-Warning: Ilsa.StevesCafe.com: Host localhost [127.0.0.1] didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: NISHIO Shuichi
cc: freebsd-smp@freebsd.org
Subject: Re: APIC_IO problem on Tyan S1668
In-reply-to: Your message of "Fri, 04 Apr 1997 20:03:22 +0900." <19970404200322X.nishio@elysium.kecl.ntt.co.jp>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 04 Apr 1997 12:09:55 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

>(1) Installed 3.0-970209-SNAP (worked fine)
>(2) cvsup-ed the SMP kernel source
>(3) did "cvs update -Pd -D '02/10/97 00:00:00 GMT"

actually Peter recommends:

cvs -q update -Pd -D '02/09/97 00:00:00 GMT'

in reality I would recommend using the 3.0-970209-SNAP code without any cvs update. this is what I am running here.

---
>(4) applied the recent patch to exception.s
> (from <199703281714.KAA25923@Ilsa.StevesCafe.com>)

this shouldn't have applied cleanly, since it should have already been applied to the source you supped.

---
>(b) Do I need to recompile everything with the SMP kernel headers?
>At least, dmesg didn't work, with the message
> kvm_read: Bad address

you definitely have something out of sync here. I saw this same message when I tried running an older SMP kernel on the 3.0-970209-SNAP. until you can get rid of this, don't expect anything to be reliable.
---
> DEC Fast EtherWORKS PCI 10/100 x 2
> (one connected to 10BaseT Hub, another to 100BaseT Hub)

evidently a PCI-PCI bridge card

> de0 rev 32 int a irq 15 on pci0:10
> de1 rev 32 int a irq 9 on pci0:11

> de1 rev 32 int a irq 19 on pci0:11
> Freeing (NOT implimented) irq 9 for ISA cards.

> INT active-lo level 1 14:A 2 16
> INT active-lo level 1 13:A 2 17
> INT active-lo level 1 12:A 2 18
> INT active-lo level 1 11:A 2 19

as someone else already pointed out, the de0 half of the card is absent from the mptable. This is a common bug in the current generation of SMP boards; they don't have a clue about PCI-PCI bridging. once you get past the other problems I can whip up a bandaid that will allow the pci code to "see" this card.

so in summary:

revert to a 3.0-970209-SNAP system.
use the SMP source as is; no patches should be necessary.
eliminate the kvm error before proceeding.

--
Steve Passe | powered by
smp@csn.net | Symmetric MultiProcessor FreeBSD

From owner-freebsd-smp Fri Apr 4 12:27:23 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id MAA01659 for smp-outgoing; Fri, 4 Apr 1997 12:27:23 -0800 (PST)
Received: from cs.utah.edu (cs.utah.edu [128.110.4.21]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id MAA01654 for ; Fri, 4 Apr 1997 12:27:20 -0800 (PST)
Received: from fast.cs.utah.edu by cs.utah.edu (8.8.4/utah-2.21-cs) id NAA00730; Fri, 4 Apr 1997 13:26:43 -0700 (MST)
Received: by fast.cs.utah.edu (8.6.10/utah-2.15-leaf) id NAA20578; Fri, 4 Apr 1997 13:26:43 -0700
Date: Fri, 4 Apr 1997 13:26:43 -0700
From: vanmaren@fast.cs.utah.edu (Kevin Van Maren)
Message-Id: <199704042026.NAA20578@fast.cs.utah.edu>
To: nishio@caleche.kecl.ntt.co.jp, smp@csn.net
Subject: Re: APIC_IO problem on Tyan S1668
Cc: freebsd-smp@freebsd.org
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> kvm_read: Bad address

You probably already know this, but you can also get this if the binary isn't set-uid.
(xload lost its bit when I `upgraded' using sysinstall).

Kevin

From owner-freebsd-smp Fri Apr 4 13:24:20 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id NAA04724 for smp-outgoing; Fri, 4 Apr 1997 13:24:20 -0800 (PST)
Received: from cs.utah.edu (cs.utah.edu [128.110.4.21]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id NAA04717 for ; Fri, 4 Apr 1997 13:24:15 -0800 (PST)
Received: from fast.cs.utah.edu by cs.utah.edu (8.8.4/utah-2.21-cs) id OAA03058; Fri, 4 Apr 1997 14:24:06 -0700 (MST)
Received: by fast.cs.utah.edu (8.6.10/utah-2.15-leaf) id NAA20216; Fri, 4 Apr 1997 13:23:54 -0700
Date: Fri, 4 Apr 1997 13:23:54 -0700
From: vanmaren@fast.cs.utah.edu (Kevin Van Maren)
Message-Id: <199704042023.NAA20216@fast.cs.utah.edu>
To: nishio@caleche.kecl.ntt.co.jp, smp@csn.net
Subject: Re: APIC_IO problem on Tyan S1668
Cc: freebsd-smp@freebsd.org
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> > DEC Fast EtherWORKS PCI 10/100 x 2
> > (one connected to 10BaseT Hub, another to 100BaseT Hub)
>
> evidently a PCI-PCI bridge card

No. Read below.

>
> > de0 rev 32 int a irq 15 on pci0:10
> > de1 rev 32 int a irq 9 on pci0:11
>
> > de1 rev 32 int a irq 19 on pci0:11
> > Freeing (NOT implimented) irq 9 for ISA cards.
>
> > INT active-lo level 1 14:A 2 16
> > INT active-lo level 1 13:A 2 17
> > INT active-lo level 1 12:A 2 18
> > INT active-lo level 1 11:A 2 19
>
> as someone else already pointed out the de0 half of the card is absent
> from the mptable. This is a common bug in the current generation of
> SMP boards, they don't have a clue about PCI-PCI bridging. once
> you get past the other problems I can whip up a bandaid that will allow
> the pci code to "see" this card.
>

Nope! There are two cards in the 5-PCI slot MotherBoard (NO bridge chips). Note that they are both on PCI bus #0. The problem is that there are 5 PCI slots, 4 PCI/ISA redirection registers, and 4 MP table entries.
Yes, it is a problem with the Tyan MB that it doesn't have all the PCI slots in the table.

Kevin

From owner-freebsd-smp Fri Apr 4 13:56:21 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id NAA06157 for smp-outgoing; Fri, 4 Apr 1997 13:56:21 -0800 (PST)
Received: from Ilsa.StevesCafe.com (sc-gw.StevesCafe.com [205.168.119.191]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id NAA06150 for ; Fri, 4 Apr 1997 13:56:17 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by Ilsa.StevesCafe.com (8.7.5/8.6.12) with SMTP id OAA18505; Fri, 4 Apr 1997 14:55:52 -0700 (MST)
Message-Id: <199704042155.OAA18505@Ilsa.StevesCafe.com>
X-Authentication-Warning: Ilsa.StevesCafe.com: Host localhost [127.0.0.1] didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: vanmaren@fast.cs.utah.edu (Kevin Van Maren)
cc: nishio@caleche.kecl.ntt.co.jp, freebsd-smp@freebsd.org
Subject: Re: APIC_IO problem on Tyan S1668
In-reply-to: Your message of "Fri, 04 Apr 1997 13:23:54 MST." <199704042023.NAA20216@fast.cs.utah.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 04 Apr 1997 14:55:52 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> Nope! There are two cards in the 5-PCI slot MotherBoard (NO bridge chips).
> Note that they are both on PCI bus #0. The problem is that there are
> 5 PCI slots, 4 PCI/ISA redirection registers, and 4 MP table entries.
> Yes, it is a problem with the Tyan MB that it doesn't have all the PCI
> slots in the table.

I'll buy that. The PCI code is probably confused by a shared INT where one card is being handled via a lower INT and the other an upper APIC INT!
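
The split described above can be seen in the two boot logs from the original report. The following is a rough Python sketch, not a diagnosis: the IRQ numbers are typed in by hand from those logs (de0 does not appear in the diff, so it is assumed to stay on irq 15 in the APIC_IO kernel), and `split_shares` is a hypothetical helper written only for this illustration.

```python
from collections import defaultdict

# IRQ assignments copied from the boot messages in the original report.
# WITHOUT_APIC_IO comes from the first log; WITH_APIC_IO applies the changes
# shown in the diff (de0 is not in the diff, so it is assumed unchanged).
WITHOUT_APIC_IO = {"de0": 15, "de1": 9, "ahc0": 10, "ahc1": 11, "ahc2": 15}
WITH_APIC_IO = {"de0": 15, "de1": 19, "ahc0": 18, "ahc1": 17, "ahc2": 16}


def split_shares(before, after):
    """Find legacy IRQs shared by several devices where only some moved.

    Returns {legacy_irq: {"stayed": [...], "moved": [...]}}.
    """
    groups = defaultdict(list)
    for dev, irq in before.items():
        groups[irq].append(dev)

    result = {}
    for irq, devs in groups.items():
        if len(devs) < 2:
            continue  # not a shared interrupt
        moved = sorted(d for d in devs if after[d] != irq)
        stayed = sorted(d for d in devs if after[d] == irq)
        if moved and stayed:
            result[irq] = {"stayed": stayed, "moved": moved}
    return result


shares = split_shares(WITHOUT_APIC_IO, WITH_APIC_IO)
print(shares)
```

With these numbers the only split share is irq 15: ahc2 moves to APIC INT 16 while de0 stays on 15, which is the interface that reported the transmission timeout.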
The new Gigabyte dual 686DX has five slots and its mptable looks like:

  INT   active-lo  level   0   8:A    2   16
  INT   active-lo  level   0   9:A    2   17
  INT   active-lo  level   0   10:A   2   18
  INT   active-lo  level   0   11:A   2   19
  INT   active-lo  level   0   7:A    2   19
  INT   active-lo  level   0   12:A   2   16

why 6 entries I have no idea... this is with 2 PCI cards installed:

ahc0 rev 0 int a irq 16 on pci0:8:0
de0 rev 17 int a irq 19 on pci0:11:0

---
I still believe it can be fixed with a bandaid in the kernel. This fix unfortunately will be SLOT specific, i.e. once installed in the kernel, the SMC card will have to stay in that specific slot to work. I think the correct long-term solution is going to be kernel code that rebuilds the mptable after PCI discovery has finished, i.e. we will toss the mptable provided by the MB.

--
Steve Passe | powered by
smp@csn.net | Symmetric MultiProcessor FreeBSD

From owner-freebsd-smp Sat Apr 5 06:52:03 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id GAA11058 for smp-outgoing; Sat, 5 Apr 1997 06:52:03 -0800 (PST)
Received: from caleche.kecl.ntt.co.jp (elysium.kecl.ntt.co.jp [129.60.192.193]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id GAA11040 for ; Sat, 5 Apr 1997 06:51:59 -0800 (PST)
Received: from localhost by caleche.kecl.ntt.co.jp (8.8.5/kecl2.0/r8v7-M2-nishio) with ESMTP id XAA24829; Sat, 5 Apr 1997 23:49:36 +0900 (JST)
To: smp@csn.net
Cc: freebsd-smp@freebsd.org
Subject: Re: APIC_IO problem on Tyan S1668
In-Reply-To: Your message of "Fri, 04 Apr 1997 12:09:55 -0700"
References: <199704041909.MAA16658@Ilsa.StevesCafe.com>
Mime-Version: 1.0
X-Mailer: Mew version 1.54 on Emacs 19.34.1, Mule 2.3
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-Id: <19970405234935F.nishio@elysium.kecl.ntt.co.jp>
Date: Sat, 05 Apr 1997 23:49:35 +0900
From: NISHIO Shuichi
X-Dispatcher: impost version 0.95+ (Nov.
26, 1996)
Lines: 63
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hello, thank you for your reply.

From: Steve Passe
Subject: Re: APIC_IO problem on Tyan S1668
Date: Fri, 04 Apr 1997 12:09:55 -0700
Message-ID: <199704041909.MAA16658@Ilsa.StevesCafe.com>

> >(3) did "cvs update -Pd -D '02/10/97 00:00:00 GMT"
> actually Peter recommends:
> cvs -q update -Pd -D '02/09/97 00:00:00 GMT'

I'm sorry, it's a typo: what I actually did was for 02/09/97.

> in reality I would recommend using the 3.0-970209-SNAP code without
> any cvs update. this is what I am running here.

I have only updated the kernel: besides the kernel, I am using the 3.0-970209-SNAP binaries distributed by ftp. Should I recompile everything with the SMP kernel?

> >(4) applied the recent patch to exception.s
> > (from <199703281714.KAA25923@Ilsa.StevesCafe.com>)
> this shouldn't have applied cleanly, since it should have already been
> applied to the source you supped.

After doing

> cvs -q update -Pd -D '02/09/97 00:00:00 GMT'

that modification went away, so I added the following 3 lines manually.

> > pushl $0 /* dummy unit to finish building intr frame */
> > + #ifdef SMP
> > + call _get_mplock
> > + #endif /* SMP */
> > incl _cnt+V_TRAP

> >(b) Do I need to recompile everything with the SMP kernel headers?
> >At least, dmesg didn't work, with the message
> > kvm_read: Bad address
> you definitely have something out of sync here. I saw this same message
> when I tried running an older SMP kernel on the 3.0-970209-SNAP.
> till you can get rid of this don't expect anything to be reliable.

I don't know why, but dmesg produces output now. However, it contains a lot of binary data at the head of its output (before "Copyright (c) 1992-1996 FreeBSD Inc."). Moreover, every time I run dmesg, the amount of binary data is different.

> revert to a 3.0-970209-SNAP system.
> use the SMP source as is, no patches should be necessary.
> eliminate the kvm error before proceeding.
I'll try a kernel without the patch to exception.s, recompile libkvm.a and dmesg, and see what happens. NISHIO Shuichi From owner-freebsd-smp Sat Apr 5 07:11:08 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id HAA12177 for smp-outgoing; Sat, 5 Apr 1997 07:11:08 -0800 (PST) Received: from caleche.kecl.ntt.co.jp (elysium.kecl.ntt.co.jp [129.60.192.193]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id HAA12166 for ; Sat, 5 Apr 1997 07:11:04 -0800 (PST) Received: from localhost by caleche.kecl.ntt.co.jp (8.8.5/kecl2.0/r8v7-M2-nishio) with ESMTP id AAA24907; Sun, 6 Apr 1997 00:08:26 +0900 (JST) To: vanmaren@fast.cs.utah.edu Cc: freebsd-smp@freebsd.org Subject: Re: APIC_IO problem on Tyan S1668 In-Reply-To: Your message of "Fri, 4 Apr 1997 13:23:54 -0700" References: <199704042023.NAA20216@fast.cs.utah.edu> X-Mailer: Mew version 1.54 on Emacs 19.34.1, Mule 2.3 Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <19970406000826J.nishio@elysium.kecl.ntt.co.jp> Date: Sun, 06 Apr 1997 00:08:26 +0900 From: NISHIO Shuichi X-Dispatcher: impost version 0.95+ (Nov. 26, 1996) Lines: 20 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk From: vanmaren@fast.cs.utah.edu (Kevin Van Maren) Subject: Re: APIC_IO problem on Tyan S1668 Date: Fri, 4 Apr 1997 13:23:54 -0700 Message-ID: <199704042023.NAA20216@fast.cs.utah.edu> > Nope! There are two cards in the 5-PCI slot MotherBoard (NO bridge chips). > Note that they are both on PCI bus #0. The problem is that there are > 5 PCI slots, 4 PCI/ISA redirection registers, and 4 MP table entries. > Yes, it is a problem with the Tyan MB that it doesn't have all the PCI > slots in the table. 
The DEC ether card that didn't work is placed in the slot shared with ISA, and today, I read, somewhere from AltaVista's output, that S1668 only supports bus mastering on 4 PCI slots out of 5 (I'm not sure whether this is true or not: Tyan's manual says nothing about this). Is this related to the problem? Nishio Shuichi From owner-freebsd-smp Sat Apr 5 07:37:41 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id HAA13066 for smp-outgoing; Sat, 5 Apr 1997 07:37:41 -0800 (PST) Received: from nlsystems.com (nlsys.demon.co.uk [158.152.125.33]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id HAA13056 for ; Sat, 5 Apr 1997 07:37:20 -0800 (PST) Received: from herring.nlsystems.com (herring.nlsystems.com [10.0.0.2]) by nlsystems.com (8.8.5/8.8.5) with SMTP id QAA23025 for ; Sat, 5 Apr 1997 16:37:15 +0100 (BST) Date: Sat, 5 Apr 1997 16:37:15 +0100 (BST) From: Doug Rabson Reply-To: Doug Rabson To: smp@freebsd.org Subject: Any chance of sync with current? Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Does anyone know if/when the next sync-with-current thing is going to happen for the smp tree? I want to try out smp on my new machine (2xP6 on SuperMicro P6DNE motherboard) but it will be a lot of hassle with the current smp tree since I need various fixes to the ahc driver from current to make the machine work at all. On the other hand, what about merging in the other direction SMP->current. Last time the subject came up, I am sure someone said this would be easier than merging lite2 and that has already happened... -- Doug Rabson Mail: dfr@nlsystems.com Nonlinear Systems Ltd. 
Phone: +44 181 951 1891 From owner-freebsd-smp Sat Apr 5 07:54:21 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id HAA13704 for smp-outgoing; Sat, 5 Apr 1997 07:54:21 -0800 (PST) Received: from spinner.DIALix.COM (root@spinner.dialix.com [192.203.228.67]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id HAA13696 for ; Sat, 5 Apr 1997 07:54:11 -0800 (PST) Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1]) by spinner.DIALix.COM (8.8.5/8.8.5) with ESMTP id XAA17993; Sat, 5 Apr 1997 23:53:48 +0800 (WST) Message-Id: <199704051553.XAA17993@spinner.DIALix.COM> X-Mailer: exmh version 2.0gamma 1/27/96 To: Doug Rabson cc: smp@freebsd.org Subject: Re: Any chance of sync with current? In-reply-to: Your message of "Sat, 05 Apr 1997 16:37:15 +0100." Date: Sat, 05 Apr 1997 23:53:47 +0800 From: Peter Wemm Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Doug Rabson wrote: > Does anyone know if/when the next sync-with-current thing is going to > happen for the smp tree? I want to try out smp on my new machine (2xP6 on > SuperMicro P6DNE motherboard) but it will be a lot of hassle with the > current smp tree since I need various fixes to the ahc driver from current > to make the machine work at all. I've been thinking about doing it sometime soon, the -current kernel seems to have settled down a fair bit. > On the other hand, what about merging in the other direction SMP->current. > Last time the subject came up, I am sure someone said this would be easier > than merging lite2 and that has already happened... The biggest problem that I can see so far is that we have too many #ifdef's, causing interesting problems like #include "opt_smp.h" to be seen by user programs that #include stuff from . I must have another look again and see how it looks at the moment. It'd be nice to bring it online to the mainstream. > -- > Doug Rabson Mail: dfr@nlsystems.com > Nonlinear Systems Ltd. 
Phone: +44 181 951 1891 > Cheers, -Peter From owner-freebsd-smp Sat Apr 5 08:17:32 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id IAA15121 for smp-outgoing; Sat, 5 Apr 1997 08:17:32 -0800 (PST) Received: from corona.jcmax.com (corona.jcmax.com [204.69.248.2]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id IAA15115 for ; Sat, 5 Apr 1997 08:17:28 -0800 (PST) Received: by corona.jcmax.com (5.65/2.49G/4.1.3_U1) id AA05092; Sat, 5 Apr 97 11:17:25 -0500 Date: Sat, 5 Apr 97 11:17:25 -0500 From: cr@jcmax.com (Cyrus Rahman) Message-Id: <9704051617.AA05092@corona.jcmax.com> To: smp@freebsd.org Subject: Questions about mp_lock Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Could someone who had a hand in implementing the SMP kernel give me a hint about why the mp_lock count gets stored in the proc/user structure and switched out in cpu_switch()? Seems kind of weird, since I would expect that a process getting switched in or out would always posses exactly one lock, and that any others would be the result of interrupts. But it does appear that something more complicated is going on, and I can't exactly figure out what it is. 
Cyrus From owner-freebsd-smp Sat Apr 5 08:28:57 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id IAA15421 for smp-outgoing; Sat, 5 Apr 1997 08:28:57 -0800 (PST) Received: from cs.utah.edu (cs.utah.edu [128.110.4.21]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA15414 for ; Sat, 5 Apr 1997 08:28:51 -0800 (PST) Received: from fast.cs.utah.edu by cs.utah.edu (8.8.4/utah-2.21-cs) id JAA22724; Sat, 5 Apr 1997 09:28:36 -0700 (MST) Received: by fast.cs.utah.edu (8.6.10/utah-2.15-leaf) id JAA17141; Sat, 5 Apr 1997 09:28:36 -0700 Date: Sat, 5 Apr 1997 09:28:36 -0700 From: vanmaren@fast.cs.utah.edu (Kevin Van Maren) Message-Id: <199704051628.JAA17141@fast.cs.utah.edu> To: nishio@caleche.kecl.ntt.co.jp, vanmaren@fast.cs.utah.edu Subject: Re: APIC_IO problem on Tyan S1668 Cc: freebsd-smp@freebsd.org Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >I read, somewhere from AltaVista's output, that S1668 >only supports bus mastering on 4 PCI slots out of 5 (I'm not sure >whether this is true or not: Tyan's manual says nothing about this). > >Is this related to the problem? Tyan tech support WON'T be any help. We did have 5 PCI cards in a S1662 (non-ATX version) that worked fine under FreeBSD 2.1.5 but I never tried SMP with 5 cards. (This is why I knew about the SMP table entries and the 5 slot stuff). I believe they were all bus-mastering, as we used an ISA video board. 
Kevin From owner-freebsd-smp Sat Apr 5 08:33:09 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id IAA15518 for smp-outgoing; Sat, 5 Apr 1997 08:33:09 -0800 (PST) Received: from corona.jcmax.com (corona.jcmax.com [204.69.248.2]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id IAA15513 for ; Sat, 5 Apr 1997 08:33:06 -0800 (PST) Received: by corona.jcmax.com (5.65/2.49G/4.1.3_U1) id AA05399; Sat, 5 Apr 97 11:33:04 -0500 Date: Sat, 5 Apr 97 11:33:04 -0500 From: cr@jcmax.com (Cyrus Rahman) Message-Id: <9704051633.AA05399@corona.jcmax.com> To: smp@freebsd.org Subject: Deadlocking in SMP kernel Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk There appears to be a situation in which the SMP kernel deadlocks on mp_lock. With much help from Steve Passe, I've come up with the following (still tentative) scenario: A process, running on cpu1, enters the kernel and obtains a lock. While it has the lock, but before interrupts are redirected to cpu1 (or any time, if TEST_LOPRIO isn't defined), an interrupt goes to cpu0, blocking (until it obtains the lock) all lower priority interrupts. If for some reason the kernel now waits for an interrupt, there will be a deadlock. Are there any places where the kernel waits for an interrupt to occur? There are three places I found where software interrupts are generated by the kernel - but I don't think any of them are relevant (two in icu.s, one in locore.s). I suspect that understanding my previous question about why mp_lock needs to be stored during cpu_switch() might be helpful - for there's clearly some reason why mp_lock isn't always 1 in that routine, but I can't figure it out. For some reason the deadlock only seems to occur with APIC_IO defined, if that provides any additional clues. 
Cyrus From owner-freebsd-smp Sat Apr 5 09:05:27 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA16776 for smp-outgoing; Sat, 5 Apr 1997 09:05:27 -0800 (PST) Received: from spinner.DIALix.COM (root@spinner.dialix.com [192.203.228.67]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA16767 for ; Sat, 5 Apr 1997 09:05:17 -0800 (PST) Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1]) by spinner.DIALix.COM (8.8.5/8.8.5) with ESMTP id BAA18422; Sun, 6 Apr 1997 01:04:44 +0800 (WST) Message-Id: <199704051704.BAA18422@spinner.DIALix.COM> X-Mailer: exmh version 2.0gamma 1/27/96 To: cr@jcmax.com (Cyrus Rahman) cc: smp@freebsd.org Subject: Re: Questions about mp_lock In-reply-to: Your message of "Sat, 05 Apr 1997 11:17:25 EST." <9704051617.AA05092@corona.jcmax.com> Date: Sun, 06 Apr 1997 01:04:44 +0800 From: Peter Wemm Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Cyrus Rahman wrote: > Could someone who had a hand in implementing the SMP kernel give me a hint > about why the mp_lock count gets stored in the proc/user structure and > switched out in cpu_switch()? > > Seems kind of weird, since I would expect that a process getting switched in > or out would always posses exactly one lock, and that any others would be > the result of interrupts. But it does appear that something more complicated > is going on, and I can't exactly figure out what it is. The main problem is that the kernel can be recursively entered while flow of execution is still "in the kernel". One interrupt can interrupt another's handler, a process can take a page fault while doing a copyin, causing the kernel to be reentered via the trap handlers and end up in the vm system. The catch is that when the kernel takes a page fault on a process's behalf, the odds are that the process is going to sleep while waiting for a block to be read from the disk etc. When we context switch, the kernel stack goes with it. 
If the lock count stayed with the cpu instead of following the process, then switching from a context that's three levels deep to another one that's only two deep would mean returning to user mode while still holding the kernel lock. Going the other way, from a 2-deep to a 3-deep context, the last part of the unwind in the new context would run in the kernel without the lock, and the other cpu could enter the kernel. So, we switch the nest count with the process.

It's far from ideal, but it works reasonably well on two cpus. However, there's plenty of scope for improvement.

Moving the kernel locking up a layer and having a separate entry/exit lock in the trap/syscall/interrupt area would be a major win without too much cost. What we'd gain by that would be that we could then gradually move to a per-subsystem locking system, perhaps based initially on which syscall or trap type. It'd be quite possible to have one cpu in the kernel doing IP checksumming on a packet, another in the vfs system somewhere, another doing some copy-on-write page copies in the vm system and so on. Things like getpid() would need no locking whatsoever. But that's for later once the basics are working.
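
The nest-count handling described above can be modelled in a few lines of C. This is only a sketch under stated assumptions, not the actual kernel code: `kernel_lock`, `lock_enter()`, `lock_leave()` and `switch_context()` are hypothetical stand-ins for mp_lock, the kernel entry/exit paths and cpu_switch(), and a failed acquire simply returns 0 where a real cpu would spin.

```c
#include <assert.h>

/* Hypothetical model of a single recursive kernel entry lock. */
struct kernel_lock_model {
    int owner;  /* id of the cpu holding the lock, or -1 if free */
    int count;  /* nesting depth of kernel entries on that cpu   */
};

/* Per-process state: the depth that matches this process's kernel stack. */
struct proc_model {
    int saved_count;
};

static struct kernel_lock_model kernel_lock = { -1, 0 };

/* Kernel entry (syscall, trap or interrupt). Returns 1 if the lock was
 * obtained or re-entered, 0 if another cpu holds it (a real cpu would spin). */
static int lock_enter(int cpu)
{
    if (kernel_lock.owner != -1 && kernel_lock.owner != cpu)
        return 0;
    kernel_lock.owner = cpu;
    kernel_lock.count++;
    return 1;
}

/* Kernel exit: the lock is only released when the outermost level unwinds. */
static void lock_leave(int cpu)
{
    assert(kernel_lock.owner == cpu && kernel_lock.count > 0);
    if (--kernel_lock.count == 0)
        kernel_lock.owner = -1;
}

/* Context switch on the cpu that holds the lock: the count follows the
 * kernel stack, so save the outgoing depth and load the incoming one. */
static void switch_context(int cpu, struct proc_model *out,
                           const struct proc_model *in)
{
    assert(kernel_lock.owner == cpu);
    out->saved_count = kernel_lock.count;
    kernel_lock.count = in->saved_count;
}
```

Without the save/restore in `switch_context()`, a 3-deep context switching to a 2-deep one would still have a count of 1 after the new context unwinds, i.e. it would return to user mode holding the lock.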
> Cyrus Cheers, -Peter From owner-freebsd-smp Sat Apr 5 09:14:08 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA17144 for smp-outgoing; Sat, 5 Apr 1997 09:14:08 -0800 (PST) Received: from mail.webspan.net (mail.webspan.net [206.154.70.7]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA17137 for ; Sat, 5 Apr 1997 09:14:04 -0800 (PST) Received: from orion.webspan.net (orion.webspan.net [206.154.70.5]) by mail.webspan.net (WEBSPAN/970116) with ESMTP id MAA09013; Sat, 5 Apr 1997 12:13:42 -0500 (EST) Received: from orion.webspan.net (localhost [127.0.0.1]) by orion.webspan.net (WEBSPN/970116) with ESMTP id MAA12970; Sat, 5 Apr 1997 12:13:41 -0500 (EST) To: Peter Wemm cc: cr@jcmax.com (Cyrus Rahman), smp@freebsd.org From: "Gary Palmer" Subject: Re: Questions about mp_lock In-reply-to: Your message of "Sun, 06 Apr 1997 01:04:44 +0800." <199704051704.BAA18422@spinner.DIALix.COM> Date: Sat, 05 Apr 1997 12:13:41 -0500 Message-ID: <12968.860260421@orion.webspan.net> Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Peter Wemm wrote in message ID <199704051704.BAA18422@spinner.DIALix.COM>: > Moving the kernel locking up a layer and having a seperate entry/exit lock > in the trap/syscall/interupt area would be a major win without too much > cost. What we'd gain by that would be that we could then gradually move > to a per-subsystem locking system perhaps based initially on which syscall > or trap type. It'd be quite possible to have one cpu in the kernel doing > IP checksumming on a packet, another in the vfs system somewhere, another > doing some copy-on-write page copies in the vm system and so on. Things > like getpid() would need no locking whatsoever. But that's for later once > the basics are working. Question if you would: define`basics'? Thanks, Gary -- Gary Palmer FreeBSD Core Team Member FreeBSD: Turning PC's into workstations. 
See http://www.FreeBSD.ORG/ for info From owner-freebsd-smp Sat Apr 5 09:14:39 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA17167 for smp-outgoing; Sat, 5 Apr 1997 09:14:39 -0800 (PST) Received: from critter.dk.tfs.com (phk.cybercity.dk [195.8.133.247]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA17157; Sat, 5 Apr 1997 09:14:33 -0800 (PST) Received: from critter (localhost [127.0.0.1]) by critter.dk.tfs.com (8.8.5/8.8.5) with ESMTP id TAA03845; Sat, 5 Apr 1997 19:13:25 +0200 (CEST) To: cr@jcmax.com (Cyrus Rahman) cc: smp@freebsd.org Subject: Re: Questions about mp_lock In-reply-to: Your message of "Sat, 05 Apr 1997 11:17:25 CDT." <9704051617.AA05092@corona.jcmax.com> Date: Sat, 05 Apr 1997 19:13:25 +0200 Message-ID: <3843.860260405@critter> From: Poul-Henning Kamp Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk In message <9704051617.AA05092@corona.jcmax.com>, Cyrus Rahman writes: >Could someone who had a hand in implementing the SMP kernel give me a hint >about why the mp_lock count gets stored in the proc/user structure and >switched out in cpu_switch()? Because it has to match the sequence of calls on the kernelstack. Remember: we can enter the protected space by syscall, (page-)fault or interrupt, and one doesn't prevent the others. >Seems kind of weird, since I would expect that a process getting switched in >or out would always posses exactly one lock, and that any others would be >the result of interrupts. But it does appear that something more complicated >is going on, and I can't exactly figure out what it is. It >is< weird, but it works great. No, it's not that simple. We could probably have done it that way too, but it would cost more time & code in various already not very nice pieces of assembler code. -- Poul-Henning Kamp | phk@FreeBSD.ORG FreeBSD Core-team. http://www.freebsd.org/~phk | phk@login.dknet.dk Private mailbox. 
whois: [PHK] | phk@tfs.com TRW Financial Systems, Inc. Power and ignorance is a disgusting cocktail. From owner-freebsd-smp Sat Apr 5 09:21:28 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA17487 for smp-outgoing; Sat, 5 Apr 1997 09:21:28 -0800 (PST) Received: from spinner.DIALix.COM (root@spinner.dialix.com [192.203.228.67]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA17474 for ; Sat, 5 Apr 1997 09:21:18 -0800 (PST) Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1]) by spinner.DIALix.COM (8.8.5/8.8.5) with ESMTP id BAA18561; Sun, 6 Apr 1997 01:20:40 +0800 (WST) Message-Id: <199704051720.BAA18561@spinner.DIALix.COM> X-Mailer: exmh version 2.0gamma 1/27/96 To: cr@jcmax.com (Cyrus Rahman) cc: smp@freebsd.org Subject: Re: Deadlocking in SMP kernel In-reply-to: Your message of "Sat, 05 Apr 1997 11:33:04 EST." <9704051633.AA05399@corona.jcmax.com> Date: Sun, 06 Apr 1997 01:20:40 +0800 From: Peter Wemm Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Cyrus Rahman wrote: > There appears to be a situation in which the SMP kernel deadlocks on > mp_lock. With much help from Steve Passe, I've come up with the following > (still tentative) scenario: > > A process, running on cpu1, enters the kernel and obtains a lock. While it > has the lock, but before interrupts are redirected to cpu1 (or any time, if > TEST_LOPRIO isn't defined), an interrupt goes to cpu0, blocking (until it > obtains the lock) all lower priority interrupts. No.. Lower priority interupts will get into the kernel because the kernel entry lock is recursive. cpu0 will get and see the interupt. If it's masked at the time, it will be recorded for later. As cpu#0 lowers the masking, if the interrupt then becomes visible it will be serviced. cpu#0 will not release the lock until all known interrupts are unmasked and serviced. 
If the interrupt is switched to cpu#1, there will be a problem since it will block waiting for cpu#0 to finish. This means that cpu#0 could service an interrupt of lower priority before the one that cpu#1 is aware of.

It's been a while since I worked on the smp kernel; from memory, the lowpri mode was meant to arrange for the cpu that holds the kernel lock to be the preferred one to receive interrupts from the apic[s], in order to get better irq latency. I don't remember if it was ever finished. I seem to recall Steve telling me that there was a major flaw in what we had in mind for some reason.

> If for some reason the kernel now waits for an interrupt, there will be a
> deadlock.

I don't know anywhere that this happens, but yes, it could be a problem if it happens. What normally happens is that the kernel will sleep and switch out to another process on return to user mode.

> Are there any places where the kernel waits for an interrupt to occur?
> There are three places I found where software interrupts are generated by
> the kernel - but I don't think any of them are relevant (two in icu.s, one
> in locore.s).
>
> I suspect that understanding my previous question about why mp_lock needs
> to be stored during cpu_switch() might be helpful - for there's clearly some
> reason why mp_lock isn't always 1 in that routine, but I can't figure it out.
>
> For some reason the deadlock only seems to occur with APIC_IO defined, if
> that provides any additional clues.

Hmm.. several possibilities spring to mind...

1: There's a race somehow that we've missed in the apic masking code. This is not exceptionally unlikely since there is lazy masking happening and the i386 icu code is extremely 8259-pic aligned and doesn't really map to the apic very well.

2: There's a problem with having two cpu's taking an IRQ at very close intervals.

3: There are other cases where the enter/leave kernel locking is botched (eg: the fpu one that was missed up until a week or so ago).
4: You're using floating point.. I have my doubts about the fpu context switching and operating mode control, but others seem to have it working in spite of my grim expectations.. :-] > Cyrus Cheers, -Peter From owner-freebsd-smp Sat Apr 5 09:27:21 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA17693 for smp-outgoing; Sat, 5 Apr 1997 09:27:21 -0800 (PST) Received: from spinner.DIALix.COM (root@spinner.dialix.com [192.203.228.67]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA17687; Sat, 5 Apr 1997 09:27:09 -0800 (PST) Received: from spinner.DIALix.COM (peter@localhost.DIALix.oz.au [127.0.0.1]) by spinner.DIALix.COM (8.8.5/8.8.5) with ESMTP id BAA18615; Sun, 6 Apr 1997 01:26:51 +0800 (WST) Message-Id: <199704051726.BAA18615@spinner.DIALix.COM> X-Mailer: exmh version 2.0gamma 1/27/96 To: "Gary Palmer" cc: cr@jcmax.com (Cyrus Rahman), smp@freebsd.org Subject: Re: Questions about mp_lock In-reply-to: Your message of "Sat, 05 Apr 1997 12:13:41 EST." <12968.860260421@orion.webspan.net> Date: Sun, 06 Apr 1997 01:26:50 +0800 From: Peter Wemm Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk "Gary Palmer" wrote: > Peter Wemm wrote in message ID > <199704051704.BAA18422@spinner.DIALix.COM>: > > Moving the kernel locking up a layer and having a seperate entry/exit lock > > in the trap/syscall/interupt area would be a major win without too much > > cost. What we'd gain by that would be that we could then gradually move > > to a per-subsystem locking system perhaps based initially on which syscall > > or trap type. It'd be quite possible to have one cpu in the kernel doing > > IP checksumming on a packet, another in the vfs system somewhere, another > > doing some copy-on-write page copies in the vm system and so on. Things > > like getpid() would need no locking whatsoever. But that's for later once > > the basics are working. > > Question if you would: define`basics'? 
Umm, things like having it understand the pci bus in apic mode on new machines without having to tweak the code, or having IPI's sent to evict processes from other cpu's when the trying to give it a kill -9, having the system boot into multi-cpu mode without sysctl, being able to build world with /sys pointing to the smp tree, having per-cpu scratch space, and a zillion other things that would be nice. Oh, and having a working smp-capable lock manager would be nice too. :-) > Thanks, > > Gary > -- > Gary Palmer FreeBSD Core Team Member > FreeBSD: Turning PC's into workstations. See http://www.FreeBSD.ORG/ for info Cheers, -Peter From owner-freebsd-smp Sat Apr 5 11:15:26 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id LAA22667 for smp-outgoing; Sat, 5 Apr 1997 11:15:26 -0800 (PST) Received: from Ilsa.StevesCafe.com (sc-gw.StevesCafe.com [205.168.119.191]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id LAA22656 for ; Sat, 5 Apr 1997 11:15:21 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by Ilsa.StevesCafe.com (8.7.5/8.6.12) with SMTP id MAA10328; Sat, 5 Apr 1997 12:15:12 -0700 (MST) Message-Id: <199704051915.MAA10328@Ilsa.StevesCafe.com> X-Authentication-Warning: Ilsa.StevesCafe.com: Host localhost [127.0.0.1] didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: NISHIO Shuichi cc: freebsd-smp@freebsd.org Subject: Re: APIC_IO problem on Tyan S1668 In-reply-to: Your message of "Sat, 05 Apr 1997 23:49:35 +0900." 
<19970405234935F.nishio@elysium.kecl.ntt.co.jp>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 05 Apr 1997 12:15:12 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

>>What I did is:
>>(1) Installed 3.0-970209-SNAP (worked fine)
>>(2) cvsup-ed the SMP kernel source
>>(3) did "cvs update -Pd -D '02/10/97 00:00:00 GMT"

>After doing
>> cvs -q update -Pd -D '02/09/97 00:00:00 GMT'
>
>that modification went away, so I added the following 3 lines manually.
> ...

There's the problem, I should have caught it sooner. The SMP kernel sources are mutually exclusive with the mainline kernel sources. When you did the cvsup not only did that modification go away, but almost everything else SMP specific must have gone away!

reload the SMP kernel sources into a separate directory, say /usr/src/smpsys. then cd to /usr/src/smpsys/i386/conf, config, make, install, etc., all in the /usr/src/smpsys tree.

When the "include" headers are somewhat out of sync between the mainline code and the SMP code it sometimes becomes necessary to play games where you keep the SMP src in the same tree (ie /usr/src/sys), but since the SMP src and the 3.0-970209-SNAP are in sync you don't need to bother.

---
> Should I recompile everything with the SMP kernel?

no need.

---
>I'll try a kernel without the patch to exception.s, recompile libkvm.a
>and dmesg, and see what happens.

again, no need. once you get the correct kernel source compiled your problems will go away.
-- Steve Passe | powered by smp@csn.net | Symmetric MultiProcessor FreeBSD From owner-freebsd-smp Sat Apr 5 11:34:49 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id LAA23244 for smp-outgoing; Sat, 5 Apr 1997 11:34:49 -0800 (PST) Received: from Ilsa.StevesCafe.com (sc-gw.StevesCafe.com [205.168.119.191]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id LAA23239 for ; Sat, 5 Apr 1997 11:34:47 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by Ilsa.StevesCafe.com (8.7.5/8.6.12) with SMTP id MAA10562; Sat, 5 Apr 1997 12:34:39 -0700 (MST) Message-Id: <199704051934.MAA10562@Ilsa.StevesCafe.com> X-Authentication-Warning: Ilsa.StevesCafe.com: Host localhost [127.0.0.1] didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: vanmaren@fast.cs.utah.edu (Kevin Van Maren) cc: nishio@caleche.kecl.ntt.co.jp, freebsd-smp@freebsd.org Subject: Re: APIC_IO problem on Tyan S1668 In-reply-to: Your message of "Sat, 05 Apr 1997 09:28:36 MST." <199704051628.JAA17141@fast.cs.utah.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Sat, 05 Apr 1997 12:34:38 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, my explanation for this is the bad mptable causes ONE of the 2 cards to be serviced via the ISA INT, while the other is serviced via an upper IO APIC INT slot. This is probably confusing the PCI INT sharing code to the nth degree. Once Nishio gets past his other src code problems a quick kernel hack should fix this. 
--
Steve Passe | powered by
smp@csn.net | Symmetric MultiProcessor FreeBSD

From owner-freebsd-smp Sat Apr 5 11:51:47 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id LAA24113 for smp-outgoing; Sat, 5 Apr 1997 11:51:47 -0800 (PST)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.50]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id LAA24107 for ; Sat, 5 Apr 1997 11:51:40 -0800 (PST)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id MAA23338; Sat, 5 Apr 1997 12:33:01 -0700
From: Terry Lambert
Message-Id: <199704051933.MAA23338@phaeton.artisoft.com>
Subject: Re: APIC_IO problem on Tyan S1668
To: vanmaren@fast.cs.utah.edu (Kevin Van Maren)
Date: Sat, 5 Apr 1997 12:33:01 -0700 (MST)
Cc: nishio@caleche.kecl.ntt.co.jp, vanmaren@fast.cs.utah.edu, freebsd-smp@freebsd.org
In-Reply-To: <199704051628.JAA17141@fast.cs.utah.edu> from "Kevin Van Maren" at Apr 5, 97 09:28:36 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> >I read, somewhere from AltaVista's output, that S1668
> >only supports bus mastering on 4 PCI slots out of 5 (I'm not sure
> >whether this is true or not: Tyan's manual says nothing about this).
> >
> >Is this related to the problem?
>
> Tyan tech support WON'T be any help. We did have 5 PCI cards in
> a S1662 (non-ATX version) that worked fine under FreeBSD 2.1.5
> but I never tried SMP with 5 cards. (This is why I knew about
> the SMP table entries and the 5 slot stuff). I believe they
> were all bus-mastering, as we used an ISA video board.

This is probably the problem (the 5th slot, not the bus mastering).

It has to do with PCI interrupt sharing. The PCI INT's are normally daisy-chained, and slots 4 and 5 (if 5 is present) are expected to share:

       slot 1       slot 2       slot 3       slot 4       slot 5
        ,-.          ,-.          ,-.          ,-.          ,-.
INT A --|A|-. ,------|B|-. ,------|C|-. ,------|D|----------|D|
        | |  X       | |  X       | |  X       | |          | |
INT B --| |-' \ ,----| |-' \ ,----| |-' \ ,----| |----------| |
        | |    X     | |    X     | |    X     | |          | |
INT C --| |---' \ ,--| |---' \ ,--| |---' \ ,--| |----------| |
        | |      X   | |      X   | |      X   | |          | |
INT D --| |-----' `--| |-----' `--| |-----' `--| |----------| |
        `-'          `-'          `-'          `-'          `-'

By default, each PCI card will use the first interrupt connector, which will be A, B, C, or D, depending on the slot (note: old PCI hardware will *NOT* chain... it expects the boards to be jumper configurable, or all boards to share INT A).

This problem is especially bad if you install an Adaptec 3940, which takes two interrupt lines, one per channel. It "shares" interrupts with whatever the adjacent card is; if you put it in:

       slot 1       slot 2       slot 3       slot 4       slot 5
        ,-.          ,-.          ,-.          ,-.          ,-.
        |A|          |B|          |C|          |D|          |D|
        | |          | |          | |          | |          | |
        |B|          |C|          |D|          |A|          |A|

Typically, this was "worked around" in early UP FreeBSD PCI support by placing the 3940 in slot 1 or slot 2, and a non-interrupting card (like a video board) in slot 2 or slot 3 (respectively).

If the machine has 5 slots (3 "standard" and 2 "shared"), the worst possible place to put the 3940 is slot 3, 4, or 5, since it will conflict 3 slots instead of 2.

The slot 5 in a "shared" PCI slot design will *always* require that PCI interrupt sharing be supported by the host OS. Probably you are running into a problem with SMP FreeBSD not being able to properly share PCI interrupts.

Regards,
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present or previous employers.
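
The slot wiring drawn in the diagrams above amounts to a rotation of the four interrupt lines by slot position. A minimal sketch in C, assuming exactly the 5-slot layout shown there (slot 5 wired in parallel with slot 4); `board_int_line()` and `slots_share()` are hypothetical helper names, and a real board may be wired differently:

```c
#include <assert.h>

/* Map a card's interrupt pin to the motherboard interrupt line it reaches.
 * slot: 1..5 as numbered in the diagram.
 * pin:  0 = the card's first interrupt pin, 1 = its second, and so on.
 * Returns 'A'..'D'. Assumption: slot 5 mirrors slot 4, as drawn above. */
static char board_int_line(int slot, int pin)
{
    int pos = (slot > 4 ? 4 : slot) - 1;  /* rotation offset, 0..3 */
    return (char)('A' + (pos + pin) % 4);
}

/* Nonzero if two (slot, pin) pairs end up on the same interrupt line
 * and therefore need interrupt sharing support in the OS. */
static int slots_share(int slot_a, int pin_a, int slot_b, int pin_b)
{
    return board_int_line(slot_a, pin_a) == board_int_line(slot_b, pin_b);
}
```

With this model a two-interrupt card such as the 3940 in slot 3 lands on lines C and D, so its second channel shares line D with the first pin of any card in slot 4 or slot 5.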
From owner-freebsd-smp Sat Apr 5 12:06:38 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id MAA25042 for smp-outgoing; Sat, 5 Apr 1997 12:06:38 -0800 (PST)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.50]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id MAA25036; Sat, 5 Apr 1997 12:06:32 -0800 (PST)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id MAA23368; Sat, 5 Apr 1997 12:49:04 -0700
From: Terry Lambert
Message-Id: <199704051949.MAA23368@phaeton.artisoft.com>
Subject: Re: Questions about mp_lock
To: gpalmer@freebsd.org (Gary Palmer)
Date: Sat, 5 Apr 1997 12:49:04 -0700 (MST)
Cc: peter@spinner.dialix.com, cr@jcmax.com, smp@freebsd.org
In-Reply-To: <12968.860260421@orion.webspan.net> from "Gary Palmer" at Apr 5, 97 12:13:41 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> Peter Wemm wrote in message ID
> <199704051704.BAA18422@spinner.DIALix.COM>:
> > Moving the kernel locking up a layer and having a separate entry/exit lock
> > in the trap/syscall/interrupt area would be a major win without too much
> > cost. What we'd gain by that would be that we could then gradually move
> > to a per-subsystem locking system perhaps based initially on which syscall
> > or trap type. It'd be quite possible to have one cpu in the kernel doing
> > IP checksumming on a packet, another in the vfs system somewhere, another
> > doing some copy-on-write page copies in the vm system and so on. Things
> > like getpid() would need no locking whatsoever. But that's for later once
> > the basics are working.
>
> Question if you would: define `basics'?

This is what I have called "stage one" in implementing fine grain
parallelism. This is the first stage to a lock "push down" on a
per subsystem basis (subsystems being accessed by system calls).
It's not quite correct that you would go to a different lock for
the system call entrancy vs. fault and interrupt entrancy of the
kernel... doing that would probably make the job *much* harder
because you would have to fine-grain protect all of the interrupt
and fault code instantly to make it safe. Really, the trap code
needs to be reentrant to the point of the system call dispatch.

Initially, you would add a per syscall flag into sysent[]. If the
flag were not present, you would hold the global lock around the
call; otherwise, you would expect the global lock to be held by the
system call code itself, and the call head to be reentrant.

You would then pick a divisible subsystem (my choice would be FS,
but FS would be too large and you should pick something else if
you aren't an FS geek), and "push" the reentrancy as far down the
call graph as you could.

To accomplish this, you would need to hold the global lock around
context manipulation: kernel global pool manipulation, data objects
like vnodes for which the value you are setting/getting must be
accomplished atomically with the compare that precedes or follows
it, etc. This will divide the call handling code into "safed" and
"not safed" regions.

Eventually, you will want to split out per subsystem locks with an
"intention exclusive" mode held against the global lock. This will
let you lock against subsystem reentrancy via system calls, and
against fault and interrupt reentrancy otherwise.

This implies a single lock hierarchy in which the global mutex is at
the top, so that you can compute transitive closure over the hierarchy
(to compute transitive closure just means that you will detect when
a deadlock would occur if a lock were granted, and block the requestor
until a deadlock was no longer a possibility).

So the first step is probably a push-down on the global lock, not
the creation of a separate global lock.
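The per-syscall flag in sysent[] can be sketched in user-space C as follows. Everything named here is an assumption made for illustration: struct sysent_sketch, SYF_REENTRANT, global_lock and syscall_dispatch() are hypothetical stand-ins, not the actual FreeBSD structures, and a C11 atomic_flag spin lock stands in for the kernel's global lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SYF_REENTRANT 0x01      /* hypothetical flag: handler locks for itself */

typedef int (*sy_call_t)(void *args);

struct sysent_sketch {          /* stand-in for one sysent[] entry */
    sy_call_t sy_call;
    int       sy_flags;
};

/* Stand-in for the single global kernel lock (simple test-and-set lock). */
static atomic_flag global_lock = ATOMIC_FLAG_INIT;

static void global_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&global_lock,
                                             memory_order_acquire))
        ;                       /* spin until the holder releases it */
}

static void global_lock_release(void)
{
    atomic_flag_clear_explicit(&global_lock, memory_order_release);
}

/* Dispatch one call: unflagged handlers run under the global lock. */
int syscall_dispatch(const struct sysent_sketch *se, void *args)
{
    int error;

    assert(se != NULL && se->sy_call != NULL);

    if (se->sy_flags & SYF_REENTRANT)
        return se->sy_call(args);   /* handler takes the lock where needed */

    global_lock_acquire();
    error = se->sy_call(args);
    global_lock_release();
    return error;
}

/* Demo handler: returns 1 if the global lock was already held on entry. */
int probe_global_lock(void *args)
{
    (void)args;
    if (atomic_flag_test_and_set_explicit(&global_lock, memory_order_acquire))
        return 1;               /* the dispatcher is holding it for us */
    global_lock_release();      /* we took it ourselves, so give it back */
    return 0;
}
```

Dispatching probe_global_lock() through an entry with sy_flags of 0 reports the lock as held, while the same handler behind SYF_REENTRANT sees it free. Pushing a subsystem down then amounts to setting the flag on its entries once the handlers take the lock themselves around context manipulation.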
                                        Regards,
                                        Terry Lambert
                                        terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

From owner-freebsd-smp Sat Apr 5 12:14:55 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id MAA25294 for smp-outgoing; Sat, 5 Apr 1997 12:14:55 -0800 (PST)
Received: from Ilsa.StevesCafe.com (sc-gw.StevesCafe.com [205.168.119.191]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id MAA25285 for ; Sat, 5 Apr 1997 12:14:50 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by Ilsa.StevesCafe.com (8.7.5/8.6.12) with SMTP id NAA10988; Sat, 5 Apr 1997 13:14:26 -0700 (MST)
Message-Id: <199704052014.NAA10988@Ilsa.StevesCafe.com>
X-Authentication-Warning: Ilsa.StevesCafe.com: Host localhost [127.0.0.1] didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: Peter Wemm
cc: cr@jcmax.com (Cyrus Rahman), Poul-Henning Kamp , smp@freebsd.org
Subject: Re: Questions about mp_lock
In-reply-to: Your message of "Sun, 06 Apr 1997 01:04:44 +0800." <199704051704.BAA18422@spinner.DIALix.COM>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 05 Apr 1997 13:14:26 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi everyone,

It was great to see so many responses to the question, the list has been
somewhat inactive for a while...

Summary of the problem:

code:
 3-0.970209-SNAP, -current SMP src
 APIC_IO and all recommended options for same.

symptom:
 heavily loaded system (ie lots of INTs happening) "freezes"

reason:
 cpu0 is trying to service an INT, spin-locks attempting to get the
 mp_lock, which evidently is permanently held by some process on cpu1.
 the lock count that is being held is usually 2, but sometimes only 1.

open question:
 is the other cpu running a process that is somehow dead-locked waiting
 for a resource, ie is the lock value of 2 really true? OR is the lock
 count hosed, and the process on cpu1 not really holding the lock?
reproducing the problem:

 although I have never seen this before, I can easily reproduce it by
 disabling the loprio code by changing TEST_LOPRIO to TEST_LOPRIO_NOT in
 smptests.h. The effect of this is to cause ALL INTs to be serviced by
 cpu0.

 apply this patch that Cyrus created:

------------------------------------- cut --------------------------------------
*** mplock.s.dist   Wed Dec  4 17:32:57 1996
--- mplock.s        Thu Apr  3 08:20:23 1997
***************
*** 71,79 ****
        movl    %eax, APIC_TPR(%ecx)    /* set it */
  #endif /** TEST_LOPRIO */
        ret
! 3:    cmpl    $0xffffffff, (%edx)     /* Wait for it to become free */
!       jne     3b
!       jmp     2b                      /* XXX 1b ? */
  
  /***********************************************************************
   *  int MPtrylock(unsigned int *lock)
--- 71,88 ----
        movl    %eax, APIC_TPR(%ecx)    /* set it */
  #endif /** TEST_LOPRIO */
        ret
! 3:    movl    $2000000000, %eax       /* Timer */
! 4:    decl    %eax
!       jnz     5f
!       pushl   (%edx)
!       pushl   $pstrin
!       movl    $0xffffffff, (%edx)     /* Let the panic grab a cpu for ddb */
!       call    _panic
! 5:    cmpl    $0xffffffff, (%edx)     /* Wait for it to become free */
!       jne     4b
!       jmp     1b                      /* XXX 1b ? */
! 
! pstrin:       .asciz "mplock: deadlock on %x"
  
  /***********************************************************************
   *  int MPtrylock(unsigned int *lock)
***************
*** 128,134 ****
        ret
  1:    movl    4(%esp), %edx           /* Get the address of the lock */
        movl    (%edx), %eax            /* - get the value */
!       movl    %eax,%ecx
        decl    %ecx                    /* - new count is one less */
        testl   $0x00ffffff, %ecx       /* - Unless it's zero... */
        jnz     2f
--- 137,149 ----
        ret
  1:    movl    4(%esp), %edx           /* Get the address of the lock */
        movl    (%edx), %eax            /* - get the value */
! 
!       cmpl    $0xffffffff, %eax       /* If it's free, we have a problem */
!       jne     3f
!       pushl   $rls_free
!       call    _panic
! 
! 3:    movl    %eax,%ecx
        decl    %ecx                    /* - new count is one less */
        testl   $0x00ffffff, %ecx       /* - Unless it's zero... */
        jnz     2f
***************
*** 146,151 ****
--- 161,169 ----
        cmpxchg %ecx, (%edx)            /* - try it atomically */
        jne     1b                      /* ...do not collect $200 */
        ret
+ 
+ rls_free:
+       .asciz "mplock: releasing free lock"
  
  /***********************************************************************
   *  void get_mplock()
------------------------------------- cut --------------------------------------

 start a kernel build, then open a file for edit in another window or
 otherwise busy the system. the machine locks, and the patch drops you
 out in 30 seconds to several minutes, be patient. when it does you see
 something like:

  'panic (cpu#0): mplock: deadlock on 1000001'
 or
  'panic (cpu#0): mplock: deadlock on 1000002',

 but mostly the latter.

by disabling the TEST_LOPRIO code we guarantee a high frequency of hits
where the cpu servicing the INT is NOT the one currently holding the lock.
The loprio code *ATTEMPTS* to steer the INT to the cpu holding the lock
(if any). BUT it will fail to do so a small percentage of the time since
it isn't an atomic operation with regard to what's happening on the other
cpu(s). I didn't consider this to be fatal, just an inefficiency that we
could live with. However that might not be the case....

With the loprio code in place this bug happens so seldom as to not affect
most systems, but it IS still lurking there on all APIC_IO systems, we
need to find it!!!

theories, testers, etc. all welcome!
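For readers picking apart the panic values, here is a C sketch of how a lock word such as 1000002 might be read, together with a bounded wait in the spirit of the patched spin loop. The constants 0xffffffff (free) and 0x00ffffff (count mask) come from the patch; reading the top byte as the owning cpu is an assumption, chosen because it agrees with the report of cpu1 holding a count of 2. The names mplock_decode() and mplock_wait_free() are hypothetical and are not part of mplock.s.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MPLOCK_FREE       0xffffffffu   /* value the patch compares against */
#define MPLOCK_COUNT_MASK 0x00ffffffu   /* mask used by the testl in the patch */

struct mplock_state {
    bool     is_free;
    unsigned cpu;       /* assumption: owner id kept in the top byte */
    unsigned count;     /* recursion count in the low 24 bits */
};

/* Split a raw lock word into owner and recursion count. */
struct mplock_state mplock_decode(uint32_t word)
{
    struct mplock_state s = { word == MPLOCK_FREE, 0, 0 };

    if (!s.is_free) {
        s.cpu   = word >> 24;
        s.count = word & MPLOCK_COUNT_MASK;
    }
    return s;
}

/*
 * Poll the lock word at most `budget` times. Returns true as soon as the
 * word reads as free, false if the budget runs out first (the point at
 * which the patched assembly calls panic instead of spinning forever).
 */
bool mplock_wait_free(_Atomic uint32_t *lock, uint32_t budget)
{
    assert(lock != NULL);

    while (budget-- > 0) {
        if (atomic_load_explicit(lock, memory_order_acquire) == MPLOCK_FREE)
            return true;
    }
    return false;
}
```

Under that reading, "deadlock on 1000002" decodes to cpu 1 with a recursion count of 2, and "deadlock on 1000001" to cpu 1 with a count of 1.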
--
Steve Passe   | powered by
smp@csn.net   | Symmetric MultiProcessor FreeBSD

From owner-freebsd-smp Sat Apr 5 21:34:19 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id VAA25411 for smp-outgoing; Sat, 5 Apr 1997 21:34:19 -0800 (PST)
Received: from mail.MCESTATE.COM (mail.MCESTATE.COM [207.211.200.50]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id VAA25378 for ; Sat, 5 Apr 1997 21:34:16 -0800 (PST)
Received: from localhost (vince@localhost) by mail.MCESTATE.COM (8.8.5/8.8.5) with SMTP id VAA07766 for ; Sat, 5 Apr 1997 21:34:09 -0800 (PST)
Date: Sat, 5 Apr 1997 21:34:08 -0800 (PST)
From: Vincent Poy
To: freebsd-smp@FreeBSD.ORG
Subject: Dual Pentium Motherboards
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Greetings everyone,

        We need to replace our crappy Dual P5-100 Pentium Motherboards
with something reliable. What are some good motherboards (with model
numbers) that would work reliably with the SMP stuff in FreeBSD?
Thanks.
Cheers, Vince - vince@MCESTATE.COM - vince@GAIANET.NET ________ __ ____ Unix Networking Operations - FreeBSD-Real Unix for Free / / / / | / |[__ ] GaiaNet Corporation - M & C Estate / / / / | / | __] ] Beverly Hills, California USA 90210 / / / / / |/ / | __] ] HongKong Stars/Gravis UltraSound Mailing Lists Admin /_/_/_/_/|___/|_|[____] From owner-freebsd-smp Sat Apr 5 22:15:36 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id WAA27064 for smp-outgoing; Sat, 5 Apr 1997 22:15:36 -0800 (PST) Received: from mail0.iij.ad.jp (mail0.iij.ad.jp [202.232.2.113]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id WAA27058 for ; Sat, 5 Apr 1997 22:15:34 -0800 (PST) Received: from uucp2.iij.ad.jp (uucp2.iij.ad.jp [202.232.2.202]) by mail0.iij.ad.jp (8.8.5+2.7Wbeta5/3.5Wpl4-MAIL) with SMTP id PAA06103; Sun, 6 Apr 1997 15:15:13 +0900 (JST) Received: (from uucp@localhost) by uucp2.iij.ad.jp (8.6.12+2.4W/3.3W9-UUCP) with UUCP id PAA11933; Sun, 6 Apr 1997 15:15:13 +0900 Received: from tyd1.tydfam.iijnet.or.jp (tyd1.tydfam.iijnet.or.jp [192.168.1.2]) by tydfam.iijnet.or.jp (8.8.5/3.4W2-uucp) with ESMTP id KAA13052; Sun, 6 Apr 1997 10:22:25 +0900 (JST) Received: from localhost.tydfam.iijnet.or.jp (localhost.tydfam.iijnet.or.jp [127.0.0.1]) by tyd1.tydfam.iijnet.or.jp (8.8.5/3.4Wnomx) with SMTP id KAA00429; Sun, 6 Apr 1997 10:22:24 +0900 (JST) Message-Id: <199704060122.KAA00429@tyd1.tydfam.iijnet.or.jp> X-Authentication-Warning: tyd1.tydfam.iijnet.or.jp: localhost.tydfam.iijnet.or.jp [127.0.0.1] didn't use HELO protocol To: peter@spinner.dialix.com Cc: dfr@nlsystems.com, smp@freebsd.org Subject: Re: Any chance of sync with current? 
Reply-To: ken@tydfam.iijnet.or.jp
In-Reply-To: Your message of "Sat, 05 Apr 1997 23:53:47 +0800"
References: <199704051553.XAA17993@spinner.DIALix.COM>
X-Mailer: Mew version 1.55 on Emacs 19.34.2, Mule 2.3
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Sun, 06 Apr 1997 10:22:24 +0900
From: Takeshi Yamada
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I would really appreciate it if it were merged into the -current tree,
which is now working without any significant problems, at least on my
machine (ASUS P/I-P6NP5).

And if you wait a month or so, the kernel part of the -current tree might
become messy again with the modular kernel work, which sounds to me as
if it is under discussion and very likely to happen.

Again, I eagerly ask you to consider merging it into the present -current
tree.

Regards,