From owner-freebsd-smp Fri Jan 1 13:14:55 1999
Return-Path:
Received: (from majordom@localhost)
        by hub.freebsd.org (8.8.8/8.8.8) id NAA24925
        for freebsd-smp-outgoing; Fri, 1 Jan 1999 13:14:55 -0800 (PST)
        (envelope-from owner-freebsd-smp@FreeBSD.ORG)
Received: from panzer.plutotech.com (panzer.plutotech.com [206.168.67.125])
        by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id NAA24906
        for ; Fri, 1 Jan 1999 13:14:52 -0800 (PST)
        (envelope-from ken@panzer.plutotech.com)
Received: (from ken@localhost)
        by panzer.plutotech.com (8.9.1/8.8.5) id OAA43959;
        Fri, 1 Jan 1999 14:14:26 -0700 (MST)
From: "Kenneth D. Merry"
Message-Id: <199901012114.OAA43959@panzer.plutotech.com>
Subject: Re: ASUS P65UP5 Dual PPro problems
In-Reply-To: from Guy Helmer at "Dec 31, 98 12:01:25 pm"
To: ghelmer@scl.ameslab.gov (Guy Helmer)
Date: Fri, 1 Jan 1999 14:14:26 -0700 (MST)
Cc: freebsd-smp@FreeBSD.ORG
X-Mailer: ELM [version 2.4ME+ PL28s (25)]
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=ELM915225266-43870-0_
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

--ELM915225266-43870-0_
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Guy Helmer wrote...
> We are having trouble with a bunch of ASUS P65UP5 machines with dual
> 200MHz PPro's; each machine is configured exactly the same, with 256MB
> RAM, a Tulip Fast Ethernet interface and IDE disk drive. Symptoms are
> that the machine will either freeze solid without any console message, or
> (according to top(1)) a process is running on CPU1 (and never changes from
> CPU1) but is not getting any CPU time (WCPU and CPU are both 0%). This
> seems to happen randomly, but usually when the processes are doing network
> communication.
>
> The kernel is FreeBSD SMP built from sources dated Nov 19 1998. The
> machine's BIOS is set to MP spec 1.4. The machines work fine under
> uniprocessor Linux 2.0.3x, but exhibit similar behavior with SMP Linux
> 2.0.3x or 2.1.x.
>
> I'm including a sample mptable(8) output below, in the hope that someone
> can help diagnose this. Our group has been working on this problem for
> some time and searching the net for info, but we have found nothing yet
> that would help. If dmesg output would be useful, please let me know (I
> neglected to capture the boot messages :-(, and now the machines are back
> into uniprocessor production use...

Well, FWIW, I have the same sort of machine (same motherboard and
processors), but running with -current from early December. I haven't had
any trouble. I've got SCSI disks, though, not IDE. I do have two DEC
Tulip based SMC cards, though.

One question I have, though, is what kind of RAM you have in the machine?
i.e., what configuration.

I tried putting 256MB in my machine, using 8 32MB (parity) SIMMs, but I
wasn't able to keep it like that. I got random NMIs with 8 SIMMs on board.
I reduced it to 6 SIMMs (192MB), and the NMIs stopped. I'm fairly certain
they weren't parity errors, since I've had bad memory on other machines and
FreeBSD would actually panic with a "ram parity error" NMI. The NMI panic
message I got with these errors didn't state a specific problem.

The SIMMs I have all have 24 chips on board, so they're within ASUS' stated
specs, but my guess is that I exceeded the load that the memory subsystem
could take. The NMIs generally only occurred under high memory load.

It certainly sounds suspicious that these machines fail under SMP with both
Linux and FreeBSD.

In case it helps, I've attached the output of mptable -verbose -dmesg from
my machine.

Ken
--
Kenneth Merry
ken@plutotech.com

--ELM915225266-43870-0_
Content-Type: text/plain; charset=ISO-8859-1
Content-Disposition: attachment; filename=panzer.mptable.010199
Content-Description: panzer.mptable.010199
Content-Transfer-Encoding: 7bit

===============================================================================

MPTable, version 2.0.15

 looking for EBDA pointer @ 0x040e, found, searching EBDA @ 0x0009fc00
 searching CMOS 'top of mem' @ 0x0009f800 (638K)
 searching default 'top of mem' @ 0x0009fc00 (639K)
 searching BIOS @ 0x000f0000

 MP FPS found in BIOS @ physical addr: 0x000f60b0

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                     BIOS
  physical address:             0x000f60b0
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.4
  checksum:                     0x8b
  mode:                         Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:             0x000f5caa
  signature:                    'PCMP'
  base table length:            292
  version:                      1.4
  checksum:                     0x4e
  OEM ID:                       'OEM00000'
  Product ID:                   'PROD00000000'
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  28
  local APIC address:           0xfee00000
  extended table length:        0
  extended table checksum:      0

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:     APIC ID  Version  State        Family  Model  Step  Flags
                 1       0x11     BSP, usable  6       1      7     0xfbff
                 0       0x11     AP, usable   6       1      7     0xfbff
--
Bus:            Bus ID  Type
                 0      PCI
                 1      PCI
                 2      ISA
--
I/O APICs:      APIC ID  Version  State   Address
                 2       0x11     usable  0xfec00000
--
I/O Ints:       Type    Polarity   Trigger   Bus ID  IRQ   APIC ID  PIN#
                ExtINT  conforms   conforms   2      0      2       0
                INT     conforms   conforms   2      1      2       1
                INT     conforms   conforms   2      0      2       2
                INT     conforms   conforms   2      3      2       3
                INT     conforms   conforms   2      4      2       4
                INT     conforms   conforms   2      5      2       5
                INT     conforms   conforms   2      6      2       6
                INT     conforms   conforms   2      7      2       7
                INT     conforms   conforms   2      8      2       8
                INT     conforms   conforms   2      9      2       9
                INT     conforms   conforms   2      10     2       10
                INT     conforms   conforms   2      11     2       11
                INT     conforms   conforms   2      12     2       12
                INT     conforms   conforms   2      15     2       15
                INT     active-lo  level      1      4:A    2       19
                INT     active-lo  level      1      5:A    2       16
                INT     active-lo  level      0      10:A   2       18
                INT     active-lo  level      0      12:A   2       16
                INT     active-lo  level      0      11:A   2       17
                INT     active-lo  level      0      13:A   2       19
--
Local Ints:     Type    Polarity   Trigger   Bus ID  IRQ   APIC ID  PIN#
                ExtINT  active-hi  edge       2      0      255     0
                NMI     active-hi  edge       2      0      255     1

-------------------------------------------------------------------------------

# SMP kernel config file options:

# Required:
options         SMP                     # Symmetric MultiProcessor Kernel
options         APIC_IO                 # Symmetric (APIC) I/O

# Optional (built-in defaults will work in most cases):
#options        NCPU=2                  # number of CPUs
#options        NBUS=3                  # number of busses
#options        NAPIC=1                 # number of IO APICs
#options        NINTR=24                # number of INTs

-------------------------------------------------------------------------------

dmesg output:

Copyright (c) 1992-1998 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 3.0-CURRENT #0: Tue Dec 8 19:05:13 MST 1998
    ken@panzer.plutotech.com:/usr/home/ken/perforce/cam/sys/compile/panzer
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium Pro (686-class CPU)
  Origin = "GenuineIntel"  Id = 0x617  Stepping=7
  Features=0xfbff
real memory  = 201326592 (196608K bytes)
avail memory = 192573440 (188060K bytes)
Programming 24 pins in IOAPIC #0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee00000
 cpu1 (AP):  apic id: 0, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec00000
DEVFS: ready for devices
Probing for devices on PCI bus 0:
chip0: rev 0x02 on pci0.0.0
chip1: rev 0x01 on pci0.1.0
chip2: rev 0x02 on pci0.9.0
de0: rev 0x11 int a irq 18 on pci0.10.0
de0: SMC 9332DST 21140 [10-100Mb/s] pass 1.1
de0: address 00:00:c0:5c:d2:be
de0: enabling 10baseT port
bktr0: rev 0x11 int a irq 17 on pci0.11.0
bti2c0: iicbb0 on bti2c0
iicbus0 on iicbb0 master-only
smbus0 on bti2c0
smb0: on smbus0 addr 0x92
Hauppauge WinCast/TV, Philips NTSC tuner, dbx stereo.
de1: rev 0x12 int a irq 16 on pci0.12.0
de1: SMC 9332DST 21140 [10-100Mb/s] pass 1.2
de1: address 00:00:c0:53:3d:e7
de1: enabling 10baseT port
vga0: rev 0x01 int a irq 19 on pci0.13.0
Probing for devices on PCI bus 1:
ahc0: rev 0x00 int a irq 19 on pci1.4.0
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc1: rev 0x00 int a irq 16 on pci1.5.0
ahc1: aic7880 Wide Channel B, SCSI Id=7, 16/255 SCBs
Probing for PnP devices:
CSN 1 Vendor ID: GRV0001 [0x0100561e] Serial 0x00000001 Comp ID: PNPb02f [0x2fb0d041]
mss_attach 1 at 0x328 irq 11 dma 6:5 flags 0x15
pcm1 (GusPnP sn 0x00000001) at 0x328-0x32f irq 11 drq 6 flags 0x15 on isa
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
sio0 at 0x3f8-0x3ff irq 4 flags 0x30 on isa
sio0: type 16550A, console
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
lpt0 at 0x378-0x37f irq 7 on isa
lpt0: Interrupt-driven port
lp0: TCP/IP capable interface
psm0 at 0x60-0x64 irq 12 on motherboard
psm0: model Generic PS/2 mouse, device ID 0
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
pcm0 not found
npx0 on motherboard
npx0: INT 16 interface
DEVFS: ready to run
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via pin 2
IP packet filtering initialized, divert enabled, rule-based forwarding disabled, logging limited to 100 packets/entry
Waiting 2 seconds for SCSI devices to settle
SMP: AP CPU #1 Launched!
Sending WDTR!
(probe16:ahc1:0:1:0): Sending SDTR!!
da0 at ahc0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI2 device
da0: 40.0MB/s transfers (20.0MHz, offset 8, 16bit), Tagged Queueing Enabled
da0: 4341MB (8890760 512 byte sectors: 255H 63S/T 553C)
da1 at ahc1 bus 0 target 1 lun 0
da1: Fixed Direct Access SCSI3 device
da1: 40.0MB/s transfers (20.0MHz, offset 8, 16bit), Tagged Queueing Enabled
da1: 8705MB (17829870 512 byte sectors: 255H 63S/T 1109C)
changing root device to da0s2a
ffs_mountfs: superblock updated for soft updates
ffs_mountfs: superblock updated for soft updates
ffs_mountfs: superblock updated for soft updates
cd0 at ahc1 bus 0 target 4 lun 0
cd0: Removable CD-ROM SCSI2 device
cd0: 10.0MB/s transfers (10.0MHz, offset 15)
cd0: cd present [213385 x 2048 byte records]
pid 250 (Xaccel): trap 12 with interrupts disabled
(da0:ahc0:0:0:0): tagged openings now 31
(da0:ahc0:0:0:0): tagged openings now 30
(da0:ahc0:0:0:0): tagged openings now 29
(da0:ahc0:0:0:0): tagged openings now 28
(da0:ahc0:0:0:0): tagged openings now 27
(da0:ahc0:0:0:0): tagged openings now 26
(da0:ahc0:0:0:0): tagged openings now 25
(da0:ahc0:0:0:0): tagged openings now 24
pid 2347 (Xaccel): trap 12 with interrupts disabled
(da1:ahc1:0:1:0): tagged openings now 64
pid 65091 (communicator-4.5), uid 1000: exited on signal 10 (core dumped)
pid 65981 (Xaccel): trap 12 with interrupts disabled

===============================================================================

--ELM915225266-43870-0_--

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message
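
A reader's note on the attached mptable output, not part of the original
message: several PCI interrupts in the "I/O Ints" table are routed to the
same I/O APIC pin. The Python sketch below is only an illustration of how
one might list them. The tuples are copied by hand from the six active-lo,
level-triggered PCI rows of that table, and the names PCI_INTS and
shared_pins are hypothetical, chosen for this example.

```python
from collections import defaultdict

# PCI rows copied by hand from the "I/O Ints" table in the mptable output:
# (bus ID, "device:pin" as printed in the IRQ column, I/O APIC pin number).
PCI_INTS = [
    (1, "4:A", 19),
    (1, "5:A", 16),
    (0, "10:A", 18),
    (0, "12:A", 16),
    (0, "11:A", 17),
    (0, "13:A", 19),
]


def shared_pins(rows):
    """Return {apic_pin: [sources]} for pins fed by more than one source."""
    by_pin = defaultdict(list)
    for bus, dev_pin, apic_pin in rows:
        by_pin[apic_pin].append(f"pci{bus} dev {dev_pin}")
    # Keep only the pins that have two or more interrupt sources.
    return {pin: srcs for pin, srcs in by_pin.items() if len(srcs) > 1}


if __name__ == "__main__":
    for pin, srcs in sorted(shared_pins(PCI_INTS).items()):
        print(f"APIC pin {pin}: {', '.join(srcs)}")
```

Run as a script, this reports pins 16 and 19 as shared, which agrees with
the dmesg lines showing de1 and ahc1 on irq 16, and vga0 and ahc0 on irq 19.
Sharing is normal for level-triggered PCI interrupts, so it is not by itself
a sign of a fault.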