From: "O. Hartmann"
Organization: Institut für Geophysik
Date: Mon, 17 Jan 2005 12:01:32 +0100
To: freebsd-smp@freebsd.org, freebsd-stable@freebsd.org
Subject: FreeBSD 5.3-SMP/IRQ problems (again)
Message-ID: <41EB9B0C.90305@uni-mainz.de>

Dear Sirs,

I have already reported some very strange behaviour of FreeBSD 5.3 on a suspicious hardware platform of mine. I would like to repeat that report now, in the hope that someone can offer me some help.
I am repeating this suspected bug because, given some very strange behaviour of FreeBSD 5.3 on this box, I do not really believe in a hardware fault. I hope you can follow my non-engineer English.

I have been using FreeBSD for about 8 years on several platforms, especially SMP platforms as they became supported by FreeBSD. These boxes were used as scientific servers and as high-performance desktop platforms, because prior to 5.X FreeBSD was as solid as a rock (with the exception of the early days of 4.0). Today I work on my private platform at my lab, because our computer center prefers Linux and I do not. My hardware may therefore seem a little 'ancient', but I think a lot of us still use such boxes for heavy-duty tasks.

All right, here are the facts.

Hardware:
  - Mainboard ASUS CUR-DLS
  - two 1GHz PIII
  - 2x 512MB ECC/reg
  - 3x SCSI U160 hard drives
  - 1x Intel 1000/Pro 64-bit PCI 1 GBit server NIC
  - DVD-RW NEC ND3500AG/2.18 attached to the ATA controller

Please keep in mind that this mobo has a built-in PCI VGA controller. The BIOS has been updated to the latest available release and, when that did not fix any of the problems, to the latest beta BIOS available for this mobo. On the summary screen shown after POST I noticed that the VGA controller is not assigned an IRQ (I suspect FreeBSD 5.3 has a lot of trouble with IRQ routing, but more on that later).

The first thing I find curious: even when booted into single-user mode for maintenance, the box remains unstable! After a while of heavy screen output (compiling a kernel, building world, running 'find' over the whole file system, paging through a file with more, or similar) the box, or at least the screen, freezes. I can then press Return and watch blank lines appear, but nothing else happens; the box seems to be stuck. The only rescue is a reboot.
This happens with every variant of booting into single-user mode: SMP enabled or disabled in the kernel, APIC enabled or disabled, ACPI enabled or disabled. The only way to get rid of it is to plug a separate PCI VGA card into another slot!! The machine then boots correctly into single-user mode, but it still dies immediately when booting into multi-user mode. With a UP kernel in multi-user mode, the box is very stable under the X11 GUI (Xorg or XFree86) using the built-in VGA.

The next serious problem is plugging in sound or VGA cards. Whether it works seems to depend heavily on which slot the card is plugged into. Different sound cards, different PCI VGA cards, same problem: FreeBSD 5.3 starts up and then freezes after initialising the SCSI controller. The same thing happens when I disable both serial ports! FreeBSD dies after initialising the SCSI controller.

The mobo has two 64-bit PCI slots and I use one of them for the Intel 1000/Pro GBit NIC. With the NIC in the currently chosen slot (the 64-bit slot nearest the CPUs), using a sound card is impossible. FreeBSD dies with every combination (w/ or w/o ACPI) I have tried.

SMP is impossible, w/ or w/o ACPI. FreeBSD 5.3 dies after a while of doing graphical output with the X11 GUI. Today I stressed the machine with buildworld, buildkernel and a build of OpenOffice 1.1.4, all with console output, and for 15 hours the box was stable as a rock. But switching to the X11 GUI and doing some Firefox work (simply surfing the WWW) made the system die within minutes. Sometimes I can 'feel' a crash coming: the box goes kind of 'calm', and I can switch to the console and sometimes catch the error message from the debugger, but sometimes not. This makes me suspect that the operating system is 'waiting' for something related to the second CPU or similar, and waiting forever. I'm not familiar with kernel programming and development, so my report may seem a bit 'strange'; sorry for that.

Another curiosity is booting the box in 'safe mode', where I think APIC, SMP and ACPI are all off.
The box becomes slow, graphical output 'hangs' for seconds, freezes and unfreezes, and the system remains more or less unusable.

The ASUS CUR-DLS mobo uses the RCC LE 3.0 Champion chipset. The IRQ problems (PCI cards cannot be used in any PCI slot) are similar to problems I had in the past with FreeBSD 5.0 on a TYAN Thunder 2500 box, where I was not able to plug the AMI RAID controller into any of the six 64-bit slots. FreeBSD got stuck at the same place: when the built-in SCSI controller (LSI Logic 1010-33 or 894-33) gets initialised. This makes me very curious.

I am now building a kernel with debug options and will try to catch a kernel dump. Maybe one of you is interested in that.

I attach the output of mptable -dmesg; I hope it is useful to you.

Oliver

===============================================================================

MPTable, version 2.0.15

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                     BIOS
  physical address:             0x000f5270
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.4
  checksum:                     0xe3
  mode:                         Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:             0x000f4e60
  signature:                    'PCMP'
  base table length:            276
  version:                      1.4
  checksum:                     0x12
  OEM ID:                       'OEM00000'
  Product ID:                   'PROD00000000'
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  26
  local APIC address:           0xfee00000
  extended table length:        124
  extended table checksum:      198

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:     APIC ID  Version  State        Family  Model  Step  Flags
                3        0x11     BSP, usable  6       8      6     0x387fbff
                0        0x11     AP, usable   6       8      6     0x387fbff
--
Bus:            Bus ID  Type
                0       PCI
                1       PCI
                2       ISA
--
I/O APICs:      APIC ID  Version  State   Address
                2        0x11     usable  0xfec00000
                3        0x11     usable  0xfec01000
--
I/O Ints:       Type    Polarity   Trigger   Bus ID  IRQ   APIC ID  PIN#
                ExtINT  conforms   conforms  2       0     2        0
                INT     conforms   conforms  2       1     2        1
                INT     conforms   conforms  2       0     2        2
                INT     conforms   conforms  2       3     2        3
                INT     conforms   conforms  2       4     2        4
                INT     conforms   conforms  2       6     2        6
                INT     conforms   conforms  2       7     2        7
                INT     conforms   conforms  2       8     2        8
                INT     conforms   conforms  2       12    2        12
                INT     conforms   conforms  2       13    2        13
                INT     conforms   conforms  2       14    2        14
                INT     conforms   conforms  2       15    2        15
                INT     active-lo  level     0       15:A  3        14
                INT     active-lo  level     2       9     2        9
                INT     active-lo  level     1       2:A   3        5
                INT     active-lo  level     1       5:A   3        8
                INT     active-lo  level     1       5:B   3        9
--
Local Ints:     Type    Polarity   Trigger   Bus ID  IRQ   APIC ID  PIN#
                ExtINT  active-hi  edge      2       0     255      0
                NMI     active-hi  edge      2       0     255      1

-------------------------------------------------------------------------------

MP Config Extended Table Entries:

--
System Address Space
 bus ID: 0 address type: I/O address
 address base: 0x0
 address range: 0x10000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0x40000000
 address range: 0xbebe0000
--
System Address Space
 bus ID: 0 address type: prefetch address
 address base: 0xfebe0000
 address range: 0xe9420000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xe8000000
 address range: 0x18000000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xa0000
 address range: 0x20000
--
Bus Heirarchy
 bus ID: 2 bus info: 0x01 parent bus ID: 0
--
Compatibility Bus Address
 bus ID: 0 address modifier: add
 predefined range: 0x00000000
--
Compatibility Bus Address
 bus ID: 0 address modifier: add
 predefined range: 0x00000001

-------------------------------------------------------------------------------

dmesg output:

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.3-STABLE #35: Sun Jan 16 17:27:11 UTC 2005
    root@edda.physik.uni-mainz.de:/usr/obj/usr/src/sys/EDDA
ACPI APIC Table:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel Pentium III (1000.04-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x686  Stepping = 6
  Features=0x387fbff
real memory  = 1073721344 (1023 MB)
avail memory = 1041166336 (992 MB)
ioapic0 irqs 0-15 on motherboard
ioapic1 irqs 16-31 on motherboard
netsmb_dev: loaded
npx0: [FAST]
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0
cpu0: on acpi0
acpi_button0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pci0: at device 7.0 (no driver attached)
isab0: port 0xe800-0xe80f at device 15.0 on pci0
isa0: on isab0
atapci0: port 0xd400-0xd40f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
ohci0: mem 0xfc000000-0xfc000fff irq 9 at device 15.2 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: on ohci0
usb0: USB revision 1.0
uhub0: (0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
ugen0: OmniVision OV511+ Camera, rev 1.00/1.00, addr 2
pcib1: on acpi0
pci1: on pcib1
em0: port 0xd000-0xd03f mem 0xfb800000-0xfb81ffff irq 21 at device 2.0 on pci1
em0: Ethernet address: 00:07:e9:14:8f:7b
em0: Speed:N/A Duplex:N/A
sym0: <1010-33> port 0xb800-0xb8ff mem 0xfa800000-0xfa801fff,0xfb000000-0xfb0003ff irq 24 at device 5.0 on pci1
sym0: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
sym0: handling phase mismatch from SCRIPTS.
sym0: [GIANT-LOCKED]
sym1: <1010-33> port 0xb400-0xb4ff mem 0xf9800000-0xf9801fff,0xfa000000-0xfa0003ff irq 25 at device 5.1 on pci1
sym1: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: open drain IRQ line driver, using on-chip SRAM
sym1: using LOAD/STORE-based firmware.
sym1: handling phase mismatch from SCRIPTS.
sym1: [GIANT-LOCKED]
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0: port 0x778-0x77a,0x378-0x37f irq 7 drq 3 on acpi0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: on ppc0
lpt0: on ppbus0
lpt0: Interrupt-driven port
fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
orm0: at iomem 0xc8000-0xcbfff,0xc0000-0xc7fff on isa0
pmtimer0 on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <8 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
fb0 at vga0
Timecounter "TSC" frequency 1000040215 Hz quality 800
Timecounters tick every 1.250 msec
Fast IPsec: Initialized Security Association Processing.
acd0: DVDR at ata0-master UDMA33
Waiting 3 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
(noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
da0 at sym0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da1 at sym0 bus 0 target 1 lun 0
da1: Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da2 at sym0 bus 0 target 2 lun 0
da2: Fixed Direct Access SCSI-3 device
da2: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da2: 17429MB (35694904 512 byte sectors: 255H 63S/T 2221C)
cd0 at ata0 bus 0 target 0 lun 0
cd0: <_NEC DVD_RW ND-3500AG 2.18> Removable CD-ROM SCSI-0 device
cd0: 33.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
GEOM_LABEL: Label for provider da0s1d is ufs/var.
GEOM_LABEL: Label for provider da0s1e is ufs/compat.
GEOM_LABEL: Label for provider da0s1f is ufs/src.
GEOM_LABEL: Label for provider da0s1g is ufs/usr.
GEOM_LABEL: Label for provider da0s1h is ufs/local.
GEOM_LABEL: Label for provider da1s1d is ufs/obj.
GEOM_LABEL: Label for provider da1s1e is ufs/ports.
GEOM_LABEL: Label for provider da1s1f is ufs/scratch.
GEOM_LABEL: Label for provider da1s1g is ufs/data.
Mounting root from ufs:/dev/da0s1a
GEOM_LABEL: Label for provider da0s1e is ufs/compat.
GEOM_LABEL: Label for provider da1s1g is ufs/data.
GEOM_LABEL: Label for provider da1s1d is ufs/obj.
GEOM_LABEL: Label for provider da0s1g is ufs/usr.
GEOM_LABEL: Label for provider da1s1e is ufs/ports.
GEOM_LABEL: Label for provider da1s1f is ufs/scratch.
GEOM_LABEL: Label for provider da0s1h is ufs/local.
GEOM_LABEL: Label for provider da0s1f is ufs/src.
GEOM_LABEL: Label for provider da0s1d is ufs/var.
pflog0: promiscuous mode enabled
em0: Link is up 100 Mbps Full Duplex

===============================================================================
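
The IRQ assignments reported in the dmesg output above can be tabulated mechanically, which may make it easier to compare boots with cards in different slots. The following is only a minimal Python sketch and not part of the original report: the function name irq_map, the regular expression, and the SAMPLE text (a few lines copied from the dmesg above) are assumptions made for illustration. It only looks at attach lines of the form "device: ... irq N ...", so devices that report no IRQ (such as vga0 here) do not appear in the result.

```python
import re
from collections import defaultdict

# Matches attach lines such as
#   "em0: port 0xd000-0xd03f mem ... irq 21 at device 2.0 on pci1"
# The lowercase "irq N" form is assumed; "ioapic0 irqs 0-15" and
# "open drain IRQ line driver" are deliberately not matched.
IRQ_LINE = re.compile(r"^(?P<dev>[a-z_]+\d+):.*?\birq (?P<irq>\d+)\b")


def irq_map(dmesg_text):
    """Return {irq: [device, ...]} for every dmesg line that names an IRQ."""
    table = defaultdict(list)
    for line in dmesg_text.splitlines():
        m = IRQ_LINE.match(line.strip())
        if m is None:
            continue
        irq = int(m.group("irq"))
        dev = m.group("dev")
        if dev not in table[irq]:
            table[irq].append(dev)
    return dict(table)


# A few lines taken from the dmesg output in this report.
SAMPLE = """\
ioapic0 irqs 0-15 on motherboard
ohci0: mem 0xfc000000-0xfc000fff irq 9 at device 15.2 on pci0
em0: port 0xd000-0xd03f mem 0xfb800000-0xfb81ffff irq 21 at device 2.0 on pci1
sym0: <1010-33> port 0xb800-0xb8ff mem 0xfa800000-0xfa801fff,0xfb000000-0xfb0003ff irq 24 at device 5.0 on pci1
sym0: open drain IRQ line driver, using on-chip SRAM
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
"""

if __name__ == "__main__":
    for irq, devs in sorted(irq_map(SAMPLE).items()):
        print(f"irq {irq:2d}: {', '.join(devs)}")
```

To use it on a real boot log, the saved dmesg text would be passed to irq_map() in place of SAMPLE; IRQs listed with more than one device are shared.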