From owner-freebsd-smp@FreeBSD.ORG Mon Jan 17 11:01:43 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C9BF116A568; Mon, 17 Jan 2005 11:01:43 +0000 (GMT) Received: from mailgate1.zdv.Uni-Mainz.DE (mailgate1.zdv.Uni-Mainz.DE [134.93.178.129]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5AE0843D1F; Mon, 17 Jan 2005 11:01:42 +0000 (GMT) (envelope-from ohartman@uni-mainz.de) Received: from [134.93.180.218] (edda.Physik.Uni-Mainz.DE [134.93.180.218]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mailgate1.zdv.Uni-Mainz.DE (Postfix) with ESMTP id 318D830008EB; Mon, 17 Jan 2005 12:01:40 +0100 (CET) Message-ID: <41EB9B0C.90305@uni-mainz.de> Date: Mon, 17 Jan 2005 12:01:32 +0100 From: "O. Hartmann" Organization: Institut =?ISO-8859-15?Q?f=FCr_Geophysik?= User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; de-AT; rv:1.7.5) Gecko/20050102 X-Accept-Language: de-de, en MIME-Version: 1.0 To: freebsd-smp@freebsd.org, freebsd-stable@freebsd.org Content-Type: multipart/mixed; boundary="------------000308020804060806030307" X-Virus-Scanned: by amavisd-new at uni-mainz.de Subject: FreeBSD 5.3-SMP/IRQ problems (again) X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Jan 2005 11:01:44 -0000 This is a multi-part message in MIME format. --------------000308020804060806030307 Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit Dear Sirs. I reported very strange behaviours of FreeBSD 5.3 on a suspicous hardware platform of mine and now I would like to repeat this and hope someone can offer me some help. 
The reason I am repeating this suspected bug report is that I do not really believe in a hardware fault, given some very strange behaviour of FreeBSD 5.3 on this box. So, I hope you can follow my non-engineer English.

I have used FreeBSD for about 8 years now on several platforms, especially SMP platforms as they became available for FreeBSD. These boxes were used for scientific server services and as high-performance desktop platforms, since FreeBSD prior to 5.X was as solid as a rock (with the exception of the start of 4.0). Today I work with my private platform at my lab because our computer center prefers Linux and I do not. So my hardware platform may seem a little bit 'ancient', but I think a lot of us still use these boxes for heavy-duty tasks.

All right, here are the facts.

Hardware: ASUS CUR-DLS mainboard, two 1 GHz PIII, 2x 512 MB ECC/reg, 3x SCSI U160 hard drives, 1x Intel 1000/Pro 64-bit PCI 1 GBit server NIC, DVD-RW NEC ND3500AG/2.18 attached to the ATA controller. Please keep in mind that this mobo has a built-in PCI VGA controller. The BIOS has been updated to the latest available release, and since that did not fix any problem I updated to the latest BETA BIOS available for this mobo. After POST I get a summary screen, and I noticed that the VGA controller is not assigned an IRQ (and I suspect FreeBSD 5.3 has a lot of trouble with IRQ routing, but I will report more on that later).

The first thing I find curious: booting the machine in single-user mode for maintenance purposes leaves the box buggy! After a while of heavy screen output, as produced by compiling a kernel, building world, 'find'-ing all files in the file system, or showing the contents of a file via more or similar, the box/the screen freezes. I can then press 'return' and watch blank lines fill in, but nothing more happens; the box seems to be stuck. The only rescue is a reboot.
This happens with all variants of booting into single-user mode: SMP enabled/disabled in the kernel, APIC enabled/disabled, ACPI enabled/disabled. The only way to get rid of this is to plug a separate PCI VGA card into another slot!! Then the machine boots correctly into single-user mode, but dies immediately when booting into multi-user mode anyway. With a UP kernel and multi-user mode, this box is very stable with the X11 GUI (Xorg or XFree86), using the built-in VGA.

The next harsh problem is plugging in sound or VGA cards. It seems to be highly dependent on which slot such cards get plugged into. Different sound cards, different PCI VGA cards: same problem. FreeBSD 5.3 starts and then freezes after init of the SCSI controller. The same thing happens when disabling both serial ports! FreeBSD dies after init of the SCSI controller.

The mobo has two 64-bit PCI slots and I use one of them for the Intel 1000/Pro GBit NIC. With the currently chosen slot (the 64-bit slot nearest to the CPUs), using a sound card is impossible. FreeBSD dies on every combination (w/ or w/o ACPI) I tried.

SMP is impossible, w/ or w/o ACPI. FreeBSD 5.3 dies after a while of doing graphical output with the X11 GUI. Today I stressed the machine with buildworld, buildkernel, and building OpenOffice 1.1.4 with console output, and for 15 hours the box was stable as a rock. But switching to the X11 GUI and doing some Firefox jobs (simply surfing the WWW) made the system die within minutes. Sometimes I can 'feel' when a crash is coming: the box gets kind of 'calm', and I can switch to the console, sometimes catching the error message from the debugger, but sometimes not. This makes me suspect the operating system is 'waiting' for something related to the second CPU or similar, and waiting forever. I'm not familiar with kernel programming and development, so my report may seem a bit 'strange', sorry for that.

Another curiosity is booting the box in 'safe mode'; I think APIC is off, SMP is off and ACPI is off.
The box becomes slow; graphical output 'hangs' for seconds, freezes and unfreezes, and the system remains kind of unusable.

The mobo in use, the ASUS CUR-DLS, uses the RCC LE 3.0 champion chipset. The IRQ problems (using PCI cards in any PCI slot is impossible) are similar to problems I had in the past with FreeBSD 5.0 on a TYAN Thunder 2500 box, where I was not able to plug the AMI RAID controller into any of the six 64-bit slots. FreeBSD got stuck at the same place: when the built-in SCSI controller (LSI Logic 1010-33 or 894-33) gets initialised. This makes me very curious.

I am now building a kernel with debug options and I will try to catch a kernel dump. Maybe one of you is interested in that.

I will attach mptable and dmesg output; I hope it is useful to you.

Oliver

--------------000308020804060806030307
Content-Type: text/plain; name="mptable"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="mptable"

===============================================================================

MPTable, version 2.0.15

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                     BIOS
  physical address:             0x000f5270
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.4
  checksum:                     0xe3
  mode:                         Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:             0x000f4e60
  signature:                    'PCMP'
  base table length:            276
  version:                      1.4
  checksum:                     0x12
  OEM ID:                       'OEM00000'
  Product ID:                   'PROD00000000'
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  26
  local APIC address:           0xfee00000
  extended table length:        124
  extended table checksum:      198

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:     APIC ID  Version  State        Family  Model  Step  Flags
                      3  0x11     BSP, usable       6      8     6  0x387fbff
                      0  0x11     AP, usable        6      8     6  0x387fbff
--
Bus:            Bus ID  Type
                     0  PCI
                     1  PCI
                     2  ISA
--
I/O APICs:      APIC ID  Version  State   Address
                      2  0x11     usable  0xfec00000
                      3  0x11     usable  0xfec01000
--
I/O Ints:       Type    Polarity    Trigger     Bus ID   IRQ    APIC ID PIN#
                ExtINT  conforms    conforms         2     0          2    0
                INT     conforms    conforms         2     1          2    1
                INT     conforms    conforms         2     0          2    2
                INT     conforms    conforms         2     3          2    3
                INT     conforms    conforms         2     4          2    4
                INT     conforms    conforms         2     6          2    6
                INT     conforms    conforms         2     7          2    7
                INT     conforms    conforms         2     8          2    8
                INT     conforms    conforms         2    12          2   12
                INT     conforms    conforms         2    13          2   13
                INT     conforms    conforms         2    14          2   14
                INT     conforms    conforms         2    15          2   15
                INT     active-lo   level            0  15:A          3   14
                INT     active-lo   level            2     9          2    9
                INT     active-lo   level            1   2:A          3    5
                INT     active-lo   level            1   5:A          3    8
                INT     active-lo   level            1   5:B          3    9
--
Local Ints:     Type    Polarity    Trigger     Bus ID   IRQ    APIC ID PIN#
                ExtINT  active-hi   edge             2     0        255    0
                NMI     active-hi   edge             2     0        255    1

-------------------------------------------------------------------------------

MP Config Extended Table Entries:

--
System Address Space
 bus ID: 0 address type: I/O address
 address base: 0x0
 address range: 0x10000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0x40000000
 address range: 0xbebe0000
--
System Address Space
 bus ID: 0 address type: prefetch address
 address base: 0xfebe0000
 address range: 0xe9420000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xe8000000
 address range: 0x18000000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xa0000
 address range: 0x20000
--
Bus Heirarchy
 bus ID: 2 bus info: 0x01 parent bus ID: 0
--
Compatibility Bus Address
 bus ID: 0 address modifier: add
 predefined range: 0x00000000
--
Compatibility Bus Address
 bus ID: 0 address modifier: add
 predefined range: 0x00000001

-------------------------------------------------------------------------------

dmesg output:

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.3-STABLE #35: Sun Jan 16 17:27:11 UTC 2005 root@edda.physik.uni-mainz.de:/usr/obj/usr/src/sys/EDDA ACPI APIC Table: Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel Pentium III (1000.04-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0x686 Stepping = 6 Features=0x387fbff real memory = 1073721344 (1023 MB) avail memory = 1041166336 (992 MB) ioapic0 irqs 0-15 on motherboard ioapic1 irqs 16-31 on motherboard netsmb_dev: loaded npx0: [FAST] npx0: on motherboard npx0: INT 16 interface acpi0: on motherboard acpi0: Power Button (fixed) Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000 acpi_timer0: <32-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0 cpu0: on acpi0 acpi_button0: on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pci0: at device 7.0 (no driver attached) isab0: port 0xe800-0xe80f at device 15.0 on pci0 isa0: on isab0 atapci0: port 0xd400-0xd40f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.1 on pci0 ata0: channel #0 on atapci0 ata1: channel #1 on atapci0 ohci0: mem 0xfc000000-0xfc000fff irq 9 at device 15.2 on pci0 ohci0: [GIANT-LOCKED] usb0: OHCI version 1.0, legacy support usb0: on ohci0 usb0: USB revision 1.0 uhub0: (0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 4 ports with 4 removable, self powered ugen0: OmniVision OV511+ Camera, rev 1.00/1.00, addr 2 pcib1: on acpi0 pci1: on pcib1 em0: port 0xd000-0xd03f mem 0xfb800000-0xfb81ffff irq 21 at device 2.0 on pci1 em0: Ethernet address: 00:07:e9:14:8f:7b em0: Speed:N/A Duplex:N/A sym0: <1010-33> port 0xb800-0xb8ff mem 0xfa800000-0xfa801fff,0xfb000000-0xfb0003ff irq 24 at device 5.0 on pci1 sym0: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking sym0: open drain IRQ line driver, using on-chip SRAM sym0: using LOAD/STORE-based firmware. sym0: handling phase mismatch from SCRIPTS. 
sym0: [GIANT-LOCKED] sym1: <1010-33> port 0xb400-0xb4ff mem 0xf9800000-0xf9801fff,0xfa000000-0xfa0003ff irq 25 at device 5.1 on pci1 sym1: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking sym1: open drain IRQ line driver, using on-chip SRAM sym1: using LOAD/STORE-based firmware. sym1: handling phase mismatch from SCRIPTS. sym1: [GIANT-LOCKED] atkbdc0: port 0x64,0x60 irq 1 on acpi0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] psm0: irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: model IntelliMouse, device ID 3 sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 sio0: type 16550A sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0 sio1: type 16550A ppc0: port 0x778-0x77a,0x378-0x37f irq 7 drq 3 on acpi0 ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode ppc0: FIFO with 16/16/8 bytes threshold ppbus0: on ppc0 lpt0: on ppbus0 lpt0: Interrupt-driven port fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0 fdc0: [FAST] fd0: <1440-KB 3.5" drive> on fdc0 drive 0 orm0: at iomem 0xc8000-0xcbfff,0xc0000-0xc7fff on isa0 pmtimer0 on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <8 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 fb0 at vga0 Timecounter "TSC" frequency 1000040215 Hz quality 800 Timecounters tick every 1.250 msec Fast IPsec: Initialized Security Association Processing. acd0: DVDR at ata0-master UDMA33 Waiting 3 seconds for SCSI devices to settle (noperiph:sym0:0:-1:-1): SCSI BUS reset delivered. (noperiph:sym1:0:-1:-1): SCSI BUS reset delivered. 
da0 at sym0 bus 0 target 0 lun 0 da0: Fixed Direct Access SCSI-3 device da0: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C) da1 at sym0 bus 0 target 1 lun 0 da1: Fixed Direct Access SCSI-3 device da1: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C) da2 at sym0 bus 0 target 2 lun 0 da2: Fixed Direct Access SCSI-3 device da2: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled da2: 17429MB (35694904 512 byte sectors: 255H 63S/T 2221C) cd0 at ata0 bus 0 target 0 lun 0 cd0: <_NEC DVD_RW ND-3500AG 2.18> Removable CD-ROM SCSI-0 device cd0: 33.000MB/s transfers cd0: Attempt to query device size failed: NOT READY, Medium not present GEOM_LABEL: Label for provider da0s1d is ufs/var. GEOM_LABEL: Label for provider da0s1e is ufs/compat. GEOM_LABEL: Label for provider da0s1f is ufs/src. GEOM_LABEL: Label for provider da0s1g is ufs/usr. GEOM_LABEL: Label for provider da0s1h is ufs/local. GEOM_LABEL: Label for provider da1s1d is ufs/obj. GEOM_LABEL: Label for provider da1s1e is ufs/ports. GEOM_LABEL: Label for provider da1s1f is ufs/scratch. GEOM_LABEL: Label for provider da1s1g is ufs/data. Mounting root from ufs:/dev/da0s1a GEOM_LABEL: Label for provider da0s1e is ufs/compat. GEOM_LABEL: Label for provider da1s1g is ufs/data. GEOM_LABEL: Label for provider da1s1d is ufs/obj. GEOM_LABEL: Label for provider da0s1g is ufs/usr. GEOM_LABEL: Label for provider da1s1e is ufs/ports. GEOM_LABEL: Label for provider da1s1f is ufs/scratch. GEOM_LABEL: Label for provider da0s1h is ufs/local. GEOM_LABEL: Label for provider da0s1f is ufs/src. GEOM_LABEL: Label for provider da0s1d is ufs/var. 
pflog0: promiscuous mode enabled em0: Link is up 100 Mbps Full Duplex =============================================================================== --------------000308020804060806030307-- From owner-freebsd-smp@FreeBSD.ORG Mon Jan 17 16:52:50 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0EC5916A4CE; Mon, 17 Jan 2005 16:52:50 +0000 (GMT) Received: from mail.foolishgames.com (mail.foolishgames.com [216.55.178.45]) by mx1.FreeBSD.org (Postfix) with ESMTP id C07A443D3F; Mon, 17 Jan 2005 16:52:49 +0000 (GMT) (envelope-from luke@foolishgames.com) Received: from [192.168.0.49] (24.247.120.6.kzo.mi.chartermi.net [24.247.120.6]) (authenticated bits=0)j0HHuKkh008764 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=NO); Mon, 17 Jan 2005 09:56:21 -0800 (PST) (envelope-from luke@foolishgames.com) X-Authentication-Warning: mail.foolishgames.com: Host 24.247.120.6.kzo.mi.chartermi.net [24.247.120.6] claimed to be [192.168.0.49] Message-Id: <71F0D3ED-689E-11D9-BF7D-000A95EFF4CA@foolishgames.com> X-Habeas-Swe-6: email in exchange for a license for this Habeas X-Habeas-Swe-3: like Habeas SWE (tm) Date: Mon, 17 Jan 2005 10:42:50 -0500 X-Habeas-Swe-8: Message (HCM) and not spam. Please report use of this From: Lucas Holt X-Habeas-Swe-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-Swe-2: brightly anticipated In-Reply-To: <41EB9B0C.90305@uni-mainz.de> References: <41EB9B0C.90305@uni-mainz.de> To: "O. Hartmann" X-Habeas-Swe-7: warrant mark warrants that this is a Habeas Compliant Mime-Version: 1.0 (Apple Message framework v619) X-Habeas-Swe-4: Copyright 2002 Habeas (tm) Content-Type: text/plain; charset=US-ASCII; format=flowed X-Habeas-Swe-1: winter into spring Content-Transfer-Encoding: 7bit X-Habeas-Swe-9: mark in spam to . 
X-Mailer: Apple Mail (2.619) cc: freebsd-smp@freebsd.org cc: freebsd-stable@freebsd.org Subject: RE: FreeBSD 5.3-SMP/IRQ problems (again) X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Jan 2005 16:52:50 -0000

On Jan 17, 2005, at 6:01 AM, O. Hartmann wrote:

> Dear Sirs.
>
> I reported very strange behaviours of FreeBSD 5.3 on a suspicious
> hardware platform of mine and now I would like to repeat this and hope
> someone can offer me some help. The reason why I repeat this suspected
> bug is because I do not really believe in a hardware fault due to some
> very strange behaviours of FreeBSD 5.3 on this box.
>

After reading your post, I find it rather odd that you do not suspect hardware problems, or at least compatibility problems with your hardware. I have a Dell Precision 650 dual Xeon 2.0 GHz system with an onboard version of your Intel NIC. The network driver has known issues and takes a very long time to negotiate the speed at which it operates. Most of the time my system is almost completely booted before the NIC interface is up. Aside from that issue, my system behaves normally and seems very stable. I am using FreeBSD 5.3 Release P5.

I've had problems with several ASUS motherboards in the past. I used to love them, but after I built a 733 MHz P3 system for a friend, I soon realized that their BIOSes were not good. I believe the decline occurred when they switched to "soft BIOS" instead of jumpers. (I miss jumpers!) Seriously, the system had problems with video and sound cards in WINDOWS depending on the slot they were placed in. It's the same problem you describe, only in Windows, on a mobo that's probably from about the same time period. The 733 was one of the lowest chips that would go in the board at the time.
In addition, I've had problems with FreeBSD 5.2.1 with a patch level greater than, I think, 4 but below 5.3 release on an ASUS nForce2 AMD Athlon XP 2000+ based system.

My point is that not all FreeBSD SMP systems have problems, so it's not a global fault in the OS as your post may suggest. (Note: I have hyperthreading disabled on my system.) Also, ASUS boards are known to have problems. I asked about my reboot problem on freebsd-questions at the time and I was told many horror stories about ASUS boards and non-Windows/Linux OSes.

Finally, I'd like to ask why you are running stable? Did someone suggest that you upgrade to stable to try to resolve an issue? Stable occasionally has new features or changes that are not tested thoroughly. It might be more reliable with 5.3 Release P(whatever).

I'll close with a summary of the hardware in my system, which is known to work in 5.3 release.

Dual Xeon 2.0 GHz (dual CPU working great, no HTT)
1 GB ECC memory
Intel gigabit NIC onboard using em0 driver (slow to negotiate speed, fine afterward unless I'm stress testing)
Creative Sound Blaster Audigy Gamer
ATI AIW Radeon 9600 XT (2D acceleration only since DRI isn't supported on this one) AGP 8x
ATA100 IDE disks using onboard IDE controller
DVD reader, CD burner
FireWire 400, USB 2/1.1 ports

I have the onboard sound, onboard LSI SCSI Ultra 320, serial ports, and parallel port disabled to make Windows happy. :)

Also, I have Gentoo installed and quite frankly it's very slow! It's running on the 2.6.9 Linux kernel w/ SMP enabled.
If only I didn't need it for my CS class, i'd be running 6 current too :) (i'm not subscribed to the stable mailing list, but am on SMP) Lucas Holt Luke@FoolishGames.com ________________________________________________________ FoolishGames.com (Jewel Fan Site) JustJournal.com (Free blogging) FoolishGames.net (Enemy Territory IoM site) From owner-freebsd-smp@FreeBSD.ORG Mon Jan 17 18:05:46 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C413216A4CE for ; Mon, 17 Jan 2005 18:05:46 +0000 (GMT) Received: from mail.whitecube.ch (ns3.whoswe.ch [62.2.109.243]) by mx1.FreeBSD.org (Postfix) with ESMTP id D269343D2F for ; Mon, 17 Jan 2005 18:05:40 +0000 (GMT) (envelope-from info@bluemambo.ch) Received: from server1.privat.bluemambo (server1.privat.bluemambo [192.168.1.201]) by mail.whitecube.ch (Postfix) with SMTP id 517C31DE for ; Mon, 17 Jan 2005 19:05:38 +0100 (CET) Received: FROM mail.whitecube.ch BY server1.privat.bluemambo ; Mon Jan 17 19:05:54 2005 +0100 Received: from wswhoswe (62-2-109-246.webcom.cablecom.ch [62.2.109.246]) by mail.whitecube.ch (Postfix) with ESMTP id D599B1DD for ; Mon, 17 Jan 2005 19:05:32 +0100 (CET) From: "Blue Mambo" To: Date: Mon, 17 Jan 2005 19:05:35 +0100 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Mailer: Microsoft Office Outlook, Build 11.0.6353 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 thread-index: AcT8tRx2acKsraG4TnWRCRM/FRvljwACTNAg In-Reply-To: <71F0D3ED-689E-11D9-BF7D-000A95EFF4CA@foolishgames.com> Message-Id: <20050117180532.D599B1DD@mail.whitecube.ch> Subject: AW: FreeBSD 5.3-SMP/IRQ problems (again) X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Jan 2005 
18:05:46 -0000

Hello

I've got a dual Xeon server (2x 2.4 GHz) with 2 GB RAM and an Adaptec 29160N, all on an ASUS server board.

With FreeBSD 4.11 Stable it doesn't work anymore. When the kernel tries to start up the SCSI device, it crashes.

I read on the mailing lists that some other people have the same problem. One thing that may help is disabling the USB support of the OS, but in my case the problem is still there.

I've also disabled the HT support of the mainboard, but it still doesn't work.

Maybe someone could help me...

Nice evening
Andreas

_______________________________________________
freebsd-smp@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-smp
To unsubscribe, send any mail to "freebsd-smp-unsubscribe@freebsd.org"

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 18 02:48:07 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 71B3D16A4E4 for ; Tue, 18 Jan 2005 02:48:02 +0000 (GMT) Received: from mail13.speakeasy.net (mail21.sea5.speakeasy.net [69.17.117.23]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1106843D2F for ; Tue, 18 Jan 2005 02:48:02 +0000 (GMT) (envelope-from jhb@FreeBSD.org) Received: (qmail 32482 invoked from network); 18 Jan 2005 02:48:01 -0000 Received: from dsl027-160-063.atl1.dsl.speakeasy.net (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender ) encrypted SMTP for ; 18 Jan 2005 02:48:01 -0000 Received: from slimer.baldwin.cx (slimer.baldwin.cx [192.168.0.16])
(authenticated bits=0) by server.baldwin.cx (8.13.1/8.13.1) with ESMTP id j0I2lqaI092340; Mon, 17 Jan 2005 21:47:58 -0500 (EST) (envelope-from jhb@FreeBSD.org) From: John Baldwin To: freebsd-smp@FreeBSD.org Date: Mon, 17 Jan 2005 21:30:02 -0500 User-Agent: KMail/1.6.2 References: <200501141418.18587.jhb@FreeBSD.org> <200501151103.30642.pvtrifonov@mail.ru> In-Reply-To: <200501151103.30642.pvtrifonov@mail.ru> MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit Message-Id: <200501172130.02407.jhb@FreeBSD.org> X-Spam-Status: No, score=-2.8 required=4.2 tests=ALL_TRUSTED autolearn=failed version=3.0.2 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on server.baldwin.cx cc: Peter Trifonov cc: kris@obsecurity.org Subject: Re: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Jan 2005 02:48:07 -0000 On Saturday 15 January 2005 03:03 am, Peter Trifonov wrote: > Hello John, > On Friday 14 January 2005 22:18, John Baldwin wrote: > > Among those bug reports the followup submitted by cguthrie@clubphoto.co > (http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/40274) > looks like the most close one to my situation. > > > > > I've gone ahead and committed the fix for the MPTable global > > > > entries btw. I don't think there is a routing or edge/level > > > > problem though because the devices do work until you do a > > > > ping flood. One thing we can try is that Linux has a > > > > > > IMPORTANT: I can do flood ping over either of them without any problems > > > (at least, if the system is booted with -p -v, I don't know why). > > > They break down ONLY if flood ping is SIMULTANEOUSLY performed over > > > both of them. 
> > Another observation: doing simultaneous flood ping over xl0 AND xl1, xl0 > AND xl2 also causes xl1 or xl2 respectively (but not both of them) to say > "watchdog timeout". In both cases they can be fixed by doing > ifconfig xl1 down > ifconfig xl2 down > ifconfig xl1 up > ifconfig xl2 up > i.e. even if flood ping has not been done over xl2, it still has to be > brought down& up. > > xl0 works fine in all cases. > flood ping over just one interface (either of them) always works fine. > > > More interrupt load that way, which would indicate maybe the bug Linux > > tries to work around except that your intpins are edge triggered. :( > > Just a guess: > Maybe also there is some kind of race condition in the interrupt handling > system, so that if too many interrupts are coming from different sources, > some of them are not properly processed? However, this should be somehow > related to IRQ sharing. > > > I've included a little test program below that you can run as root to do > > arbitrary port reads (inb). Please compile it and mail me the output of: > > > > inb 0x4d0 > > inb 0x4d1 > > Here is what it says: > # ./inb 0x4d0 > inb(0x4d0) = 0x0 = 0d = '^@' > # ./inb 0x4d1 > inb(0x4d1) = 0xe = 14d = '^N' Ok, this is good, it means you do have an ELCR. Let me give you a quick patch to try. This will be relative to your existing mptable.c file since i've committed the first mptable patch to current already. --- //depot/vendor/freebsd/src/sys/i386/i386/io_apic.c 2004/08/02 15:35:28 +++ //depot/user/jhb/acpipci/i386/i386/io_apic.c 2005/01/18 02:26:39 @@ -423,7 +423,7 @@ * them to be set to active low. * * XXX: Should we write to the ELCR if the trigger mode changes for - * an EISA IRQ? + * an EISA IRQ or an ISA IRQ with the ELCR present? 
*/ if (intpin->io_bus == APIC_BUS_EISA) pol = INTR_POLARITY_HIGH; --- //depot/vendor/freebsd/src/sys/i386/i386/machdep.c 2004/11/27 06:55:50 +++ //depot/user/jhb/acpipci/i386/i386/machdep.c 2005/01/18 02:26:39 @@ -2098,6 +2098,7 @@ printf("WARNING: loader(8) metadata is missing!\n"); #ifdef DEV_ISA + elcr_probe(); atpic_startup(); #endif --- //depot/vendor/freebsd/src/sys/i386/i386/mptable.c 2005/01/12 18:25:23 +++ //depot/user/jhb/acpipci/i386/i386/mptable.c 2005/01/18 02:26:39 @@ -580,12 +580,18 @@ KASSERT(src_bus <= mptable_maxbusid, ("bus id %d too large", src_bus)); switch (busses[src_bus].bus_type) { case ISA: - return (INTR_TRIGGER_EDGE); +#ifndef PC98 + if (elcr_found) + return (elcr_read_trigger(src_bus_irq)); + else +#endif + return (INTR_TRIGGER_EDGE); case PCI: return (INTR_TRIGGER_LEVEL); #ifndef PC98 case EISA: KASSERT(src_bus_irq < 16, ("Invalid EISA IRQ %d", src_bus_irq)); + KASSERT(elcr_found, ("Missing ELCR")); return (elcr_read_trigger(src_bus_irq)); #endif default: --- //depot/vendor/freebsd/src/sys/i386/include/intr_machdep.h 2004/12/23 20:35:42 +++ //depot/user/jhb/acpipci/i386/include/intr_machdep.h 2005/01/18 02:26:39 @@ -84,6 +84,7 @@ struct intrframe; extern struct mtx icu_lock; +extern int elcr_found; /* XXX: The elcr_* prototypes probably belong somewhere else. 
 */
 int	elcr_probe(void);
--- //depot/vendor/freebsd/src/sys/i386/isa/atpic.c	2004/08/02 15:35:28
+++ //depot/user/jhb/acpipci/i386/isa/atpic.c	2005/01/18 02:26:39
@@ -112,9 +112,6 @@
 static void atpic_init(void *dummy);

 unsigned int imen;	/* XXX */
-#ifndef PC98
-static int using_elcr;
-#endif

 inthand_t
 	IDTVEC(atpic_intr0), IDTVEC(atpic_intr1), IDTVEC(atpic_intr2),
@@ -313,7 +310,7 @@
 		if (ai->at_irq == 0) {
 			i8259_init(ap, ap == &atpics[SLAVE]);
 #ifndef PC98
-			if (ap == &atpics[SLAVE] && using_elcr)
+			if (ap == &atpics[SLAVE] && elcr_found)
 				elcr_resume();
 #endif
 		}
@@ -369,7 +366,7 @@
 			    vector);
 			return (EINVAL);
 		}
-		if (!using_elcr) {
+		if (!elcr_found) {
 			if (bootverbose)
 				printf("atpic: No ELCR to configure IRQ%u as %s\n",
 				    vector, trig == INTR_TRIGGER_EDGE ? "edge/high" :
@@ -492,8 +489,7 @@
 	 * assume level trigger for any interrupt that we aren't sure is
 	 * edge triggered.
 	 */
-	if (elcr_probe() == 0) {
-		using_elcr = 1;
+	if (elcr_found) {
 		for (i = 0, ai = atintrs; i < NUM_ISA_IRQS; i++, ai++)
 			ai->at_trigger = elcr_read_trigger(i);
 	} else {
--- //depot/vendor/freebsd/src/sys/i386/isa/elcr.c	2004/05/04 20:10:24
+++ //depot/user/jhb/acpipci/i386/isa/elcr.c	2005/01/18 02:26:39
@@ -57,9 +57,7 @@
 #define	ELCR_MASK(irq)	(1 << (irq))

 static int elcr_status;
-#ifdef INVARIANTS
-static int elcr_found;
-#endif
+int elcr_found;

 /*
  * Check to see if we have what looks like a valid ELCR.
We do this by
@@ -88,9 +86,7 @@
 	}
 	if (resource_disabled("elcr", 0))
 		return (ENXIO);
-#ifdef INVARIANTS
 	elcr_found = 1;
-#endif
 	return (0);
 }

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 18 02:48:07 2005
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 475D316A51A
	for ; Tue, 18 Jan 2005 02:48:04 +0000 (GMT)
Received: from mail14.speakeasy.net (mail22.sea5.speakeasy.net [69.17.117.24])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E749243D2F
	for ; Tue, 18 Jan 2005 02:48:03 +0000 (GMT)
	(envelope-from jhb@FreeBSD.org)
Received: (qmail 6824 invoked from network); 18 Jan 2005 02:48:03 -0000
Received: from dsl027-160-063.atl1.dsl.speakeasy.net (HELO server.baldwin.cx)
	([216.27.160.63]) (envelope-sender ) encrypted SMTP
	for ; 18 Jan 2005 02:48:03 -0000
Received: from slimer.baldwin.cx (slimer.baldwin.cx [192.168.0.16])
	(authenticated bits=0) by server.baldwin.cx (8.13.1/8.13.1) with ESMTP
	id j0I2lqaJ092340; Mon, 17 Jan 2005 21:48:00 -0500 (EST)
	(envelope-from jhb@FreeBSD.org)
From: John Baldwin
To: freebsd-smp@FreeBSD.org
Date: Mon, 17 Jan 2005 21:38:05 -0500
User-Agent: KMail/1.6.2
References: <41EB9B0C.90305@uni-mainz.de>
In-Reply-To: <41EB9B0C.90305@uni-mainz.de>
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <200501172138.05599.jhb@FreeBSD.org>
X-Spam-Status: No, score=-2.0 required=4.2 tests=ALL_TRUSTED,DEAR_SOMETHING
	autolearn=failed version=3.0.2
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on server.baldwin.cx
cc: "O.
Hartmann"
Subject: Re: FreeBSD 5.3-SMP/IRQ problems (again)
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
X-List-Received-Date: Tue, 18 Jan 2005 02:48:07 -0000

On Monday 17 January 2005 06:01 am, O. Hartmann wrote:
> Dear Sirs.
>
> I reported very strange behaviours of FreeBSD 5.3 on a suspicous
> hardware platform of mine and now I would like to repeat this and hope
> someone can offer me some help. The reason why I repeat this suspected
> bug is because I do not really beliefe in a hardware fault due to some
> very strange behaviours of FreeBSD 5.3 on this box.
>
> So, hope you can follow off my non-engeneer-English.
>
> I utilize FreeBSD now for about 8 years on several plattforms especially
> SMP plattforms as they became available for FreeBSD. These boxes were
> uitilized for scientif-server-services and as high-performance-desktop
> plattforms as FreeBSD was prior to 5.X as solid as a rock (with the
> exeception of the start off of 4.0).
> Today I work with my private plattform at my lab because our computer
> center prefers Linux and I do not. So my hardware plattform seems to be
> a little bit 'ancient', but I think a lot of ours still use these boxes
> for high-duty tasks.
> All right, here the facts.
>
> Hardware:
> Mainboard ASUS CUR-DLS, two 1GHz PIII, 2x 512MB ECC/reg, 3x SCSI U160
> harddrives, 1x Intel 1000/Pro 64Bit PCI server 1GBit server NIC, DVD-RW
> NEC ND3500AG/2.18 attached to ATA controler.
> Please keep in mind that this mobo has a built in PCI VGA controler.
> BIOS has been updated to the latest avaiable BIOS and after this hasn't
> fixed any problem I updated to the latest BETA BIOS available for this
> mobo.
> After POST I get a summary screen and realized, that VGA controler is
> not attached to an IRQ (and I suspect FreeBSD 5.3 having much troubles
> with IRQ routing, but I will report more later on).
>
> Firt thing I suspect courios:
> Booting machine in single user mode for maintainance-purposes remains
> box buggy! After a while of heavy screen output as done via compiling a
> kernel or building world or 'find'-ing all files of the file system or
> showing the contents of a file via more or similar freezes the box/the
> screen. I can then type 'return' and watch the blank lines filled in,
> but there is nothing more, the box seems to be stuck. Only rescue is a
> reboot. This happens to all variants of booting off single user mode,
> SMP enabled/disabled in the kernel, apic enabled/disabled, acpi
> enabled/disabled. The only way to get rid of this is to plugg in a
> separate PCI VGA card into another slot!! Then machine boots correctly
> into single user mode - but dies immediately when booting off multi-user
> mode anyway. With UP and multiuser mode, this box is with X11 GUI (Xorg
> or XFree86) very stable (using built in VGA).
>
> Next harsh problem is plugging in sound- or VGA-cards. It seems to be
> highly dependend on which slot such cards get plugged in. Different
> sound cards, different PCI VGA cards - same problem. FreeBSD 5.3 starts
> off and then freezes after init of the SCSI controller. The same thing
> happens when disabling both serial ports! FreeBSD dies after init of the
> SCSI controler.
>
> The mobo has two 64 Bit PCI slots and I use one of them for the Intel
> 1000/Pro GBit NIC. At the now choosen slot usage of a sound card is
> impossible (most near 64 bit slot to the CPUs). FreeBSD dies on every
> combination (w/ or w/o SCPI) I tried.
>
> SMP is impossible, w/ or w/o ACPI. FreeBSD 5.3 dies after a while of
> doing graphical output with the X11 GUI.
I stressed the
> machine today with buildworld and build kernel and building openoffice
> 1.1.4 with console output and for 15 hours the box was stable as a rock.
> But switching to the X11 GUI doing some FireFox jobs (simply surfing the
> www) let the system die within minutes. Sometimes I can 'feel' when a
> crash is arising, the box is a kind of 'calm' and I can switch to the
> console sometimes catching the error message from the debugger, but
> sometimes not. This let me suspect the operating system 'waiting' for
> something related to the second CPU or similar and waiting forever. I'm
> not familiar with kernel programming and development, so my report seems
> to be a bit of 'strange', sorry for that.
>
> Another couriosity is booting the box in 'safe mode', I thing APIC is
> off, SMP is off and acpi is off. The box becomes slow, grapical output
> 'hangs' for seconds, freezes and defreezes and the system remains a kind
> of 'not useable'.
>
> The utilized mobo ASUS CUR-DLS uses the RCC LE 3.0 champion chipset. The
> IRQ problems (using of PCI cards in any PCI slot impossible) are similar
> to problems I had in the past with FreeBSD 5.0 on a TYAN Thunder 2500
> box, were I wasn't able plugging the AMI RAID controler in any of the 6
> 64Bit slots. FreeBSD got stuck at the same place: when the built in SCSI
> controler (LSI logic 1010-33 or 894-33) get initialised. This makes me
> very courios.
>
> Now I build akernel with debug options and I will try to catch a kernel
> dump. maybe someone of yours is interested in that. I will attach
> mptabel -dmesg output, hope it is of your convenience.
>
>
> Oliver

Erm, your mptable claims that both the secondary processor (AP) and the
second I/O APIC have an APIC ID of 3. That can't be right. It may not
matter since I don't think P3's even use the APIC bus anymore (I think they
route APIC messages over the system bus), but if it does use a separate
APIC bus then that would be a definite problem.
Also, a dmesg from a boot -v would be most helpful.

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 18 18:14:14 2005
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id BA50616A4CE
	for ; Tue, 18 Jan 2005 18:14:14 +0000 (GMT)
Received: from mx1.mail.ru (mx1.mail.ru [194.67.23.121])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7906643D41
	for ; Tue, 18 Jan 2005 18:14:14 +0000 (GMT)
	(envelope-from pvtrifonov@mail.ru)
Received: from [195.209.229.106] (port=3242 helo=tank)
	by mx1.mail.ru with esmtp id 1Cqxrp-00082W-00
	for freebsd-smp@FreeBSD.org; Tue, 18 Jan 2005 21:14:13 +0300
From: "Peter Trifonov"
Date: Tue, 18 Jan 2005 21:18:17 +0300
MIME-Version: 1.0
Content-Type: text/plain; charset="koi8-r"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook, Build 11.0.5510
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180
thread-index: AcT9CCN67PE0NYhtTM+Ytbuu9uCKVAAgdyug
In-Reply-To: <200501172130.02407.jhb@FreeBSD.org>
X-Spam: Not detected
Subject: RE: Lost interrupts on SMP systems
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
X-List-Received-Date: Tue, 18 Jan 2005 18:14:14 -0000

Hello John,

> > Here is what it says:
> > # ./inb 0x4d0
> > inb(0x4d0) = 0x0 = 0d = '^@'
> > # ./inb 0x4d1
> > inb(0x4d1) = 0xe = 14d = '^N'
>
> Ok, this is good, it means you do have an ELCR. Let me give you a
> quick patch to try. This will be relative to your existing mptable.c
> file since i've committed the first mptable patch to current already.

Thanks a lot!!!
Now the box works perfectly!

With best regards,
P.
Trifonov

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 18 18:39:04 2005
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id DA69B16A4CE
	for ; Tue, 18 Jan 2005 18:39:04 +0000 (GMT)
Received: from mail27.sea5.speakeasy.net (mail27.sea5.speakeasy.net [69.17.117.29])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 997C543D5D
	for ; Tue, 18 Jan 2005 18:39:04 +0000 (GMT)
	(envelope-from jhb@FreeBSD.org)
Received: (qmail 16139 invoked from network); 18 Jan 2005 18:39:04 -0000
Received: from dsl027-160-063.atl1.dsl.speakeasy.net (HELO server.baldwin.cx)
	([216.27.160.63]) (envelope-sender ) AES256-SHA encrypted SMTP
	for ; 18 Jan 2005 18:39:04 -0000
Received: from [10.50.40.202] (gw1.twc.weather.com [216.133.140.1])
	(authenticated bits=0) by server.baldwin.cx (8.13.1/8.13.1) with ESMTP
	id j0IIcxhg097972; Tue, 18 Jan 2005 13:39:00 -0500 (EST)
	(envelope-from jhb@FreeBSD.org)
From: John Baldwin
To: freebsd-smp@FreeBSD.org
Date: Tue, 18 Jan 2005 13:39:38 -0500
User-Agent: KMail/1.6.2
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain; charset="koi8-r"
Content-Transfer-Encoding: 7bit
Message-Id: <200501181339.38078.jhb@FreeBSD.org>
X-Spam-Status: No, score=-102.8 required=4.2 tests=ALL_TRUSTED,
	USER_IN_WHITELIST autolearn=failed version=3.0.2
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on server.baldwin.cx
cc: Peter Trifonov
Subject: Re: Lost interrupts on SMP systems
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
X-List-Received-Date: Tue, 18 Jan 2005 18:39:05 -0000

On Tuesday 18 January 2005 01:18 pm, Peter Trifonov wrote:
> Hello John,
>
> > > Here is what it says:
> > > # ./inb 0x4d0
> > > inb(0x4d0) = 0x0 = 0d = '^@'
> > > # ./inb 0x4d1
> > > inb(0x4d1) = 0xe = 14d = '^N'
> >
> > Ok, this is good, it means you do have an ELCR. Let me give you a
> > quick patch to try. This will be relative to your existing mptable.c
> > file since i've committed the first mptable patch to current already.
>
> Thanks a lot!!!
> Now the box works perfectly!

Excellent. Can you try this additional change and see if it still works or
if it breaks things? In past experience, ISA interrupts haven't ever been
programmed as level/hi, but always either edge/hi or level/lo, so I want to
try using lo polarity based on the ELCR as well. (I'm trying to avoid
possibly breaking other boxes in the field.)

---- //depot/user/jhb/acpipci/i386/i386/mptable.c#79
+++ /home/john/work/p4/acpipci/i386/i386/mptable.c
@@ -560,6 +560,13 @@
 	KASSERT(src_bus <= mptable_maxbusid, ("bus id %d too large", src_bus));
 	switch (busses[src_bus].bus_type) {
 	case ISA:
+#ifndef PC98
+		if (elcr_found &&
+		    elcr_read_trigger(src_bus_irq) == INTR_TRIGGER_LEVEL)
+			return (INTR_POLARITY_LOW);
+		else
+#endif
+			return (INTR_POLARITY_HIGH);
 	case EISA:
 		return (INTR_POLARITY_HIGH);
 	case PCI:

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 18 19:05:58 2005
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id EB22216A4CE;
	Tue, 18 Jan 2005 19:05:58 +0000 (GMT)
Received: from dcn.infos.ru (dcn.infos.ru [195.209.228.109])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 842C643D1F;
	Tue, 18 Jan 2005 19:05:58 +0000 (GMT)
	(envelope-from petert@dcn.infos.ru)
Received: from dcn (localhost [127.0.0.1])
	by dcn (Postfix) with SMTP id B81EE2E9E3;
	Tue, 18 Jan 2005 22:05:56 +0300 (MSK)
Received: by smtp.xj.dcn (Postfix, from userid 65534) id 87AF82E9E9;
	Tue, 18 Jan 2005 22:05:56 +0300 (MSK)
Received: from tank-ls.xj.dcn (unknown [10.0.103.154])
	by smtp.xj.dcn (Postfix) with ESMTP id 824462E9DE; Tue, 18 Jan 2005
	22:05:50 +0300 (MSK)
From: Peter Trifonov
To: freebsd-smp@freebsd.org
Date: Tue, 18 Jan 2005 22:09:59 +0300
User-Agent: KMail/1.6.2
References: <200501181339.38078.jhb@FreeBSD.org>
In-Reply-To: <200501181339.38078.jhb@FreeBSD.org>
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain; charset="koi8-r"
Content-Transfer-Encoding: 7bit
Message-Id: <200501182209.59517.petert@dcn.infos.ru>
X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on dcn.xj.dcn
X-Spam-Status: No, score=-5.5 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,
	DNS_FROM_AHBL_RHSBL autolearn=ham version=3.0.1
cc: Peter Trifonov
cc: John Baldwin
Subject: Re: Lost interrupts on SMP systems
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
X-List-Received-Date: Tue, 18 Jan 2005 19:05:59 -0000

Hello John,

On Tuesday 18 January 2005 21:39, John Baldwin wrote:
> > Thanks a lot!!!
> > Now the box works perfectly!
>
> Excellent. Can you try this additional change and see if it still works or
> if it breaks things? In past experience, ISA interrupts haven't ever been
> programmed as level/hi, but always either edge/hi or level/lo, so I want to
> try using lo polarity based on the ELCR as well. (I'm trying to avoid
> possibly breaking other boxes in the field.)

Now it does not work. It complains about interrupt storm on IRQ 10 (xl0),
IRQ 11 ( xl1,xl2) and IRQ 9 (ahc0). After this many errors were reported by
the SCSI controller and the box was rebooted with old kernel.

--
With best regards,
P.
Trifonov

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 18 19:53:33 2005
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9CB0316A4CE
	for ; Tue, 18 Jan 2005 19:53:33 +0000 (GMT)
Received: from mail27.sea5.speakeasy.net (mail27.sea5.speakeasy.net [69.17.117.29])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 697BE43D58
	for ; Tue, 18 Jan 2005 19:53:33 +0000 (GMT)
	(envelope-from jhb@FreeBSD.org)
Received: (qmail 32293 invoked from network); 18 Jan 2005 19:53:33 -0000
Received: from dsl027-160-063.atl1.dsl.speakeasy.net (HELO server.baldwin.cx)
	([216.27.160.63]) (envelope-sender ) AES256-SHA encrypted SMTP
	for ; 18 Jan 2005 19:53:32 -0000
Received: from [10.50.40.202] (gw1.twc.weather.com [216.133.140.1])
	(authenticated bits=0) by server.baldwin.cx (8.13.1/8.13.1) with ESMTP
	id j0IJrRCV098493; Tue, 18 Jan 2005 14:53:27 -0500 (EST)
	(envelope-from jhb@FreeBSD.org)
From: John Baldwin
To: freebsd-smp@FreeBSD.org
Date: Tue, 18 Jan 2005 14:52:08 -0500
User-Agent: KMail/1.6.2
References: <200501181339.38078.jhb@FreeBSD.org>
	<200501182209.59517.petert@dcn.infos.ru>
In-Reply-To: <200501182209.59517.petert@dcn.infos.ru>
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain; charset="koi8-r"
Content-Transfer-Encoding: 7bit
Message-Id: <200501181452.08578.jhb@FreeBSD.org>
X-Spam-Status: No, score=-102.8 required=4.2 tests=ALL_TRUSTED,
	USER_IN_WHITELIST autolearn=failed version=3.0.2
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on server.baldwin.cx
cc: Peter Trifonov
cc: Peter Trifonov
Subject: Re: Lost interrupts on SMP systems
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
X-List-Received-Date: Tue, 18 Jan 2005 19:53:33 -0000

On Tuesday 18 January 2005 02:09 pm, Peter Trifonov wrote:
> Hello John,
>
> On
Tuesday 18 January 2005 21:39, John Baldwin wrote:
> > > Thanks a lot!!!
> > > Now the box works perfectly!
> >
> > Excellent. Can you try this additional change and see if it still works
> > or if it breaks things? In past experience, ISA interrupts haven't ever
> > been programmed as level/hi, but always either edge/hi or level/lo, so I
> > want to try using lo polarity based on the ELCR as well. (I'm trying to
> > avoid possibly breaking other boxes in the field.)
>
> Now it does not work. It complains about interrupt storm on IRQ 10 (xl0),
> IRQ 11 ( xl1,xl2) and IRQ 9 (ahc0). After this many errors were reported by
> the SCSI controller and the box was rebooted with old kernel.

Ok. I won't include that change then, but I sure hope I don't break other
systems. :-P

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
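
For anyone reading the inb output quoted in this thread, here is a minimal
sketch in C of how the two bytes decode. It assumes the usual ELCR layout:
port 0x4d0 holds IRQs 0-7, port 0x4d1 holds IRQs 8-15, and a set bit means
level triggered. The helper names elcr_combine and elcr_irq_is_level are
made up for illustration and are not the kernel's API; the demo main is
only built when ELCR_DECODE_DEMO is defined.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Assumed layout: the byte read from port 0x4d0 covers IRQs 0-7 and the
 * byte read from port 0x4d1 covers IRQs 8-15.  A set bit is taken to mean
 * "level triggered", a clear bit "edge triggered".
 */

/* Hypothetical helper: merge the two bytes read with inb into one mask. */
static uint16_t
elcr_combine(uint8_t low, uint8_t high)
{
	return ((uint16_t)(low | (high << 8)));
}

/* Hypothetical helper: 1 if the ELCR bit for this IRQ is set, else 0. */
static int
elcr_irq_is_level(uint16_t elcr, unsigned int irq)
{
	assert(irq < 16);	/* only the 16 ISA IRQs have ELCR bits */
	return ((elcr >> irq) & 1);
}

#ifdef ELCR_DECODE_DEMO
int
main(void)
{
	/* Values reported in the thread: inb(0x4d0) = 0x0, inb(0x4d1) = 0xe. */
	uint16_t elcr = elcr_combine(0x00, 0x0e);
	unsigned int irq;

	for (irq = 0; irq < 16; irq++)
		printf("IRQ%-2u %s\n", irq,
		    elcr_irq_is_level(elcr, irq) ? "level" : "edge");
	return (0);
}
#endif
```

Under those assumptions, the reported values 0x0 and 0xe mark IRQs 9, 10
and 11 as level triggered, which are the same three IRQs named in the
interrupt storm report above.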