From: John Baldwin
To: freebsd-smp@FreeBSD.org
Cc: "O. Hartmann"
Date: Mon, 17 Jan 2005 21:38:05 -0500
Subject: Re: FreeBSD 5.3-SMP/IRQ problems (again)
Message-Id: <200501172138.05599.jhb@FreeBSD.org>
In-Reply-To: <41EB9B0C.90305@uni-mainz.de>

On Monday 17 January 2005 06:01 am, O. Hartmann wrote:
> Dear Sirs.
>
> I have reported very strange behaviour of FreeBSD 5.3 on a suspicious
> hardware platform of mine before, and I would like to repeat the report
> in the hope that someone can offer me some help. I am repeating this
> suspected bug because, given how strangely FreeBSD 5.3 behaves on this
> box, I do not really believe in a hardware fault.
>
> So, I hope you can follow my non-engineer English.
>
> I have used FreeBSD for about 8 years now on several platforms,
> especially SMP platforms as they became available for FreeBSD. These
> boxes were used for scientific server services and as high-performance
> desktop platforms, as FreeBSD prior to 5.X was as solid as a rock (with
> the exception of the start of 4.0).
> Today I work with my private platform at my lab because our computer
> center prefers Linux and I do not. So my hardware platform seems to be
> a little bit 'ancient', but I think a lot of us still use these boxes
> for heavy-duty tasks.
> All right, here are the facts.
>
> Hardware:
> Mainboard ASUS CUR-DLS, two 1GHz PIII, 2x 512MB ECC/reg, 3x SCSI U160
> hard drives, 1x Intel 1000/Pro 64-bit PCI 1GBit server NIC, DVD-RW
> NEC ND3500AG/2.18 attached to the ATA controller.
> Please keep in mind that this mobo has a built-in PCI VGA controller.
> The BIOS has been updated to the latest available BIOS, and after this
> did not fix any problem I updated to the latest BETA BIOS available
> for this mobo.
> After POST I get a summary screen, where I noticed that the VGA
> controller is not attached to an IRQ (and I suspect FreeBSD 5.3 has a
> lot of trouble with IRQ routing, but I will report more later on).
>
> The first thing I find curious:
> Booting the machine in single-user mode for maintenance purposes leaves
> the box buggy! After a while of heavy screen output, as produced by
> compiling a kernel, building world, 'find'-ing all files of the file
> system, or showing the contents of a file via more or similar, the
> box/the screen freezes.
> I can then type 'return' and watch the blank lines fill in, but
> there is nothing more; the box seems to be stuck. The only rescue is a
> reboot. This happens with all variants of booting into single-user mode:
> SMP enabled/disabled in the kernel, apic enabled/disabled, acpi
> enabled/disabled. The only way to get rid of this is to plug a
> separate PCI VGA card into another slot!! Then the machine boots
> correctly into single-user mode, but dies immediately when booting into
> multi-user mode anyway. With UP and multi-user mode, this box is very
> stable with the X11 GUI (Xorg or XFree86), using the built-in VGA.
>
> The next harsh problem is plugging in sound or VGA cards. It seems to be
> highly dependent on which slot such cards get plugged into. Different
> sound cards, different PCI VGA cards - same problem. FreeBSD 5.3 starts
> up and then freezes after init of the SCSI controller. The same thing
> happens when disabling both serial ports! FreeBSD dies after init of the
> SCSI controller.
>
> The mobo has two 64-bit PCI slots and I use one of them for the Intel
> 1000/Pro GBit NIC. With the currently chosen slot (the 64-bit slot
> nearest to the CPUs), use of a sound card is impossible. FreeBSD dies on
> every combination (w/ or w/o ACPI) I tried.
>
> SMP is impossible, w/ or w/o ACPI. FreeBSD 5.3 dies after a while of
> doing graphical output with the X11 GUI. I stressed the machine today
> with buildworld, build kernel, and building openoffice 1.1.4 with
> console output, and for 15 hours the box was stable as a rock.
> But switching to the X11 GUI and doing some FireFox jobs (simply surfing
> the www) lets the system die within minutes. Sometimes I can 'feel'
> when a crash is coming: the box goes kind of 'calm', and I can switch to
> the console, sometimes catching the error message from the debugger, but
> sometimes not. This lets me suspect the operating system is 'waiting'
> for something related to the second CPU or similar, and waiting forever.
> I'm not familiar with kernel programming and development, so my report
> may seem a bit 'strange', sorry for that.
>
> Another curiosity is booting the box in 'safe mode'; I think APIC is
> off, SMP is off and acpi is off. The box becomes slow, graphical output
> 'hangs' for seconds, freezes and unfreezes, and the system remains kind
> of 'not usable'.
>
> The mobo, an ASUS CUR-DLS, uses the RCC LE 3.0 champion chipset. The
> IRQ problems (using PCI cards in any PCI slot is impossible) are similar
> to problems I had in the past with FreeBSD 5.0 on a TYAN Thunder 2500
> box, where I wasn't able to plug the AMI RAID controller into any of the
> 6 64-bit slots. FreeBSD got stuck at the same place: when the built-in
> SCSI controller (LSI logic 1010-33 or 894-33) gets initialised. This
> makes me very curious.
>
> Now I am building a kernel with debug options and I will try to catch a
> kernel dump. Maybe one of you is interested in that. I will attach
> mptable and dmesg output; I hope it is useful to you.
>
>
> Oliver

Erm, your mptable claims that both the secondary processor (AP) and the
second I/O APIC have an APIC ID of 3. That can't be right. It may not
matter, since I don't think P3's even use the APIC bus anymore (I think
they route APIC messages over the system bus), but if it does use a
separate APIC bus then that would be a definite problem. Also, a dmesg
from a boot -v would be most helpful.

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
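
For readers who want to run the same sanity check on their own MP table,
here is a minimal sketch in Python. It is not part of mptable or of FreeBSD;
the function name and the sample table are hypothetical. Only the two
entries sharing APIC ID 3 come from the reply above; the other IDs are
made-up placeholders.

```python
from collections import defaultdict


def find_apic_id_conflicts(entries):
    """Return {apic_id: [names]} for every APIC ID claimed by more than one entry.

    `entries` is an iterable of (name, apic_id) pairs covering both
    processors and I/O APICs, since they share one ID space on an APIC bus.
    """
    owners = defaultdict(list)
    for name, apic_id in entries:
        owners[apic_id].append(name)
    return {apic_id: names for apic_id, names in owners.items() if len(names) > 1}


# Hypothetical table: only the two ID-3 entries are taken from the message;
# the BSP and first I/O APIC IDs are placeholders for illustration.
entries = [
    ("BSP", 0),
    ("AP", 3),
    ("I/O APIC #1", 2),
    ("I/O APIC #2", 3),
]

for apic_id, names in sorted(find_apic_id_conflicts(entries).items()):
    print(f"APIC ID {apic_id} is claimed by: {', '.join(names)}")
```

The IDs would have to be copied by hand from the processor and I/O APIC
sections of the mptable output; the sketch deliberately does not try to
parse that output.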