Date:      Mon, 17 Jan 2005 21:38:05 -0500
From:      John Baldwin <jhb@FreeBSD.org>
To:        freebsd-smp@FreeBSD.org
Cc:        "O. Hartmann" <ohartman@uni-mainz.de>
Subject:   Re: FreeBSD 5.3-SMP/IRQ problems (again)
Message-ID:  <200501172138.05599.jhb@FreeBSD.org>
In-Reply-To: <41EB9B0C.90305@uni-mainz.de>
References:  <41EB9B0C.90305@uni-mainz.de>

On Monday 17 January 2005 06:01 am, O. Hartmann wrote:
> Dear Sirs.
>
> I reported very strange behaviour of FreeBSD 5.3 on a suspicious
> hardware platform of mine and now I would like to repeat this and hope
> someone can offer me some help. The reason why I repeat this suspected
> bug is because I do not really believe in a hardware fault due to some
> very strange behaviour of FreeBSD 5.3 on this box.
>
> So, I hope you can follow my non-engineer English.
>
> I have used FreeBSD for about 8 years now on several platforms,
> especially SMP platforms as they became available for FreeBSD. These
> boxes were used as scientific servers and as high-performance desktop
> platforms, as FreeBSD prior to 5.X was as solid as a rock (with the
> exception of the start of 4.0).
> Today I work with my private platform at my lab because our computer
> center prefers Linux and I do not. So my hardware platform seems to be
> a little bit 'ancient', but I think a lot of us still use these boxes
> for heavy-duty tasks.
> All right, here are the facts.
>
> Hardware:
> Mainboard ASUS CUR-DLS, two 1GHz PIII, 2x 512MB ECC/reg, 3x SCSI U160
> hard drives, 1x Intel 1000/Pro 64-bit PCI 1GBit server NIC, DVD-RW
> NEC ND3500AG/2.18 attached to the ATA controller.
> Please keep in mind that this mobo has a built-in PCI VGA controller.
> The BIOS has been updated to the latest available BIOS, and after this
> did not fix any problem I updated to the latest BETA BIOS available for
> this mobo.
> After POST I get a summary screen and noticed that the VGA controller is
> not attached to an IRQ (and I suspect FreeBSD 5.3 has a lot of trouble
> with IRQ routing, but I will report more on that later).
>
> The first thing I find curious:
> Booting the machine in single-user mode for maintenance purposes leaves
> the box buggy! After a while of heavy screen output, as produced by
> compiling a kernel, building world, 'find'-ing all files of the file
> system, or showing the contents of a file via more or similar, the
> box/the screen freezes. I can then type 'return' and watch the blank
> lines filled in, but there is nothing more; the box seems to be stuck.
> The only rescue is a reboot. This happens with all variants of booting
> into single-user mode: SMP enabled/disabled in the kernel, apic
> enabled/disabled, acpi enabled/disabled. The only way to get rid of this
> is to plug a separate PCI VGA card into another slot!! Then the machine
> boots correctly into single-user mode - but dies immediately when booting
> into multi-user mode anyway. With UP and multi-user mode, this box is
> very stable with the X11 GUI (Xorg or XFree86), using the built-in VGA.
>
> The next harsh problem is plugging in sound or VGA cards. It seems to be
> highly dependent on which slot such cards get plugged into. Different
> sound cards, different PCI VGA cards - same problem. FreeBSD 5.3 starts
> up and then freezes after init of the SCSI controller. The same thing
> happens when disabling both serial ports! FreeBSD dies after init of the
> SCSI controller.
>
> The mobo has two 64-bit PCI slots and I use one of them for the Intel
> 1000/Pro GBit NIC. With the slot now chosen (the 64-bit slot nearest to
> the CPUs), using a sound card is impossible. FreeBSD dies on every
> combination (w/ or w/o ACPI) I tried.
>
> SMP is impossible, w/ or w/o ACPI. FreeBSD 5.3 dies after a while of
> doing graphical output with the X11 GUI. I stressed the
> machine today with buildworld and build kernel and building openoffice
> 1.1.4 with console output, and for 15 hours the box was stable as a rock.
> But switching to the X11 GUI and doing some FireFox jobs (simply surfing
> the www) lets the system die within minutes. Sometimes I can 'feel' when
> a crash is coming: the box goes kind of 'calm' and I can switch to the
> console, sometimes catching the error message from the debugger, but
> sometimes not. This makes me suspect the operating system is 'waiting'
> for something related to the second CPU or similar, and waiting forever.
> I'm not familiar with kernel programming and development, so my report
> may seem a bit 'strange', sorry for that.
>
> Another curiosity is booting the box in 'safe mode'; I think APIC is
> off, SMP is off and acpi is off. The box becomes slow, graphical output
> 'hangs' for seconds, freezes and unfreezes, and the system remains kind
> of 'not usable'.
>
> The mobo in use, the ASUS CUR-DLS, uses the RCC LE 3.0 champion chipset.
> The IRQ problems (using PCI cards in any PCI slot is impossible) are
> similar to problems I had in the past with FreeBSD 5.0 on a TYAN Thunder
> 2500 box, where I wasn't able to plug the AMI RAID controller into any of
> the 6 64-bit slots. FreeBSD got stuck at the same place: when the
> built-in SCSI controller (LSI logic 1010-33 or 894-33) gets initialised.
> This makes me very curious.
>
> Now I am building a kernel with debug options and I will try to catch a
> kernel dump. Maybe one of you is interested in that. I will attach
> mptable -dmesg output; I hope it is useful to you.
>
>
> Oliver

Erm, your mptable claims that both the secondary processor (AP) and the second 
I/O APIC have an APIC ID of 3.  That can't be right.  It may not matter since 
I don't think P3's even use the APIC bus anymore (I think they route APIC 
messages over the system bus), but if it does use a separate APIC bus then 
that would be a definite problem.  Also, a dmesg from a boot -v would be most 
helpful.
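
To illustrate the kind of check I mean, here is a minimal sketch in Python
that flags any APIC ID claimed by more than one device. The entry list is
hypothetical: only the two entries with ID 3 reflect what the mptable
reports, the other names and IDs are placeholders I made up, and the
function name find_duplicate_apic_ids is mine, not part of any tool.

```python
from collections import defaultdict


def find_duplicate_apic_ids(entries):
    """Return {apic_id: [device names]} for every ID claimed more than once.

    `entries` is a list of (device name, APIC ID) pairs, typed in by hand
    from the processor and I/O APIC sections of an MP table dump.
    """
    owners = defaultdict(list)
    for name, apic_id in entries:
        owners[apic_id].append(name)
    return {apic_id: names for apic_id, names in owners.items() if len(names) > 1}


# Hypothetical entries: only the two ID 3 rows come from the report above;
# the IDs of the boot processor and the first I/O APIC are placeholders.
entries = [
    ("CPU 0 (BSP)", 0),
    ("CPU 1 (AP)", 3),
    ("I/O APIC 0", 2),
    ("I/O APIC 1", 3),
]

for apic_id, names in sorted(find_duplicate_apic_ids(entries).items()):
    print(f"APIC ID {apic_id} is claimed by: {', '.join(names)}")
```

An empty result would mean every local APIC and I/O APIC in the table has a
unique ID, which is what one would normally expect to see.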

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org


