Date: Wed, 17 Jan 2001 10:52:26 -0800
From: Alfred Perlstein <bright@wintelcom.net>
To: Soren Schmidt <sos@freebsd.dk>
Cc: Randell Jesup <rjesup@wgate.com>, arch@FreeBSD.ORG, current@FreeBSD.ORG
Subject: Re: HEADS-UP: await/asleep removal imminent
Message-ID: <20010117105226.V7240@fw.wintelcom.net>
In-Reply-To: <200101171842.TAA12276@freebsd.dk>; from sos@freebsd.dk on Wed, Jan 17, 2001 at 07:42:26PM +0100
References: <20010117101342.R7240@fw.wintelcom.net> <200101171842.TAA12276@freebsd.dk>
* Soren Schmidt <sos@freebsd.dk> [010117 10:43] wrote:
> It seems Alfred Perlstein wrote:
> > >
> > > I suggest creative manpower is used to stabilize -current, instead
> > > of fine trimming which API's should stay or not...
> >
> > I started a loop of make -j128 buildworld and buildkernel last
> > night, I still haven't seen anything odd happen on my hardware.
> >
> > You and Poul-Henning have to figure out what's going on, no one
> > else is able to reproduce this instability you're talking about.
>
> Oohh, you don't read the mailing lists then; there have been plenty
> of reports of hanging -current boxen since SMPng...
Yes, but none with anything useful. :(
> > There has to be a way for you guys to get us some reasonable
> > tracebacks or diagnostics instead of just saying "it's broke".
>
> It's close to impossible, the two symptoms I see here are either
> spontaneous reboots, or solid hangs where only a reset can get
> you out, so I can't say much other than "it's broke".
You probably have a much better understanding of low level programming
than I do, you _should_ be able to figure out what's going on.
> > Perhaps you can explain how you're able to trigger this instability
> > with a test script? Poul-Henning told me he just needed to do a
> > make -j256 world, I did 10 of them without a problem...
>
> Hmm, with a -current kernel from today 1200 CET I just need to
> do a make depend on a GENERIC kernel, and wham it locks up.
Odd, doesn't hang for me.
> > I'd also like to see what hardware you guys are running on and what
> > kernel config. I'm pretty sure that running with a weird value
> > for HZ causes lockups on -stable, dunno about current.
>
> Nothing special, GENERIC kernel with SMP defined will do nicely, running
> without SMP improves matters but on the fastest machine I'm still getting
> lockups, but they are rare...
>
> Hardware it hangs on here include:
>
> 2*PPro@200 192MB FX chipset ATA disks on onboard controller (PIIX3)
>
> 2*PII@350 512MB BX chipset SCSI disks on NCR controller
>
> 2*PIII@1G 512MB ServerWorks chipset ATA disks on onboard + HPT controller.
>
> It seems the faster the machine the faster the lockup/hang..
>
> Need I mention that they all work just fine(tm) under -stable and
> -current back on PRE_SMPNG...
>
> So, we (phk & I) are trying to figure out what is going on, but
> there is little to go on but hunch...
> So there is nothing special to it guys, you just have to try..
> Oh btw using a ccd/vinum/ATA-raid thingy makes the problem worse,
> probably due to the higher interrupt rates.
I will try stacking a vinum over vn striped setup later tonight
to see if this still locks up.
You're still not telling me what combination of vn/vinum does this,
so I guess I'll have to stumble around in the dark for a bit until
I find the magic combination to find the Danish panic/lockup?
I think phk just told me that you need a UP kernel to find this,
but he's being pretty vague about it so I don't know.
> > Basically if you're expecting me or the SMP team to figure out
> > what's going on without more info, you're pretty much out of luck.
>
> See above, not really possible, we have been trying to find some
> (affordable) HW that could be used to preserve a log over a boot,
> but so far I haven't been able to find anything that works, and
> is fast enough to not affect the system too much...
>
> > ...wondering if the box Paul Saab gave me is actually SMP... :)
>
> Yup, that would explain things :)
Well, I do see processes migrating from CPU to CPU and there's the
dmesg:
FreeBSD/SMP: Multiprocessor motherboard
cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee00000
cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee00000
io0 (APIC): apic id: 4, version: 0x000f0011, at 0xfec00000
io1 (APIC): apic id: 5, version: 0x000f0011, at 0xfec01000
SMP: AP CPU #1 Launched!
SMP: CPU1 apic_initialize():
lint0: 0x00010700 lint1: 0x00010400 TPR: 0x00000010 SVR: 0x000001ff
start_init: trying /sbin/init
Dual 750 MHz, 1GB RAM, atapci0: <ServerWorks ROSB4 ATA33 controller>
dual disks: ad0: <IBM-DTLA-307030/TX4OA50C> ATA-5 disk at ata0-master
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
