From owner-freebsd-current Wed Jan 17 10:53:42 2001
Delivered-To: freebsd-current@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
    by hub.freebsd.org (Postfix) with ESMTP
    id 6915537B6B1; Wed, 17 Jan 2001 10:53:14 -0800 (PST)
Received: (from bright@localhost)
    by fw.wintelcom.net (8.10.0/8.10.0) id f0HIqQX26581;
    Wed, 17 Jan 2001 10:52:26 -0800 (PST)
Date: Wed, 17 Jan 2001 10:52:26 -0800
From: Alfred Perlstein
To: Soren Schmidt
Cc: Randell Jesup, arch@FreeBSD.ORG, current@FreeBSD.ORG
Subject: Re: HEADS-UP: await/asleep removal imminent
Message-ID: <20010117105226.V7240@fw.wintelcom.net>
References: <20010117101342.R7240@fw.wintelcom.net> <200101171842.TAA12276@freebsd.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <200101171842.TAA12276@freebsd.dk>; from sos@freebsd.dk on Wed, Jan 17, 2001 at 07:42:26PM +0100
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Soren Schmidt [010117 10:43] wrote:
> It seems Alfred Perlstein wrote:
> > >
> > > I suggest creative manpower is used to stabilize -current, instead
> > > of fine trimming which API's should stay or not...
> >
> > I started a loop of make -j128 buildworld and buildkernel last
> > night, I still haven't seen anything odd happen on my hardware.
> >
> > You and Poul-Henning have to figure out what's going on, no one
> > else is able to reproduce this instability you're talking about.
>
> Oohh you dont read the mailing lists then, there has been plenty
> of reports of hanging -current boxen since SMPng...

Yes, but none with anything useful. :(

> > There has to be a way for you guys to get us some reasonable
> > tracebacks or diagnostics instead of just saying "it's broke".
>
> Its close to impossible, the two symptoms I see here are either
> spontanous reboots, or solid hangs where only a reset can get
> you out, so I cant say much other than "it's broke".

You probably have a much better understanding of low level programming
than I do, you _should_ be able to figure out what's going on.

> > Perhaps you can explain how you're able to trigger this instability
> > with a test script? Poul-Henning told me he just needed to do a
> > make -j256 world, I did 10 of them without a problem...
>
> Hmm, with a -current kernel from today 1200 CET i just need to
> do a make depend on a GENERIC kernel, and wham it locks up.

Odd, doesn't hang for me.

> > I'd also like to see what hardware you guys are running on and what
> > kernel config. I'm pretty sure that running with a weird value
> > for HZ causes lockups on -stable, dunno about current.
>
> Nothing special, GENERIC kernel with SMP defined will do nicely, running
> without SMP improves matters but on the fastet machine I'm still getting
> lockups, but they are rare...
>
> Hardware it hangs on here include:
>
> 2*PPro@200 192MB FX chipset ATA disks on onboard controller (PIIX3)
>
> 2*PII@350 512MB BX chipset SCSI disks on NCR controller
>
> 2*PIII@1G 512MB ServerWorks chipset ATA disks on onboard + HPT controller.
>
> It seems the faster the machine the faster the lockup/hang..
>
> Need I mention that they all work just fine(tm) under -stable and
> -current back on PRE_SMPNG...
>
> So, we (phk & I) are trying to figure out what is going on, but
> there is little to go on but hunch...
> So there is nothing special to it guys, you just have to try..
> Oh btw using a ccd/vinum/ATA-raid thingy makes the problem worse,
> probably due to the higher interrupt rates.

I will try stacking a vinum over vn striped setup later tonight to
see if this still locks up. You're still not telling me what
combination of vn/vinum does this, so I guess I'll have to stumble
around in the dark for a bit until I find the magic combination to
find the Danish panic/lockup?

I think phk just told me that you need a UP kernel to find this,
but he's being pretty vague about it so I don't know.

> > Basically if you're expecting me or the SMP team to figure out
> > what's going on without more info, you're pretty much out of luck.
>
> See above, not really possible, we have been trying to find some
> (affordable) HW that could be used to preserve a log over a boot,
> but so far I havn't been able to find anything that works, and
> is fast enough to not effect the system too much...
>
> > ...wondering if the box Paul Saab gave me is actually SMP... :)
>
> Yup, that would explain things :)

Well, I do see processes migrating from CPU to CPU and there's the
dmesg:

FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee00000
 cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id: 4, version: 0x000f0011, at 0xfec00000
 io1 (APIC): apic id: 5, version: 0x000f0011, at 0xfec01000
SMP: AP CPU #1 Launched!
SMP: CPU1 apic_initialize():
     lint0: 0x00010700 lint1: 0x00010400 TPR: 0x00000010 SVR: 0x000001ff
start_init: trying /sbin/init

Dual 750mhz, 1GB RAM, atapci0: dual disks: ad0: ATA-5 disk at ata0-master

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message