From: Julian Elischer
To: Matthew Dillon
Cc: Suresh Rajagopalan, freebsd-smp@FreeBSD.ORG
Date: Mon, 30 Aug 1999 11:05:20 -0700 (PDT)
Subject: Re: SMP freezes on 3.2-STABLE
In-Reply-To: <199908301758.KAA16067@apollo.backplane.com>

On Mon, 30 Aug 1999, Matthew Dillon wrote:

> :a)
> :
> :This time I was able to break into DDB.  Here's the output (BTW, I had to
> :transcribe this, is there any way to dump the output to a file?)
>
>     No, you have to transcribe it or set the console up on a serial port
>     (which can be annoying) and log it.

The best way is to set flags 0x20 on your sio0, and then hook it to sio1
on another machine (and do the same from the other machine). Then use
script to log the activity on it when you 'tip' into it.

julian

>
> :b)
> :trace shows the following:
> :
> :panic (c0241d25,0,cc848cb0,1,c01769d4) at panic 0xa4
> :bsl1 (cc848c40, 2, cbbe6860, cbe3043f, cc848d00) at bs1
> :nfs_lookup (cbf3de30, cbafce00, cbf3dedc, cbf3deb80) at nfs_lookup 0x22f
> :lookup(cbf3deb8,cbbe6860,cbbe6860,cbf3df94,cbf3de74) at lookup 0x2c1
> :namei(cbf3deb8,cbbe6860,c0292740,0,8162ce4) at namei 0x133
> :stat(cbbe6860,cbf3df94,13,5,bfbfdd50) at stat 0x44
> :syscall(27,bfbf0027,bfbfdd50,5,bfbfdb28) at syscall 0x107
> :Xint0x80_syscall() at Xint0x80_syscall + 0x4c
> :
> :---
> :c)
> :ps shows a few http and perl processes, some in nfsrecv state.  The
> :output was too long to copy by hand.  Hope this helps.
> :
> :--
> :d)
> :
> :On one of the machines, I now also see this message in the syslog:
> :
> :xl0: command never completed
> :
> :(at random intervals)
> :...
> :Thanks for your help.
> :
> :Suresh
>
>     Hmm.  This is very odd.  It seems unlikely to be a bug in NFS if
>     turning off SMP fixes the problem.
>
>     If the machine didn't crash and ctl-alt-esc breaking it shows an
>     active stack frame from a running process, it could be that the
>     process is getting stuck in an endless loop somewhere or somehow.
>
>     I'm at a loss at the moment.  It sounds like an SMP problem of some
>     sort.  I'm hoping one of the SMP guys will have a brainfart :-)
>
>     If it cannot be resolved soon I would back both machines down to
>     a single-cpu and wait... having production machines crash is not fun.
>
>     Current ought to work better for SMP than stable, but I hesitate to
>     suggest it for a production machine.
>
>                                       -Matt
>                                       Matthew Dillon
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-smp" in the body of the message
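
For the serial-console logging suggested above (script plus tip on the second machine), here is a minimal sketch of the same idea in Python, in case a timestamped log is wanted. This is not what the thread used: it swaps script/tip for a small reader loop. The function name `log_console` and its `clock` parameter are made up for illustration, and it assumes the serial port's line settings (speed, parity) have already been configured by other means.

```python
from datetime import datetime


def log_console(src, dst, clock=datetime.now):
    """Copy console output line by line from src to dst, timestamping each line.

    src:   binary file-like object, e.g. an already-opened serial device.
    dst:   text file-like object that receives the log.
    clock: callable returning a datetime (hypothetical hook, useful for testing).
    Returns the number of lines written.
    """
    count = 0
    # readline() returns b"" at end of input, which stops the loop.
    for raw in iter(src.readline, b""):
        # Debugger output is plain ASCII; replace stray bytes rather than fail.
        line = raw.decode("ascii", errors="replace").rstrip("\r\n")
        dst.write(f"{clock().isoformat(timespec='seconds')} {line}\n")
        dst.flush()  # keep the log current while the other machine is wedged
        count += 1
    return count
```

It could be driven with something like `with open(dev, "rb", buffering=0) as src, open("console.log", "a") as dst: log_console(src, dst)`, where `dev` is whatever device node corresponds to sio1 on the logging machine (the exact path is left as an assumption here).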