From owner-freebsd-alpha  Fri Oct  6  1:52: 2 2000
Delivered-To: freebsd-alpha@freebsd.org
Received: from finch-post-11.mail.demon.net (finch-post-11.mail.demon.net [194.217.242.39])
	by hub.freebsd.org (Postfix) with ESMTP id 7A99F37B66C
	for <freebsd-alpha@freebsd.org>; Fri,  6 Oct 2000 01:52:00 -0700 (PDT)
Received: from nlsys.demon.co.uk ([158.152.125.33] helo=herring.nlsystems.com)
	by finch-post-11.mail.demon.net with esmtp (Exim 2.12 #1)
	id 13hTET-0004T7-0B; Fri, 6 Oct 2000 08:51:58 +0000
Received: from salmon.nlsystems.com (salmon.nlsystems.com [10.0.0.3])
	by herring.nlsystems.com (8.9.3/8.8.8) with ESMTP id JAA21037;
	Fri, 6 Oct 2000 09:58:35 +0100 (BST)
	(envelope-from dfr@nlsystems.com)
Date: Fri, 6 Oct 2000 09:51:44 +0100 (BST)
From: Doug Rabson <dfr@nlsystems.com>
To: Andrew Gallatin <gallatin@cs.duke.edu>
Cc: Kenjiro Cho <kjc@csl.sony.co.jp>, freebsd-alpha@FreeBSD.ORG,
	core@kame.net
Subject: Re: size problems with INVARIANTS/DIAGNOSTIC -current kernels
In-Reply-To: <14812.37571.725840.45245@grasshopper.cs.duke.edu>
Message-ID: <Pine.BSF.4.21.0010060950060.94692-100000@salmon.nlsystems.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-alpha@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Thu, 5 Oct 2000, Andrew Gallatin wrote:

> 
> Doug Rabson writes:
>  > On Wed, 4 Oct 2000, Andrew Gallatin wrote:
>  > 
>  > >  > possibly something is wrong in the loader?
>  > >  > 
>  > > 
>  > > 
>  > > If so, the breakage has not happened recently.  I'm seeing this with a
>  > > 'loader' from late august, a netboot from late august & a netboot
>  > > that's over 1 year old.
>  > > 
>  > > Bear in mind that we seem to run just fine until the first time we
>  > > attempt to call a function from a stack created for us by the
>  > > palcode.    However, that same function is callable when not running
>  > > in an interrupt/trap/etc palcode-created context.
>  > > 
>  > > I've "proved" this to myself by making sure that trap() is actually
>  > > callable from the mainline kernel code (e.g., not running out of XentMM).
>  > > I put a call to trap() in kern_malloc() & I put a call to printtrap()
>  > > at the top of trap().  I see trap() being called from kern_malloc(), but
>  > > when it's called by XentMM, random stuff happens.
>  > 
>  > Bizarre. We are running on our own stack before mi_startup() is
>  > called, which should be before anything substantial is printed. I wonder
>  > if somehow the ksp value in the context has been corrupted.
> 
> Any ideas on how to debug this further?  I'm at my wits' end here.
> 
> In case it's of any use, here's what the registers look like after one
> of these wacky crashes:

Clearly something bad is happening: we are taking a trap and then handling
it badly. You could try inserting 'call_pal halt' instructions in XentMM
so that we can see what the state looks like on the first fault.

-- 
Doug Rabson				Mail:  dfr@nlsystems.com
					Phone: +44 20 8348 6160

