From owner-freebsd-alpha  Thu Oct 12  1:39:31 2000
Delivered-To: freebsd-alpha@freebsd.org
Received: from finch-post-10.mail.demon.net (finch-post-10.mail.demon.net [194.217.242.38])
	by hub.freebsd.org (Postfix) with ESMTP id BF6AD37B503
	for <freebsd-alpha@freebsd.org>; Thu, 12 Oct 2000 01:39:28 -0700 (PDT)
Received: from nlsys.demon.co.uk ([158.152.125.33] helo=herring.nlsystems.com)
	by finch-post-10.mail.demon.net with esmtp (Exim 2.12 #1)
	id 13jdte-0002wN-0A; Thu, 12 Oct 2000 08:39:27 +0000
Received: from salmon.nlsystems.com (salmon.nlsystems.com [10.0.0.3])
	by herring.nlsystems.com (8.9.3/8.8.8) with ESMTP id JAA48807;
	Thu, 12 Oct 2000 09:47:01 +0100 (BST)
	(envelope-from dfr@nlsystems.com)
Date: Thu, 12 Oct 2000 09:38:38 +0100 (BST)
From: Doug Rabson <dfr@nlsystems.com>
To: Andrew Gallatin <gallatin@cs.duke.edu>
Cc: freebsd-alpha@freebsd.org
Subject: Re: size problems with INVARIANTS/DIAGNOSTIC -current kernels
In-Reply-To: <14820.26156.761015.912596@grasshopper.cs.duke.edu>
Message-ID: <Pine.BSF.4.21.0010120934180.14648-100000@salmon.nlsystems.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-alpha@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Wed, 11 Oct 2000, Andrew Gallatin wrote:

> 
> Doug Rabson writes:
>  > On Tue, 10 Oct 2000, Andrew Gallatin wrote:
>  > 
>  > > 
>  > > Doug Rabson writes:
>  > > 
>  > >  > I'm sorry, I think I meant *vtopte(kmemusage). I need to look at the pte
>  > >  > itself to see if it's sane.
>  > > 
>  > > I'm being extra dense too.   
>  > > 
>  > > Copyright (c) 1992-2000 The FreeBSD Project.
>  > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>  > >         The Regents of the University of California. All rights reserved.
>  > > kmem_init: kmemusage = 0xfffffe0000296000
>  > > vtopte (kmemusage) = 0xffffffff80000a58
>  > > *(vtopte (kmemusage)) = 0x54b0003111f
>  > > 
>  > > halted CPU 0
>  > > 
>  > > 
>  > > But -- why should the pte be sane yet?  This is before the fault,
>  > > which I thought should be the one to make it sane.
>  > 
>  > Kernel pages are generally mapped before they are used so that we don't
>  > waste time faulting and patching the pages into the map via vm_fault().
>  > 
>  > This page is managed, wired, read/writable but is set to fault on
>  > read/write/execute. This fault is simply to perform software accounting
>  > for accessed and dirty flags and is probably a red herring. We need to
>  > somehow find the fault which actually kills the machine (assuming that
>  > this isn't the one - we need to check that).
> 
> I'm fairly certain that this *is* the fault that kills the machine.
> The key information is that this is the first fault we take at bootup.
> The problem doesn't have anything to do with the actual fault; the
> problem is that the XentMM() routine ends up jumping into the data
> segment when it attempts to call trap().
> 
> You should be able to duplicate this fairly easily on any kernel whose
> size is > 4MB.  Without my little hack, this fault will naturally
> occur inside of malloc the first time it's called (by the hints code
> these days).
> 
> I don't think this is a new problem, as the KAME people are seeing it
> in 4.1 kernels.  I think we just finally grew huge enough with SMPng
> and associated SMPng debugging goop to get over 4MB kernels.

Hey, I just thought of something. Perhaps the globals segment has grown
too large. The Alpha can only support 64k of globals, with $gp pointing at
base+32k so that the code can use 16-bit signed offsets from $gp to access
it.

The code at XentMM indirects through $gp to find the address of trap() so
either $gp is bad or the globals table itself is bad.
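
To make the 64k limit concrete, here is a rough sketch in C. The names
GP_BIAS and gp_reachable are made up for illustration and are not anything
in the tree; the sketch only assumes what is stated above, namely that $gp
sits at base+32k and that displacements are signed 16-bit values.

```c
#include <stdint.h>

/* Assumption: $gp = (start of the globals area) + 32k, as described above. */
#define GP_BIAS 0x8000L

/*
 * Hypothetical helper: returns 1 if an entry at byte offset `off` from the
 * start of the globals area can be addressed with a signed 16-bit
 * displacement from $gp, and 0 otherwise.
 */
static int
gp_reachable(long off)
{
	long disp = off - GP_BIAS;	/* displacement relative to $gp */

	return (disp >= INT16_MIN && disp <= INT16_MAX);
}
```

With this, gp_reachable(0) and gp_reachable(0xffff) both hold, while
gp_reachable(0x10000) does not: anything placed past the first 64k of the
globals area is out of range of a single $gp-relative access.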

-- 
Doug Rabson				Mail:  dfr@nlsystems.com
					Phone: +44 20 8348 6160

