Date:      Thu, 12 Oct 2000 09:38:38 +0100 (BST)
From:      Doug Rabson <dfr@nlsystems.com>
To:        Andrew Gallatin <gallatin@cs.duke.edu>
Cc:        freebsd-alpha@freebsd.org
Subject:   Re: size problems with INVARIANTS/DIAGNOSTIC -current kernels
Message-ID:  <Pine.BSF.4.21.0010120934180.14648-100000@salmon.nlsystems.com>
In-Reply-To: <14820.26156.761015.912596@grasshopper.cs.duke.edu>

On Wed, 11 Oct 2000, Andrew Gallatin wrote:

> 
> Doug Rabson writes:
>  > On Tue, 10 Oct 2000, Andrew Gallatin wrote:
>  > 
>  > > 
>  > > Doug Rabson writes:
>  > > 
>  > >  > I'm sorry, I think I meant *vtopte(kmemusage). I need to look at the pte
>  > >  > itself to see if it's sane.
>  > > 
>  > > I'm being extra dense too.   
>  > > 
>  > > Copyright (c) 1992-2000 The FreeBSD Project.
>  > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>  > >         The Regents of the University of California. All rights reserved.
>  > > kmem_init: kmemusage = 0xfffffe0000296000
>  > > vtopte (kmemusage) = 0xffffffff80000a58
>  > > *(vtopte (kmemusage)) = 0x54b0003111f
>  > > 
>  > > halted CPU 0
>  > > 
>  > > 
>  > > But -- why should the pte be sane yet?  This is before the fault,
>  > > which I thought should be the one to make it sane..
>  > 
>  > Kernel pages are generally mapped before they are used so that we don't
>  > waste time faulting and patching the pages into the map via vm_fault().
>  > 
>  > This page is managed, wired, read/writable but is set to fault on
>  > read/write/execute. This fault is simply to perform software accounting
>  > for accessed and dirty flags and is probably a red herring. We need to
>  > somehow find the fault which actually kills the machine (assuming that
>  > this isn't the one - we need to check that).
> 
> I'm fairly certain that this *is* the fault that kills the machine.
> The key information is that this is the first fault we take at bootup.
> The problem doesn't have anything to do with the actual fault; the
> problem is that the XentMM() routine ends up jumping into the data
> segment when it attempts to call trap().
> 
> You should be able to duplicate this fairly easily on any kernel whose 
> size is > 4MB.  Without my little hack, this fault will naturally
> occur inside of malloc the first time it's called (by the hints code
> these days).
> 
> I don't think this is a new problem,  as the KAME people are seeing it 
> in 4.1 kernels.   I think we just finally grew huge enough with SMPng
> and associated SMPng debugging goop to get over 4MB kernels.
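
For reference, the pte value in the quoted boot output (0x54b0003111f) can
be decoded by hand, and it matches the "managed, wired, read/writable,
fault on read/write/execute" reading. What follows is only a minimal
sketch. It assumes the usual Alpha pte hardware bit positions, and it
assumes that the two software bits 0x10000 and 0x20000 mean wired and
managed. The X_PTE_* macros, struct x_pte_fields and x_pte_decode() are
names made up for this example, not the kernel's own.

```c
#include <stdint.h>

/* Hardware bits, assumed to follow the Alpha architecture pte layout. */
#define X_PTE_VALID   0x0001ULL   /* V:   mapping is valid */
#define X_PTE_FOR     0x0002ULL   /* FOR: fault on read */
#define X_PTE_FOW     0x0004ULL   /* FOW: fault on write */
#define X_PTE_FOE     0x0008ULL   /* FOE: fault on execute */
#define X_PTE_ASM     0x0010ULL   /* ASM: address space match */
#define X_PTE_KRE     0x0100ULL   /* KRE: kernel read enable */
#define X_PTE_KWE     0x1000ULL   /* KWE: kernel write enable */

/* Software bits: this assignment is an assumption for the example. */
#define X_PTE_WIRED   0x10000ULL
#define X_PTE_MANAGED 0x20000ULL

/* Hypothetical container for the decoded fields. */
struct x_pte_fields {
    uint64_t pfn;                          /* page frame number, bits 32..63 */
    int valid, fault_r, fault_w, fault_x;  /* V, FOR, FOW, FOE */
    int kread, kwrite;                     /* KRE, KWE */
    int wired, managed;                    /* software bits */
};

/* Split a raw 64-bit pte into the fields above. */
struct x_pte_fields x_pte_decode(uint64_t pte)
{
    struct x_pte_fields f;

    f.pfn     = pte >> 32;
    f.valid   = (pte & X_PTE_VALID) != 0;
    f.fault_r = (pte & X_PTE_FOR) != 0;
    f.fault_w = (pte & X_PTE_FOW) != 0;
    f.fault_x = (pte & X_PTE_FOE) != 0;
    f.kread   = (pte & X_PTE_KRE) != 0;
    f.kwrite  = (pte & X_PTE_KWE) != 0;
    f.wired   = (pte & X_PTE_WIRED) != 0;
    f.managed = (pte & X_PTE_MANAGED) != 0;
    return f;
}
```

Feeding it 0x54b0003111f gives pfn 0x54b with the valid, kernel read,
kernel write, wired and managed bits set, along with all three
fault-on bits.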

Hey, I just thought of something. Perhaps the globals segment has grown
too large. The Alpha can only support 64k of globals, with $gp pointing at
base+32k so that the code can use 16-bit signed offsets from $gp to reach
any part of it.

The code at XentMM indirects through $gp to find the address of trap() so
either $gp is bad or the globals table itself is bad.
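
To make the 64k limit concrete: a 16-bit signed offset covers -32768 to
+32767, so with $gp at base+32k the reachable window is exactly
base .. base+64k-1. Here is a rough sketch of that arithmetic in C. The
names GP_BIAS, gp_for_table() and gp_can_reach() are invented for the
example and do not exist in the kernel or the toolchain.

```c
#include <stdbool.h>
#include <stdint.h>

/* $gp sits 32k past the start of the globals table (per the text above). */
#define GP_BIAS 0x8000ULL

/* Where $gp would point for a table starting at table_base. */
uint64_t gp_for_table(uint64_t table_base)
{
    return table_base + GP_BIAS;
}

/*
 * True if addr can be reached from gp with a 16-bit signed displacement,
 * i.e. if (addr - gp) lies in [-32768, 32767].
 */
bool gp_can_reach(uint64_t gp, uint64_t addr)
{
    int64_t disp = (int64_t)(addr - gp);

    return disp >= INT16_MIN && disp <= INT16_MAX;
}
```

With a table at some base, gp_can_reach(gp_for_table(base), base + 0xffff)
holds but base + 0x10000 does not, so an entry that ends up past the 64k
mark (the one holding the address of trap(), say) cannot be addressed
correctly with a single 16-bit offset from $gp.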

-- 
Doug Rabson				Mail:  dfr@nlsystems.com
					Phone: +44 20 8348 6160







