From owner-freebsd-alpha Thu Oct 12 1:39:31 2000
Delivered-To: freebsd-alpha@freebsd.org
Received: from finch-post-10.mail.demon.net (finch-post-10.mail.demon.net [194.217.242.38])
    by hub.freebsd.org (Postfix) with ESMTP id BF6AD37B503
    for ; Thu, 12 Oct 2000 01:39:28 -0700 (PDT)
Received: from nlsys.demon.co.uk ([158.152.125.33] helo=herring.nlsystems.com)
    by finch-post-10.mail.demon.net with esmtp (Exim 2.12 #1)
    id 13jdte-0002wN-0A; Thu, 12 Oct 2000 08:39:27 +0000
Received: from salmon.nlsystems.com (salmon.nlsystems.com [10.0.0.3])
    by herring.nlsystems.com (8.9.3/8.8.8) with ESMTP id JAA48807;
    Thu, 12 Oct 2000 09:47:01 +0100 (BST)
    (envelope-from dfr@nlsystems.com)
Date: Thu, 12 Oct 2000 09:38:38 +0100 (BST)
From: Doug Rabson
To: Andrew Gallatin
Cc: freebsd-alpha@freebsd.org
Subject: Re: size problems with INVARIANTS/DIAGNOSTIC -current kernels
In-Reply-To: <14820.26156.761015.912596@grasshopper.cs.duke.edu>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-alpha@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Wed, 11 Oct 2000, Andrew Gallatin wrote:

>
> Doug Rabson writes:
> > On Tue, 10 Oct 2000, Andrew Gallatin wrote:
> >
> > >
> > > Doug Rabson writes:
> > >
> > > > I'm sorry, I think I meant *vtopte(kmemusage). I need to look at the pte
> > > > itself to see if its sane.
> > >
> > > I'm being extra dense too.
> > >
> > > Copyright (c) 1992-2000 The FreeBSD Project.
> > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> > >     The Regents of the University of California. All rights reserved.
> > > kmem_init: kmemusage = 0xfffffe0000296000
> > > vtopte (kmemusage) = 0xffffffff80000a58
> > > *(vtopte (kmemusage)) = 0x54b0003111f
> > >
> > > halted CPU 0
> > >
> > >
> > > But -- why should the pte be sane yet? This is before the fault,
> > > which I thought should be the one to make it sane..
> >
> > Kernel pages are generally mapped before they are used so that we don't
> > waste time faulting and patching the pages into the map via vm_fault().
> >
> > This page is managed, wired, read/writable but is set to fault on
> > read/write/execute. This fault is simply to perform software accounting
> > for accessed and dirty flags and is probably a red herring. We need to
> > somehow find the fault which actually kills the machine (assuming that
> > this isn't the one - we need to check that).
>
> I'm fairly certain that this *is* the fault that kills the machine.
> The key information is that this is the first fault we take at bootup.
> The problem doesn't have anything to do with the actual fault, the
> problem is that the XentMM() routine ends up jumping into the data
> segment when it attempts to call trap().
>
> You should be able to duplicate this fairly easily on any kernel whose
> size is > 4MB. Without my little hack, this fault will naturally
> occur inside of malloc the first time it's called (by the hints code
> these days).
>
> I don't think this is a new problem, as the KAME people are seeing it
> in 4.1 kernels. I think we just finally grew huge enough with SMPng
> and associated SMPng debugging goop to get over 4MB kernels.

Hey, I just thought of something. Perhaps the globals segment has grown
too large. The alpha can only support 64k of globals, with $gp pointing at
base+32k so that the code can use 16-bit signed offsets from $gp to access
it. The code at XentMM indirects through $gp to find the address of trap(),
so either $gp is bad or the globals table itself is bad.

--
Doug Rabson                Mail: dfr@nlsystems.com
Phone: +44 20 8348 6160

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-alpha" in the body of the message
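
The 64k limit on $gp-relative globals mentioned above can be illustrated with a little arithmetic. The following is only a sketch under stated assumptions: GP_BIAS, gp_displacement() and gp_reachable() are hypothetical names made up for illustration, not anything from the kernel sources, and the code just models a 16-bit signed displacement from a $gp that sits 32k above the base of the globals table.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: $gp = table base + 32k, as described above. */
#define GP_BIAS 0x8000L

/*
 * Hypothetical helper: displacement from $gp needed to reach the slot
 * located `off` bytes above the base of the globals table.
 */
static long gp_displacement(long off)
{
    assert(off >= 0);       /* offsets are measured upward from the base */
    return off - GP_BIAS;
}

/*
 * Hypothetical helper: returns 1 if that displacement fits in a 16-bit
 * signed field (-32768..32767), 0 otherwise.
 */
static int gp_reachable(long off)
{
    long disp = gp_displacement(off);
    return disp >= INT16_MIN && disp <= INT16_MAX;
}
```

With these definitions, gp_reachable(0) and gp_reachable(0xffff) both return 1, while gp_reachable(0x10000) returns 0: a slot just past the 64k mark cannot be addressed with a single 16-bit offset from $gp. Whether that is what is actually going wrong at XentMM is still a guess at this point.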