From owner-freebsd-alpha Fri Oct 6 14:10:47 2000
Delivered-To: freebsd-alpha@freebsd.org
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
	by hub.freebsd.org (Postfix) with ESMTP id EC40237B502
	for ; Fri, 6 Oct 2000 14:10:37 -0700 (PDT)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30])
	by duke.cs.duke.edu (8.9.3/8.9.3) with ESMTP id RAA29425;
	Fri, 6 Oct 2000 17:10:37 -0400 (EDT)
Received: (from gallatin@localhost)
	by grasshopper.cs.duke.edu (8.11.0/8.9.1) id e96LAb658069;
	Fri, 6 Oct 2000 17:10:37 -0400 (EDT)
	(envelope-from gallatin@cs.duke.edu)
From: Andrew Gallatin
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Fri, 6 Oct 2000 17:10:37 -0400 (EDT)
To: Doug Rabson
Cc: freebsd-alpha@FreeBSD.ORG
Subject: Re: size problems with INVARIANTS/DIAGNOSTIC -current kernels
In-Reply-To:
References: <14812.37571.725840.45245@grasshopper.cs.duke.edu>
X-Mailer: VM 6.43 under 20.4 "Emerald" XEmacs Lucid
Message-ID: <14814.15695.767816.773180@grasshopper.cs.duke.edu>
Sender: owner-freebsd-alpha@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Doug Rabson writes:
> On Thu, 5 Oct 2000, Andrew Gallatin wrote:
>
> >
> > Doug Rabson writes:
> > > On Wed, 4 Oct 2000, Andrew Gallatin wrote:
> > >
> > > > > possibly something is wrong in the loader?
> > > > >
> > > >
> > > >
> > > > If so, the breakage has not happened recently. I'm seeing this with a
> > > > 'loader' from late August, a netboot from late August & a netboot
> > > > that's over 1 year old.
> > > >
> > > > Bear in mind that we seem to run just fine until the first time we
> > > > attempt to call a function from a stack created for us by the
> > > > palcode. However, that same function is callable when not running
> > > > in an interrupt/trap/etc palcode-created context.
> > > >
> > > > I've "proved" this to myself by making sure that trap() is actually
> > > > callable from the mainline kernel code (e.g., not running out of XentMM).
> > > > I put a call to trap() in kern_malloc() & I put a call to printtrap()
> > > > at the top of trap. I see trap being called from kern_malloc, but
> > > > when it's called by XentMM, random stuff happens.
> > >
> > > Bizarre. We are running on our own stack before mi_startup() is
> > > called, which should be before anything substantial is printed. I wonder
> > > if somehow the ksp value in the context has been corrupted.
> >
> > Any ideas on how to debug this further? I'm at my wits' end here..
> >
> > In case it's of any use, here's what the registers look like after one
> > of these wacky crashes:
>
> Clearly something bad is happening and we are taking a trap, then fielding
> it badly. You could try inserting 'call_pal halt' instructions in XentMM
> so that we can see what the state looks like on the first fault.

I altered XentMM thusly:

Index: /home/home1/gallatin/ithreads/sys/alpha/alpha/exception.s
===================================================================
RCS file: /home/ncvs/src/sys/alpha/alpha/exception.s,v
retrieving revision 1.3
diff -u -r1.3 exception.s
--- /home/home1/gallatin/ithreads/sys/alpha/alpha/exception.s 1999/08/28 00:38:26 1.3
+++ /home/home1/gallatin/ithreads/sys/alpha/alpha/exception.s 2000/10/06 20:52:41
@@ -87,7 +87,7 @@
  */
 
 	PALVECT(XentMM) /* setup frame, save registers */
-
+call_pal PAL_halt
 	/* a0, a1, & a2 already set up */
 	ldiq a3, ALPHA_KENTRY_MM
 	mov sp, a4 ; .loc 1 __LINE__

/ithreads data=0x3e6a78+0x32678 syms=[0x8+0x4ea98+0x8+0x39115]
Entering ithreads at 0xfffffc0000330320...
XentMM = 0xfffffc00005e1f2c
Memory cluster count: 3
MEMC 0: pfn 0x0 cnt 0xa5 usage 0x1
MEMC 1: pfn 0xa5 cnt 0x3ee6 usage 0x0
Cluster 1 contains kernel
Loading chunk after kernel: 0x3d3 / 0x3f8b
MEMC 2: pfn 0x3f8b cnt 0x75 usage 0x1
Unrecognized boot flag '"'.
Unrecognized boot flag '"'.
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California.
All rights reserved.
kmem_init: kmemusage = 0xfffffe0000296000

halted CPU 0

halt code = 5
HALT instruction executed
PC = fffffc00005e1f3c
>>>e -n 32 r0
gpr: 0 ( R0) 000000000000002A
gpr: 1 ( R1) FFFFFE0000296000
gpr: 2 ( R2) FFFFFC000066E005
gpr: 3 ( R3) 000000000000002A
gpr: 4 ( R4) 0000000000000001
gpr: 5 ( R5) FFFFFC0007F15FE8
gpr: 6 ( R6) FFFFFC00006B51F8
gpr: 7 ( R7) FFFFFC00006E7240
gpr: 8 ( R8) FFFFFC00007AA000
gpr: 9 ( R9) 0000000000001030
gpr: A ( R10) FFFFFC00006EEAC0
gpr: B ( R11) FFFFFC00006C29E8
gpr: C ( R12) FFFFFC00006CED90
gpr: D ( R13) 0000000000000000
gpr: E ( R14) 000000002004FA30
gpr: F ( R15) 0000000020035C80
gpr: 10 ( R16) FFFFFE0000296000
gpr: 11 ( R17) 0000000000000004
gpr: 12 ( R18) 0000000000000001
gpr: 13 ( R19) FFFFFC00007A9C68
gpr: 14 ( R20) FFFFFC00007A9D40
gpr: 15 ( R21) 0000000000000008
gpr: 16 ( R22) 000000000000001E
gpr: 17 ( R23) FFFFFC0000417524
gpr: 18 ( R24) 000000000000000F
gpr: 19 ( R25) 0000000000000010
gpr: 1A ( R26) FFFFFC00005E1F3C
gpr: 1B ( R27) FFFFFC0000416020
gpr: 1C ( R28) FFFFFC00007A9D70
gpr: 1D ( R29) FFFFFC00006D9008
gpr: 1E ( R30) FFFFFC00007A9C50
gpr: 1F ( R31) 0000000000000000
>>>e ksp
ipr: 11 ( KSP) FFFFFC00007A9C50
>>>e -v -n 32 FFFFFC00007A9C50
vmem: 7A9C50 000000000000002A
vmem: 7A9C58 FFFFFE0000296000
vmem: 7A9C60 FFFFFC000066E005
vmem: 7A9C68 000000000000002A
vmem: 7A9C70 0000000000000001
vmem: 7A9C78 FFFFFC0007F15FE8
vmem: 7A9C80 FFFFFC00006B51F8
vmem: 7A9C88 FFFFFC00006E7240
vmem: 7A9C90 FFFFFC00007AA000
vmem: 7A9C98 0000000000001030
vmem: 7A9CA0 FFFFFC00006EEAC0
vmem: 7A9CA8 FFFFFC00006C29E8
vmem: 7A9CB0 FFFFFC00006CED90
vmem: 7A9CB8 0000000000000000
vmem: 7A9CC0 000000002004FA30
vmem: 7A9CC8 0000000020035C80
vmem: 7A9CD0 FFFFFC00007A9C68
vmem: 7A9CD8 FFFFFC00007A9D40
vmem: 7A9CE0 0000000000000008
vmem: 7A9CE8 000000000000001E
vmem: 7A9CF0 FFFFFC0000417524
vmem: 7A9CF8 000000000000000F
vmem: 7A9D00 0000000000000010
vmem: 7A9D08 FFFFFC00003F2FD0
vmem: 7A9D10 FFFFFC0000416020
vmem: 7A9D18 0000000000000823
vmem: 7A9D20 FFFFFC00007A9D70
vmem: 7A9D28 0000000000000000
vmem: 7A9D30 0000000000000001
vmem: 7A9D38 FFFFFC00007A9C18
vmem: 7A9D40 0000000000000006
vmem: 7A9D48 FFFFFC00003F2FDC
vmem: 7A9D50 FFFFFC00006D9008
vmem: 7A9D58 0000000000000006
vmem: 7A9D60 00000000000003F9
vmem: 7A9D68 0000000000000000
vmem: 7A9D70 FFFFFC00003DFE34
vmem: 7A9D78 FFFFFC00006CEDC0
vmem: 7A9D80 FFFFFC00006E6B08
vmem: 7A9D88 FFFFFC00006AAFF0
vmem: 7A9D90 FFFFFC000033038C
vmem: 7A9D98 00000000000003D5
vmem: 7A9DA0 0000000000000001
vmem: 7A9DA8 00000000200505E0
vmem: 7A9DB0 0000000020050780
vmem: 7A9DB8 0000000000000000
vmem: 7A9DC0 0000000000000000
vmem: 7A9DC8 0000000000000000
vmem: 7A9DD0 0000000000000000
vmem: 7A9DD8 0000000000000000
vmem: 7A9DE0 0000000000000000

The faulting address is 0xfffffc00003f2fdc
(*(int *)kmemusage = 0x0;) /* code added to fault immediately after allocating kmemusage */

kmemusage is at 0xfffffe0000296000

Awaiting further instructions... ;)

Drew
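
For anyone cross-checking the console output above by hand, here is a minimal Python sketch of the arithmetic. It only uses values that appear in the dump; the helper names parse_gprs and halt_offset are made up for illustration, and the regular expression assumes the "gpr: N ( Rn) VALUE" layout shown above. It checks how far the halt PC is from the printed XentMM address and whether R30 (the stack pointer) matches the KSP value the console reports.

```python
import re

# Values copied from the console output in this message.
XENTMM_ADDR = 0xFFFFFC00005E1F2C  # "XentMM = 0xfffffc00005e1f2c"
HALT_PC = 0xFFFFFC00005E1F3C      # "PC = fffffc00005e1f3c"
KSP = 0xFFFFFC00007A9C50          # result of ">>>e ksp"

# Three lines taken verbatim from the ">>>e -n 32 r0" output.
SAMPLE_DUMP = """\
gpr: 0 ( R0) 000000000000002A
gpr: 1A ( R26) FFFFFC00005E1F3C
gpr: 1E ( R30) FFFFFC00007A9C50
"""

# Assumed line layout: "gpr: <hex index> ( R<n>) <16 hex digits>".
LINE_RE = re.compile(r"^gpr:\s+[0-9A-F]+\s+\(\s*(R\d+)\)\s+([0-9A-F]{16})$")


def parse_gprs(text):
    """Return a {register name: integer value} dict for each gpr line."""
    regs = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            regs[match.group(1)] = int(match.group(2), 16)
    return regs


def halt_offset(halt_pc, entry):
    """Byte distance from the entry point to the reported halt PC."""
    return halt_pc - entry


if __name__ == "__main__":
    regs = parse_gprs(SAMPLE_DUMP)
    offset = halt_offset(HALT_PC, XENTMM_ADDR)
    # Alpha instructions are a fixed 4 bytes, so offset // 4 is a count
    # of instruction slots past the printed XentMM address.
    print("halt PC offset from XentMM:", hex(offset), "=", offset // 4, "instructions")
    print("R30 equals KSP:", regs["R30"] == KSP)
    print("R26 equals halt PC:", regs["R26"] == HALT_PC)
```

With the numbers above, the halt PC is 0x10 bytes past the printed XentMM address, and R30 and KSP agree, so the halt was taken on the stack the console reports as the kernel stack. The sketch draws no conclusion beyond that.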