Date: Sat, 6 Oct 2007 15:26:20 +0200
From: Marius Strobl
To: John Baldwin
Message-ID: <20071006132620.GF24840@alchemy.franken.de>
Cc: freebsd-sparc64@freebsd.org
Subject: Re: 7.0 broken on e4500

On Sat, Oct 06, 2007 at 02:22:30AM -0400, John Baldwin wrote:
> On Wednesday 03 October 2007 09:29:44 am Marius Strobl wrote:
> > On Sat, Sep 29, 2007 at 09:56:45PM +0200, Kris Kennaway wrote:
> > > I get this early during boot with a CVS kernel (updated from last
> December):
> > >
> > > > FreeBSD/SMP: Multiprocessor System Detected:
10 CPUs
> > > > panic: tsb_tte_enter: replacing valid kernel mapping
> > > > cpuid = 0
> > > > KDB: enter: panic
> > > > [thread pid 0 tid 0 ]
> > > > Stopped at kdb_enter+0x68: ta %xcc, 1
> > > > db> wh
> > > > Tracing pid 0 tid 0 td 0xc0744f80
> > > > panic() at panic+0x204
> > > > tsb_tte_enter() at tsb_tte_enter+0xdc
> > > > pmap_enter_locked() at pmap_enter_locked+0x2d0
> > > > pmap_enter() at pmap_enter+0x64
> > > > kmem_malloc() at kmem_malloc+0x6e0
> > > > page_alloc() at page_alloc+0x28
> > > > uma_large_malloc() at uma_large_malloc+0x44
> > > > malloc() at malloc+0x1b0
> > > > sf_buf_init() at sf_buf_init+0xf8
> > > > mi_startup() at mi_startup+0x18c
> > > > btext() at btext+0x34
> > >
> >
> > Do you by chance load the new kernel manually via the loader
> > prompt, with the old kernel being <= 8MB in size and the new
> > one > 8MB?
>
> I get this panic on an E220R at work, but my "new" kernel is smaller.
>

If the actual panic string is "vm_phys_paddr_to_vm_page: paddr is
not in any segment", then that's the problem I had in mind when
replying to Kris but unfortunately failed to describe the right
way around.

> > ll /boot/kernel/kernel* /boot/test/kernel*
> -r-xr-xr-x 1 root wheel  7821094 Feb 6  2007 /boot/kernel/kernel
> -r-xr-xr-x 1 root wheel 13902501 Feb 6  2007 /boot/kernel/kernel.symbols
> -r-xr-xr-x 1 root wheel  4534968 Oct 6 00:20 /boot/test/kernel
> -r-xr-xr-x 1 root wheel 10101980 Oct 6 00:20 /boot/test/kernel.symbols
>
> The working kernel (~7MB) is the GENERIC kernel, and the "test" kernel
> is the stripped down kernel for this machine. In my case I'm panicing in
> pmap_remove_tte() called from pmap_enter_locked(). I added some KTR traces
> to the pmap code to try and investigate, but I'm guessing the root problem is
> that the loader doesn't properly handle telling OFW about needing to change
> the mappings when unloading and then loading a new kernel?
>
> Hmm, it looks like currently the loader doesn't do any sort of MD callback
> when unloading a file, so the loader isn't going to free up the RAM it asked
> for from OFW for the old kernel.
>

Correct, the immediate problem (which I had a patch for somewhere)
is that if the "old" kernel required more TLB slots than the "new"
one, one can't use the kernel end to determine how many slots are
used for the kernel map. As you describe, the real problem lies
within the loader, though. The funny thing is that no arch except
sparc64 and sun4v seems to rely on the kernel end provided by the
loader.
I've no idea what the cause of the problem Kris is seeing is, though.

Marius