From owner-freebsd-sparc64@FreeBSD.ORG Tue Nov 6 07:38:43 2007
Message-ID: <473019E8.3070203@cs.rice.edu>
Date: Tue, 06 Nov 2007 01:38:16 -0600
From: Alan Cox
To: Kris Kennaway
Cc: alc@FreeBSD.org, freebsd-sparc64@FreeBSD.org, John Baldwin, Marius Strobl
Subject: Re: 7.0 broken on e4500
References: <46FEADFD.8020105@FreeBSD.org>
 <20071003132944.GA17342@alchemy.franken.de>
 <200710060222.31023.jhb@freebsd.org>
 <20071006132620.GF24840@alchemy.franken.de>
 <472DFC18.3080000@FreeBSD.org>
 <472E4573.3090708@FreeBSD.org>
 <20071104224618.GD36824@alchemy.franken.de>
 <472E54D0.8070807@FreeBSD.org>
In-Reply-To: <472E54D0.8070807@FreeBSD.org>
List-Id: Porting FreeBSD to the Sparc

Kris Kennaway wrote:

> Marius Strobl wrote:
>
>> On Sun, Nov 04, 2007 at 11:19:31PM +0100, Kris Kennaway wrote:
>>
>>> Kris Kennaway wrote:
>>>
>>>> Marius Strobl wrote:
>>>>
>>>>> On Sat, Oct 06, 2007 at 02:22:30AM -0400, John Baldwin wrote:
>>>>>
>>>>>> On Wednesday 03 October 2007 09:29:44 am Marius Strobl wrote:
>>>>>>
>>>>>>> On Sat, Sep 29, 2007 at 09:56:45PM +0200, Kris Kennaway wrote:
>>>>>>>
>>>>>>>> I get this early during boot with a CVS kernel (updated from last
>>>>>>>> December):
>>>>>>>>
>>>>>>>>> FreeBSD/SMP: Multiprocessor System Detected: 10 CPUs
>>>>>>>>> panic: tsb_tte_enter: replacing valid kernel mapping
>>>>>>>>> cpuid = 0
>>>>>>>>> KDB: enter: panic
>>>>>>>>> [thread pid 0 tid 0 ]
>>>>>>>>> Stopped at kdb_enter+0x68: ta %xcc, 1
>>>>>>>>> db> wh
>>>>>>>>> Tracing pid 0 tid 0 td 0xc0744f80
>>>>>>>>> panic() at panic+0x204
>>>>>>>>> tsb_tte_enter() at tsb_tte_enter+0xdc
>>>>>>>>> pmap_enter_locked() at pmap_enter_locked+0x2d0
>>>>>>>>> pmap_enter() at pmap_enter+0x64
>>>>>>>>> kmem_malloc() at kmem_malloc+0x6e0
>>>>>>>>> page_alloc() at page_alloc+0x28
>>>>>>>>> uma_large_malloc() at uma_large_malloc+0x44
>>>>>>>>> malloc() at malloc+0x1b0
>>>>>>>>> sf_buf_init() at sf_buf_init+0xf8
>>>>>>>>> mi_startup() at mi_startup+0x18c
>>>>>>>>> btext() at btext+0x34
>>>>>>>>
>>>>>>> Do you by chance load the new kernel manually via the loader
>>>>>>> prompt, with the old kernel being <= 8MB in size and the new
>>>>>>> one > 8MB?
>>>>>>
>>>>>> I get this panic on an E220R at work, but my "new" kernel is
>>>>>> smaller.
>>>>>>
>>>>> If the actual panic string is "vm_phys_paddr_to_vm_page: paddr
>>>>> is not in any segment" then that's the problem I had in mind when
>>>>> replying to Kris but unfortunately failed to describe the right
>>>>> way around.
>>>>>
>>>>>> > ll /boot/kernel/kernel* /boot/test/kernel*
>>>>>>
>>>>>> -r-xr-xr-x 1 root wheel 7821094 Feb 6 2007 /boot/kernel/kernel
>>>>>> -r-xr-xr-x 1 root wheel 13902501 Feb 6 2007 /boot/kernel/kernel.symbols
>>>>>> -r-xr-xr-x 1 root wheel 4534968 Oct 6 00:20 /boot/test/kernel
>>>>>> -r-xr-xr-x 1 root wheel 10101980 Oct 6 00:20 /boot/test/kernel.symbols
>>>>>>
>>>>>> The working kernel (~7MB) is the GENERIC kernel, and the "test"
>>>>>> kernel is the stripped down kernel for this machine. In my case I'm
>>>>>> panicking in pmap_remove_tte() called from pmap_enter_locked(). I
>>>>>> added some KTR traces to the pmap code to try and investigate,
>>>>>> but I'm guessing the root problem is that the loader doesn't
>>>>>> properly handle telling OFW about needing to change the mappings
>>>>>> when unloading and then loading a new kernel?
>>>>>>
>>>>>> Hmm, it looks like currently the loader doesn't do any sort of MD
>>>>>> callback when unloading a file, so the loader isn't going to free
>>>>>> up the RAM it asked for from OFW for the old kernel.
>>>>>>
>>>>> Correct, the immediate problem (which I had a patch for somewhere)
>>>>> is that in case the "old" kernel required more TLB slots to be used
>>>>> than the "new" one, one can't use the kernel end in order to determine
>>>>> how many slots are used for the kernel map. As you describe, the real
>>>>> problem lies within the loader though. The funny thing is that no
>>>>> arch except sparc64 and sun4v seems to rely on the kernel end
>>>>> provided by the loader.
>>>>> I've no idea what's the cause of the problem Kris is seeing though.
>>>>>
>>>>> Marius
>>>>>
>>>>>
>>>> FYI one of the e4500's is now booting again but another is still
>>>> failing with the same panic:
>>>>
>>>> FreeBSD 8.0-CURRENT #44: Mon Nov 5 01:52:42 JST 2007
>>>> root@e4500-2.allbsd.org:/usr/src/sys/sparc64/compile/E4500_2
>>>> real memory = 9663676416 (9216 MB)
>>>> avail memory = 9433554944 (8996 MB)
>>>> cpu0: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
>>>> cpu1: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
>>>> cpu2: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
>>>> cpu3: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
>>>> cpu4: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
>>>> cpu5: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
>>>> cpu6: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
>>>> cpu7: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
>>>> cpu8: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
>>>> cpu9: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
>>>> FreeBSD/SMP: Multiprocessor System Detected: 10 CPUs
>>>> panic: tsb_tte_enter: replacing valid kernel mapping
>>>> db> wh
>>>> Tracing pid 0 tid 0 td 0xc056ad30
>>>> panic() at panic+0x248
>>>> tsb_tte_enter() at tsb_tte_enter+0xdc
>>>> pmap_enter_locked() at pmap_enter_locked+0x318
>>>> pmap_enter() at pmap_enter+0x64
>>>> kmem_malloc() at kmem_malloc+0x644
>>>> page_alloc() at page_alloc+0x28
>>>> uma_large_malloc() at uma_large_malloc+0x44
>>>> malloc() at malloc+0x1a0
>>>> sf_buf_init() at sf_buf_init+0xe8
>>>> mi_startup() at mi_startup+0x1e8
>>>> btext() at btext+0x34
>>>>

Can anyone tell me more about the "vm_phys_paddr_to_vm_page: paddr is
not in any segment" panic?

Alan