Date:      Thu, 28 Aug 2003 15:41:28 -0400
From:      Mike Tancsa <mike@sentex.net>
To:        Tor Egge <Tor.Egge@cvsup.no.freebsd.org>, silby@silby.com
Cc:        stable@freebsd.org
Subject:   Re: Ok, are all the panics fixed now?
Message-ID:  <5.2.0.9.0.20030828153925.03f3ed80@209.112.4.2>
In-Reply-To: <20030828.161240.74667710.Tor.Egge@cvsup.no.freebsd.org>
References:  <20030827133126.D4269@odysseus.silby.com> <20030827122327.GA17847@falcon.midgard.homeip.net> <20030827115045.X4269@odysseus.silby.com> <20030827133126.D4269@odysseus.silby.com>

Hi,
         Forgive the naive question, but is this an easy issue to solve?
Or is it just scratching the surface of a larger, more complex problem that
cannot easily be resolved by a simple patch or two?

         ---Mike

At 04:12 PM 28/08/2003 +0000, Tor Egge wrote:

> > Ok, I booted with hw.physmem="16M", and a buildworld panicked with
> > "free/cache page dirty".  (I was taking a nap at the time the buildworld
> > crashed, we'll talk about that soon.)
>
>While one buildworld succeeded, another one crashed with
>"panic: vm_page_dirty: page in cache", and a third one crashed with
>a corrupt entry at the end of the vm_page_buckets[] array.
>
>I added some instrumentation to detect unexpected sleeps causing the problem.
>The problem still occurred without any such sleeps being detected, indicating
>that the previous fix was incorrect.
>
>I then added code to 'freeze' PMAP1 within critical sections.
>
>panic: PMAP1 frozen
>mp_lock = 01000002; cpuid = 1; lapic.id = 00000000^
>Debugger("panic")
>Stopped at      Debugger+0x34:  movb    $0,in_Debugger.630
>db> trace
>Debugger(c02f95aa) at Debugger+0x34
>panic(c03266cf,c08a2a18,c4395894,6e00cc64,cb426a78) at panic+0xa4
>pmap_pte(cb421cac,8098000) at pmap_pte+0xa0
>pmap_clear_modify(c0886bf4) at pmap_clear_modify+0x32
>swp_pager_async_iodone(c4395894,c2d55400,c2d3b000,c2d55400,c2d3b0a0) at swp_pager_async_iodone+0x176
>biodone(c4395894,c2d3b014,c4395894) at biodone+0xf5
>dadone(c2d40a00,c2d55400,4d405,c087c514,8060000) at dadone+0x265
>camisr(c035a770,cb426e6c,c02b75cb,0,cb420018) at camisr+0x1eb
>swi_cambio(0,cb420018,cb420010,c02c0010,8060000) at swi_cambio+0xd
>doreti_swi(cb4209ec,cb421fac,8048000,40000,8048000) at doreti_swi+0xf
>vm_map_copy_entry(cb421f40,cb420980,c0364d70,cca2bcc0) at vm_map_copy_entry+0xdf
>vmspace_fork(cb421f40,cb41b6c0,cb41b6c0,0,cb426f08) at vmspace_fork+0x1db
>vm_fork(cb41ee00,cb41b6c0,14) at vm_fork+0x8e
>fork1(cb41ee00,14,cb426f20,0,246) at fork1+0x7a7
>fork(cb41ee00,cb426f80,80aa3c0,80aa3c0,0) at fork+0x16
>syscall2(2f,2f,2f,0,80aa3c0) at syscall2+0x221
>Xint0x80_syscall() at Xint0x80_syscall+0x2b
>
>
>One stack level is missing: pmap_copy() is called from vm_map_copy_entry().
>
>The main problem here is that we've got code running without splvm()
>protection that depends on PMAP1 being unchanged at the same time as we
>have interrupts changing PMAP1.
>
>Without the PAE MFC patch, the interrupt would change PMAP1 while the
>functions accessing page tables without splvm() protection would use the
>per-vmspace APTpde/APTmap.
>
>- Tor Egge
>_______________________________________________
>freebsd-stable@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
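
For anyone following along, here is a rough user-space sketch of the kind of
"freeze" check Tor describes. It is not the actual kernel instrumentation:
slot_target, slot_freeze(), slot_map(), fake_interrupt() and
unprotected_reader() are all hypothetical names made up for illustration,
and a single global variable stands in for a shared mapping slot such as
PMAP1. The idea is only that code relying on the slot marks it frozen, and
any attempt to retarget it while frozen is flagged (the kernel would panic
where this sketch just counts a violation).

```c
#include <assert.h>

/* Hypothetical stand-in for one shared mapping slot (think PMAP1). */
static unsigned long slot_target;  /* what the slot currently maps */
static int slot_frozen;            /* > 0 while some code relies on the slot */
static int violations;             /* retarget attempts seen while frozen */

/* Mark the slot as "must not change" for the duration of a critical section. */
static void slot_freeze(void)
{
    slot_frozen++;
}

static void slot_unfreeze(void)
{
    assert(slot_frozen > 0);
    slot_frozen--;
}

/*
 * Point the slot at a new target. Returns 0 on success, or -1 if the slot
 * is frozen and the target would change. A kernel would panic here; this
 * sketch only records the violation and leaves the slot untouched.
 */
static int slot_map(unsigned long target)
{
    if (slot_frozen > 0 && target != slot_target) {
        violations++;
        return -1;
    }
    slot_target = target;
    return 0;
}

/* Simulated interrupt path (e.g. I/O completion) that also wants the slot. */
static void fake_interrupt(void)
{
    (void)slot_map(0x2000UL);
}

/*
 * Simulated code path that uses the slot without blocking interrupts.
 * If interrupt_fires is nonzero, the "interrupt" runs in the middle of
 * the section that depends on the slot staying put.
 */
static unsigned long unprotected_reader(int interrupt_fires)
{
    unsigned long seen;

    (void)slot_map(0x1000UL);
    slot_freeze();
    if (interrupt_fires)
        fake_interrupt();
    seen = slot_target;  /* the value the reader depends on */
    slot_unfreeze();
    return seen;
}
```

Calling unprotected_reader(1) records one violation, which corresponds to
the point where the trace above shows pmap_pte() hitting "panic: PMAP1
frozen" from interrupt context. Without the freeze check the interrupt
would silently retarget the slot under the reader.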


