Date:      Tue, 15 Oct 2013 12:12:36 -0500
From:      Alan Cox <alc@rice.edu>
To:        Konstantin Belousov <kostikbel@gmail.com>, J David <j.david.lists@gmail.com>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>, alc@freebsd.org
Subject:   Re: 9.2 + ZFS + i386 = panic: pmap_enter: attempted pmap_enter on 4MB page
Message-ID:  <525D7784.5000808@rice.edu>
In-Reply-To: <20131015164537.GH3865@kib.kiev.ua>
References:  <CABXB=RQd_yT%2BsEA0qBnyCK-3ZsvxfSiSCcMVjnHoSyCB%2BZ5%2BRg@mail.gmail.com> <20131015164537.GH3865@kib.kiev.ua>

On 10/15/2013 11:45, Konstantin Belousov wrote:
> On Tue, Oct 15, 2013 at 11:17:53AM -0400, J David wrote:
>> What are the necessary loader.conf / kernel config invocations to make
>> ZFS stable on 9.2-RELEASE i386 node with 3GB RAM?
>>
>> This machine was rock solid under 8.4, but since upgrading to 9.2 it
>> has been a disaster.  It crashes every few hours with "panic:
>> pmap_enter: attempted pmap_enter on 4MB page."
>>
>> Here are a couple of stack traces:
>>
>> panic: pmap_enter: attempted pmap_enter on 4MB page
>> cpuid = 0
>> KDB: stack backtrace:
>> #0 0xc0afc092 at kdb_backtrace+0x52
>> #1 0xc0ac249c at panic+0x1bc
>> #2 0xc0f1825d at pmap_enter+0x63d
>> #3 0xc0d34ee5 at vm_fault_hold+0x1c45
>> #4 0xc0d33262 at vm_fault+0x82
>> #5 0xc0f1f5d6 at trap_pfault+0x186
>> #6 0xc0f1ed4b at trap+0x51b
>> #7 0xc0f0887c at calltrap+0x6
>> #8 0xc16770b6 at zio_execute+0x116
>> #9 0xc15d9460 at taskq_run_safe+0x10
>> #10 0xc0b08f26 at taskqueue_run_locked+0xe6
>> #11 0xc0b097f7 at taskqueue_thread_loop+0xb7
>> #12 0xc0a90a73 at fork_exit+0xa3
>> #13 0xc0f08924 at fork_trampoline+0x8
>>
>> panic: pmap_enter: attempted pmap_enter on 4MB page
>> cpuid = 1
>> KDB: stack backtrace:
>> #0 0xc0afc092 at kdb_backtrace+0x52
>> #1 0xc0ac249c at panic+0x1bc
>> #2 0xc0f1825d at pmap_enter+0x63d
>> #3 0xc0d34ee5 at vm_fault_hold+0x1c45
>> #4 0xc0d33262 at vm_fault+0x82
>> #5 0xc0f1f5d6 at trap_pfault+0x186
>> #6 0xc0f1ed4b at trap+0x51b
>> #7 0xc0f0887c at calltrap+0x6
>> #8 0xc1679bdf at zio_dva_allocate+0x9f
>> #9 0xc16770b6 at zio_execute+0x116
>> #10 0xc15d9460 at taskq_run_safe+0x10
>> #11 0xc0b08f26 at taskqueue_run_locked+0xe6
>> #12 0xc0b097f7 at taskqueue_thread_loop+0xb7
>> #13 0xc0a90a73 at fork_exit+0xa3
>> #14 0xc0f08924 at fork_trampoline+0x8
>>
>> For the last crash, top was running and this is what was on the screen
>> when it died:
>>
>> CPU:  0.6% user,  0.0% nice,  2.2% system,  0.2% interrupt, 97.1% idle
>> Mem: 442M Active, 91M Inact, 79M Wired, 3764K Cache, 4960K Buf, 2380M Free
>> ARC: 40M Total, 8520K MFU, 30M MRU, 304K Anon, 494K Header, 1062K Other
>> Swap: 4096M Total, 4096M Free
>> Write failed: Broken pipe
>>
>> It doesn't seem to matter what KVA_PAGES, vm.kmem_size,
>> vfs.zfs.arc_max or vfs.zfs.vdev.cache.size is set to, and the ZFS
>> tuning guides in the wiki (which appear to date from the 7.x era)
>> provide guidelines for tuning down to 768M.  So 3GB should be
>> enough for a machine that is 99% idle.  (Particularly given that it
>> dies with >2GB free.)
>>
>> This seems to be specific to i386; this problem hasn't cropped up on
>> any amd64 nodes.  No compression, deduplication, snapshots or anything
>> like that is in use.
> It seems that i386 pmap_enter() cannot deal with superpages, while
> amd64 can.  On the other hand, I am not quite sure that this is your
> problem, because you get traps on accessing KVA, and I suspect that
> ZFS uses wired memory.  Obtain a kernel dump and do a full backtrace
> for the panic.

Try MFCing r253949.  I'm not confident that r253949 is the fix for this
particular problem, but it should be MFCed.

> Anyway, I believe that i386 pmap_enter() should do a demotion when needed.

A kernel-space pmap_enter() should never overwrite an existing
superpage mapping.
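
To make the condition concrete, here is a minimal user-space sketch of
the check that trips the panic.  It is not the pmap.c code: the helper
names (pde_index, pde_is_superpage, enter_would_hit_superpage) are
hypothetical, and it assumes the non-PAE i386 layout, where each page
directory entry covers 4MB and PG_PS marks a superpage.

```c
#include <stdbool.h>
#include <stdint.h>

/* i386 (non-PAE) page directory entry bits used by this sketch. */
#define PG_V     0x001u  /* entry is valid */
#define PG_PS    0x080u  /* entry maps a 4MB superpage, not a page table */
#define PDRSHIFT 22      /* each PDE covers 1 << 22 bytes (4MB) of VA */

typedef uint32_t pd_entry_t;

/* Hypothetical helper: index of the PDE that covers va. */
static unsigned
pde_index(uint32_t va)
{
	return (va >> PDRSHIFT);
}

/* Hypothetical helper: true if pde is a valid 4MB superpage mapping. */
static bool
pde_is_superpage(pd_entry_t pde)
{
	return ((pde & (PG_V | PG_PS)) == (PG_V | PG_PS));
}

/*
 * Hypothetical helper: would entering a 4KB mapping at va land inside
 * an existing superpage?  This is the situation the panic reports.
 */
static bool
enter_would_hit_superpage(const pd_entry_t *pdir, uint32_t va)
{
	return (pde_is_superpage(pdir[pde_index(va)]));
}
```

In this model, any 4KB pmap_enter whose va shares a PDE with a
superpage collides with it, which is why the kernel either has to
avoid the situation entirely or demote the superpage first.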

> Below is the prototyped change for this.
>
> diff --git a/sys/i386/i386/pmap.c b/sys/i386/i386/pmap.c
> index 64bf1a3..91453da 100644
> --- a/sys/i386/i386/pmap.c
> +++ b/sys/i386/i386/pmap.c
> @@ -3473,17 +3473,21 @@ pmap_enter(pmap_t pmap, vm_offset_t va, vm_prot_t access, vm_page_t m,
>  	PMAP_LOCK(pmap);
>  	sched_pin();
>  
> -	/*
> -	 * In the case that a page table page is not
> -	 * resident, we are creating it here.
> -	 */
> -	if (va < VM_MAXUSER_ADDRESS) {
> +	pde = pmap_pde(pmap, va);
> +	if ((*pde & PG_PS) != 0) {
> +		/* PG_V is asserted by pmap_demote_pde */
> +		pmap_demote_pde(pmap, pde, va);
> +		if (va < VM_MAXUSER_ADDRESS) {
> +			mpte = PHYS_TO_VM_PAGE(*pde & PG_FRAME);
> +			mpte->wire_count++;
> +		}
> +	} else if (va < VM_MAXUSER_ADDRESS) {
> +		/*
> +		 * In the case that a page table page is not resident,
> +		 * we are creating it here.
> +		 */
>  		mpte = pmap_allocpte(pmap, va, M_WAITOK);
>  	}
> -
> -	pde = pmap_pde(pmap, va);
> -	if ((*pde & PG_PS) != 0)
> -		panic("pmap_enter: attempted pmap_enter on 4MB page");
>  	pte = pmap_pte_quick(pmap, va);
>  
>  	/*



