From owner-freebsd-fs@FreeBSD.ORG Tue Oct 15 17:41:36 2013
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	(using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTP id 8C72C4C5;
	Tue, 15 Oct 2013 17:41:36 +0000 (UTC)
	(envelope-from alc@rice.edu)
Received: from pp1.rice.edu (proofpoint1.mail.rice.edu [128.42.201.100])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mx1.freebsd.org (Postfix) with ESMTPS id 4DF2D29E8;
	Tue, 15 Oct 2013 17:41:35 +0000 (UTC)
Received: from pps.filterd (pp1.rice.edu [127.0.0.1])
	by pp1.rice.edu (8.14.5/8.14.5) with SMTP id r9FH7qR8001595;
	Tue, 15 Oct 2013 12:12:37 -0500
Received: from mh11.mail.rice.edu (mh11.mail.rice.edu [128.42.199.30])
	by pp1.rice.edu with ESMTP id 1fe4ma9v4c-1;
	Tue, 15 Oct 2013 12:12:37 -0500
X-Virus-Scanned: by amavis-2.7.0 at mh11.mail.rice.edu, auth channel
Received: from adsl-216-63-78-18.dsl.hstntx.swbell.net (108-254-203-201.lightspeed.hstntx.sbcglobal.net [108.254.203.201])
	(using TLSv1 with cipher RC4-MD5 (128/128 bits))
	(No client certificate requested)
	(Authenticated sender: alc)
	by mh11.mail.rice.edu (Postfix) with ESMTPSA id B7F064C01C0;
	Tue, 15 Oct 2013 12:12:36 -0500 (CDT)
Message-ID: <525D7784.5000808@rice.edu>
Date: Tue, 15 Oct 2013 12:12:36 -0500
From: Alan Cox
User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:24.0) Gecko/20100101 Thunderbird/24.0
MIME-Version: 1.0
To: Konstantin Belousov , J David
Subject: Re: 9.2 + ZFS + i386 = panic: pmap_enter: attempted pmap_enter on 4MB page
References: <20131015164537.GH3865@kib.kiev.ua>
In-Reply-To: <20131015164537.GH3865@kib.kiev.ua>
X-Enigmail-Version: 1.5.2
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 kscore.is_bulkscore=0 kscore.compositescore=0
	circleOfTrustscore=0 compositescore=0.600226699279213 urlsuspect_oldscore=0.000226699279213129 suspectscore=38 recipient_domain_to_sender_totalscore=0 phishscore=0 bulkscore=0 kscore.is_spamscore=0 recipient_to_sender_totalscore=0 recipient_domain_to_sender_domain_totalscore=448 rbsscore=0.600226699279213 spamscore=0 recipient_to_sender_domain_totalscore=0 urlsuspectscore=0.9 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=7.0.1-1305240000 definitions=main-1310150077
Cc: "freebsd-fs@freebsd.org" , alc@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 15 Oct 2013 17:41:36 -0000

On 10/15/2013 11:45, Konstantin Belousov wrote:
> On Tue, Oct 15, 2013 at 11:17:53AM -0400, J David wrote:
>> What are the necessary loader.conf / kernel config invocations to make
>> ZFS stable on 9.2-RELEASE i386 node with 3GB RAM?
>>
>> This machine was rock solid under 8.4, but since upgrading to 9.2 it
>> has been a disaster. It crashes every few hours with "panic:
>> pmap_enter: attempted pmap_enter on 4MB page."
>>
>> Here are a couple of stack traces:
>>
>> panic: pmap_enter: attempted pmap_enter on 4MB page
>> cpuid = 0
>> KDB: stack backtrace:
>> #0 0xc0afc092 at kdb_backtrace+0x52
>> #1 0xc0ac249c at panic+0x1bc
>> #2 0xc0f1825d at pmap_enter+0x63d
>> #3 0xc0d34ee5 at vm_fault_hold+0x1c45
>> #4 0xc0d33262 at vm_fault+0x82
>> #5 0xc0f1f5d6 at trap_pfault+0x186
>> #6 0xc0f1ed4b at trap+0x51b
>> #7 0xc0f0887c at calltrap+0x6
>> #8 0xc16770b6 at zio_execute+0x116
>> #9 0xc15d9460 at taskq_run_safe+0x10
>> #10 0xc0b08f26 at taskqueue_run_locked+0xe6
>> #11 0xc0b097f7 at taskqueue_thread_loop+0xb7
>> #12 0xc0a90a73 at fork_exit+0xa3
>> #13 0xc0f08924 at fork_trampoline+0x8
>>
>> panic: pmap_enter: attempted pmap_enter on 4MB page
>> cpuid = 1
>> KDB: stack backtrace:
>> #0 0xc0afc092 at kdb_backtrace+0x52
>> #1 0xc0ac249c at panic+0x1bc
>> #2 0xc0f1825d at pmap_enter+0x63d
>> #3 0xc0d34ee5 at vm_fault_hold+0x1c45
>> #4 0xc0d33262 at vm_fault+0x82
>> #5 0xc0f1f5d6 at trap_pfault+0x186
>> #6 0xc0f1ed4b at trap+0x51b
>> #7 0xc0f0887c at calltrap+0x6
>> #8 0xc1679bdf at zio_dva_allocate+0x9f
>> #9 0xc16770b6 at zio_execute+0x116
>> #10 0xc15d9460 at taskq_run_safe+0x10
>> #11 0xc0b08f26 at taskqueue_run_locked+0xe6
>> #12 0xc0b097f7 at taskqueue_thread_loop+0xb7
>> #13 0xc0a90a73 at fork_exit+0xa3
>> #14 0xc0f08924 at fork_trampoline+0x8
>>
>> For the last crash, top was running and this is what was on the screen
>> when it died:
>>
>> CPU: 0.6% user, 0.0% nice, 2.2% system, 0.2% interrupt, 97.1% idle
>> Mem: 442M Active, 91M Inact, 79M Wired, 3764K Cache, 4960K Buf, 2380M Free
>> ARC: 40M Total, 8520K MFU, 30M MRU, 304K Anon, 494K Header, 1062K Other
>> Swap: 4096M Total, 4096M Free
>> Write failed: Broken pipe
>>
>> It doesn't seem to matter what KVA_PAGES, vm.kmem_size,
>> vfs.zfs.arc_max or vfs.zfs.vdev.cache.size is set to, and the ZFS
>> tuning guides in the wiki (albeit appearing to be dating to the 7.x
>> era) provides guidelines for tuning down to 768M. So 3GB should be
>> enough for a machine that is 99% idle. (Particularly given that it
>> dies with >2GB free.)
>>
>> This seems to be specific to i386, this problem hasn't cropped up on
>> any amd64 nodes. No compression, deduplication, snapshots or anything
>> like that is in use.
> It seems that i386 pmap_enter() cannot deal with the superpages, while
> amd64 can. On the other hand, I am not quite sure that this is your
> problem, because you get traps on accessing KVA, and I suspect that
> ZFS uses wired memory. Obtain the kernel dump and do full backtrace
> for the panic.

Try MFCing r253949. I'm not confident that r253949 is the fix for this
particular problem, but it should be MFCed.

> Anyway, I believe that i386 pmap_enter() should do a demotion when needed.

A kernel space pmap_enter should never be overwriting an existing
superpage mapping.

> Below is the prototyped change for this.
>
> diff --git a/sys/i386/i386/pmap.c b/sys/i386/i386/pmap.c
> index 64bf1a3..91453da 100644
> --- a/sys/i386/i386/pmap.c
> +++ b/sys/i386/i386/pmap.c
> @@ -3473,17 +3473,21 @@ pmap_enter(pmap_t pmap, vm_offset_t va, vm_prot_t access, vm_page_t m,
>  	PMAP_LOCK(pmap);
>  	sched_pin();
>
> -	/*
> -	 * In the case that a page table page is not
> -	 * resident, we are creating it here.
> -	 */
> -	if (va < VM_MAXUSER_ADDRESS) {
> +	pde = pmap_pde(pmap, va);
> +	if ((*pde & PG_PS) != 0) {
> +		/* PG_V is asserted by pmap_demote_pde */
> +		pmap_demote_pde(pmap, pde, va);
> +		if (va < VM_MAXUSER_ADDRESS) {
> +			mpte = PHYS_TO_VM_PAGE(*pde & PG_FRAME);
> +			mpte->wire_count++;
> +		}
> +	} else if (va < VM_MAXUSER_ADDRESS) {
> +		/*
> +		 * In the case that a page table page is not resident,
> +		 * we are creating it here.
> +		 */
>  		mpte = pmap_allocpte(pmap, va, M_WAITOK);
>  	}
> -
> -	pde = pmap_pde(pmap, va);
> -	if ((*pde & PG_PS) != 0)
> -		panic("pmap_enter: attempted pmap_enter on 4MB page");
>  	pte = pmap_pte_quick(pmap, va);
>
>  	/*