From owner-freebsd-arm@FreeBSD.ORG Sat Jun 7 18:18:09 2014
Return-Path:
Delivered-To: freebsd-arm@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 01DA488C;
 Sat, 7 Jun 2014 18:18:09 +0000 (UTC)
Received: from pp1.rice.edu (proofpoint1.mail.rice.edu [128.42.201.100])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id B08CF2492;
 Sat, 7 Jun 2014 18:18:07 +0000 (UTC)
Received: from pps.filterd (pp1.rice.edu [127.0.0.1])
 by pp1.rice.edu (8.14.5/8.14.5) with SMTP id s57IH49Y019447;
 Sat, 7 Jun 2014 13:18:00 -0500
Received: from mh2.mail.rice.edu (mh2.mail.rice.edu [128.42.201.21])
 by pp1.rice.edu with ESMTP id 1m6965uvx9-1;
 Sat, 07 Jun 2014 13:18:00 -0500
X-Virus-Scanned: by amavis-2.7.0 at mh2.mail.rice.edu, auth channel
Received: from 108-254-203-201.lightspeed.hstntx.sbcglobal.net
 (108-254-203-201.lightspeed.hstntx.sbcglobal.net [108.254.203.201])
 (using TLSv1 with cipher RC4-MD5 (128/128 bits))
 (No client certificate requested)
 (Authenticated sender: alc)
 by mh2.mail.rice.edu (Postfix) with ESMTPSA id 1BB2A500169;
 Sat, 7 Jun 2014 13:18:00 -0500 (CDT)
Message-ID: <53935755.70908@rice.edu>
Date: Sat, 07 Jun 2014 13:17:57 -0500
From: Alan Cox
User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
MIME-Version: 1.0
To: Olivier Houchard, Adrian Chadd, "freebsd-arm@freebsd.org", alc@freebsd.org, kib@freebsd.org
Subject: Re: svn commit: r266850 - in head/sys/arm/xscale: i80321 i8134x ixp425 pxa
References: <201405291656.s4TGudoD002868@svn.freebsd.org>
 <20140529171641.GA5246@ci0.org>
 <20140529173803.GA5294@ci0.org>
 <20140530063228.GD43976@funkthat.com>
 <5388ABF1.3030200@rice.edu>
 <20140601081153.GU43976@funkthat.com>
In-Reply-To: <20140601081153.GU43976@funkthat.com>
X-Enigmail-Version: 1.6
Content-Type: multipart/mixed; boundary="------------070407040005080007020500"
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0
 kscore.is_bulkscore=0 kscore.compositescore=0 circleOfTrustscore=0
 compositescore=0.248919945447816 urlsuspect_oldscore=0.248919945447816
 suspectscore=10 recipient_domain_to_sender_totalscore=0 phishscore=0
 bulkscore=0 kscore.is_spamscore=0 recipient_to_sender_totalscore=0
 recipient_domain_to_sender_domain_totalscore=0 rbsscore=0.248919945447816
 spamscore=0 recipient_to_sender_domain_totalscore=0 urlsuspectscore=0.9
 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=7.0.1-1402240000 definitions=main-1406070252
X-BeenThere: freebsd-arm@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: "Porting FreeBSD to ARM processors."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 07 Jun 2014 18:18:09 -0000

This is a multi-part message in MIME format.
--------------070407040005080007020500
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On 06/01/2014 03:11, John-Mark Gurney wrote:
> Alan Cox wrote this message on Fri, May 30, 2014 at 11:04 -0500:
>> On 05/30/2014 01:32, John-Mark Gurney wrote:
>>> Olivier Houchard wrote this message on Thu, May 29, 2014 at 19:38 +0200:
>>>> On Thu, May 29, 2014 at 10:19:18AM -0700, Adrian Chadd wrote:
>>>>> On 29 May 2014 10:16, Olivier Houchard wrote:
>>>>>> On Thu, May 29, 2014 at 10:14:53AM -0700, Adrian Chadd wrote:
>>>>>>> Have you tested this on xscale hardware?
>>>>>> Yeah, my two last commits were an attempt to get the AVILA kernel to boot
>>>>>> again.
>>>>> Woo! What can I provide to help you do this? :-)
>>>>>
>>>>> (Drinks? Food? Donations?)
>>>>>
>>>>>
>>>> Drinks and food are always appreciated ;)
>>>> It almost boots for me now, except a few userland programs get SIGSEGV or
>>>> SIGILL along the way, trying to figure out why.
>>> Thanks for fixing ddb...
>>> I'm getting panic messages again... bad
>>> news is that my panic is still around:
>>> panic: vm_page_alloc: page 0xc07e73b0 is wired
>>>
>>> Though, interestingly, it looks like sparc64 has a similar panic:
>>> https://www.freebsd.org/cgi/query-pr.cgi?pr=187080
>>>
>>> kib, Alan, any clue to why this is happening? Any suggestions as to
>>> help track it down?
>> I'm afraid not. The dump below shows a perfectly normal, in-use page.
>> If this page had actually been free prior to the vm_page_alloc() call,
>> then other fields, like dirty, would have been different. In other
>> words, this isn't just a problem with the wire count.
>>
>> What object is vm_page_alloc() being performed on?
> Is this enough? Or do you need more?
>
> panic: vm_page_alloc: page 0xc07e73b0 is wired, obj: 0xc1500b40
> KDB: enter: panic
> [ thread pid 781 tid 100051 ]
> Stopped at kdb_enter+0x40: ldrb r15, [r15, r15, ror r15]!
> db> show object/f 0xc1500b40
> Object 0xc1500b40: type=2, size=0xa, res=9, ref=0, flags=0x0 ruid -1 charge 0
> sref=0, backing_object(0)=(0)+0x0
> memory:=(off=0x0,page=0x8f0000),(off=0x1,page=0x8f1000),(off=0x2,page=0x8ee000),(off=0x3,page=0x8ef000),(off=0x4,page=0x8f3000),(off=0x5,page=0x8f4000)
> ...(off=0x6,page=0x8fa000),(off=0x7,page=0x8fb000),(off=0x8,page=0x8fc000)
>
> If you need more, let me know what/how to get it, and I will...
>

Anyone who has seen the "wired page" panic, please try the attached
patch. It introduces some new KASSERT()s that may help me to narrow down
the problem. I haven't been able to trigger these KASSERT()s on amd64,
but the symptoms that you guys are reporting are consistent with a bug
that would trigger these KASSERT()s.

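For anyone reading along without the attachment open, the condition the
new KASSERT()s check can be modeled in user space roughly like this. It
is only a sketch: struct page, the value given to PG_CACHED, and the
helper first_page_in_use() are hypothetical stand-ins for illustration,
not the kernel's definitions. The kernel patch panics on the first
offending page; this model just reports its index.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's struct vm_page, reduced to the
 * two fields the check reads.
 */
struct page {
	void *object;		/* owning VM object, NULL if none */
	unsigned short flags;	/* page flags */
};

/* Illustrative value only; not taken from the kernel headers. */
#define	PG_CACHED	0x0001

/*
 * Scan the run m[0..n) that is about to be handed back to the physical
 * allocator.  A page passes if it has no object or is marked cached,
 * which is the condition the patch asserts.  Returns the index of the
 * first page that fails, or -1 if every page passes.
 */
static long
first_page_in_use(const struct page *m, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (m[i].object != NULL && (m[i].flags & PG_CACHED) == 0)
			return ((long)i);
	}
	return (-1);
}
```

A page that still belongs to an object and is not cached is exactly
what the assertion would catch while the run is being freed.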
>>> Latest dump of the vm_page from a tree from today is:
>>> {'act_count': '\x00',
>>>  'aflags': '\x00',
>>>  'busy_lock': 1,
>>>  'dirty': '\xff',
>>>  'flags': 0,
>>>  'hold_count': 0,
>>>  'listq': {'tqe_next': 0xc07e7400, 'tqe_prev': 0xc06e63a0},
>>>  'md': {'pv_kva': 3235893248,
>>>         'pv_list': {'tqh_first': 0x0, 'tqh_last': 0xc07e73e0},
>>>         'pv_memattr': '\x00',
>>>         'pvh_attrs': 0},
>>>  'object': 0xc06e6378,
>>>  'oflags': '\x04',
>>>  'order': '\t',
>>>  'phys_addr': 9424896,
>>>  'pindex': 3581,
>>>  'plinks': {'memguard': {'p': 0, 'v': 3228461644},
>>>             'q': {'tqe_next': 0x0, 'tqe_prev': 0xc06e6a4c},
>>>             's': {'pv': 0xc06e6a4c, 'ss': {'sle_next': 0x0}}},
>>>  'pool': '\x00',
>>>  'queue': '\xff',
>>>  'segind': '\x02',
>>>  'valid': '\xff',
>>>  'wire_count': 1}
>>>
>>> This appears to be on the kmem_object list as:
>>> c06e62d8 B kernel_object_store
>>> c06e6378 B kmem_object_store
>>> c06e6418 b old_msync
>>>
>>> and you can see the tqh_last would be part of kmem_object_store...
>>>
>>> Could this be something bad happening w/ when memory is low? The
>>> board I'm testing on has only 64MB (54MB avail), so it hits that
>>> pretty quickly...

--------------070407040005080007020500
Content-Type: text/plain; charset=ISO-8859-15; name="arm_debug.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="arm_debug.patch"

Index: vm/vm_phys.c
===================================================================
--- vm/vm_phys.c	(revision 267209)
+++ vm/vm_phys.c	(working copy)
@@ -693,6 +693,7 @@ vm_phys_free_pages(vm_page_t m, int order)
 void
 vm_phys_free_contig(vm_page_t m, u_long npages)
 {
+	vm_page_t m_tmp;
 	u_int n;
 	int order;
 
@@ -714,6 +715,10 @@ vm_phys_free_contig(vm_page_t m, u_long npages)
 		n = 1 << order;
 		if (npages < n)
 			break;
+		for (m_tmp = m; m_tmp < &m[n]; m_tmp++)
+			KASSERT(m_tmp->object == NULL ||
+			    (m_tmp->flags & PG_CACHED) != 0,
+			    ("vm_phys_free_contig: xxx"));
 		vm_phys_free_pages(m, order);
 		m += n;
 	}
@@ -721,6 +726,10 @@ vm_phys_free_contig(vm_page_t m, u_long npages)
 	for (; npages > 0; npages -= n) {
 		order = flsl(npages) - 1;
 		n = 1 << order;
+		for (m_tmp = m; m_tmp < &m[n]; m_tmp++)
+			KASSERT(m_tmp->object == NULL ||
+			    (m_tmp->flags & PG_CACHED) != 0,
+			    ("vm_phys_free_contig: yyy"));
 		vm_phys_free_pages(m, order);
 		m += n;
 	}
--------------070407040005080007020500--
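
The second loop visible in the patch context frees whatever pages remain
in power-of-two chunks, largest first, using order = flsl(npages) - 1.
The user-space sketch below models only that arithmetic so the chunk
sizes each "yyy" assertion walks over are easy to see. The names
fls_ulong() and split_into_orders() are hypothetical helpers written
for this illustration (fls_ulong() is a portable substitute for
flsl()), and nothing here touches real pages.

```c
#include <stddef.h>

/*
 * Portable stand-in for flsl(): 1-based position of the most
 * significant set bit, or 0 when x is 0.
 */
static int
fls_ulong(unsigned long x)
{
	int bit = 0;

	while (x != 0) {
		bit++;
		x >>= 1;
	}
	return (bit);
}

/*
 * Mirror the shape of the patch's second loop: repeatedly peel off the
 * largest power-of-two chunk that fits in the remaining page count.
 * Stores each chunk's order in orders[] (at most max entries) and
 * returns how many chunks were recorded.
 */
static size_t
split_into_orders(unsigned long npages, int *orders, size_t max)
{
	unsigned long n;
	size_t count = 0;
	int order;

	for (; npages > 0 && count < max; npages -= n) {
		order = fls_ulong(npages) - 1;
		n = 1UL << order;
		orders[count++] = order;
	}
	return (count);
}
```

For example, 13 remaining pages split into chunks of order 3, 2 and 0,
that is 8, 4 and 1 pages.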