From owner-freebsd-arm@FreeBSD.ORG Sun Jun 8 17:30:08 2014 Return-Path: Delivered-To: freebsd-arm@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id ADF40620; Sun, 8 Jun 2014 17:30:08 +0000 (UTC) Received: from pp2.rice.edu (proofpoint2.mail.rice.edu [128.42.201.101]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 69480283E; Sun, 8 Jun 2014 17:30:06 +0000 (UTC) Received: from pps.filterd (pp2.rice.edu [127.0.0.1]) by pp2.rice.edu (8.14.5/8.14.5) with SMTP id s58HStmj027511; Sun, 8 Jun 2014 12:30:00 -0500 Received: from mh1.mail.rice.edu (mh1.mail.rice.edu [128.42.201.20]) by pp2.rice.edu with ESMTP id 1mbn6tgcn7-1; Sun, 08 Jun 2014 12:29:59 -0500 X-Virus-Scanned: by amavis-2.7.0 at mh1.mail.rice.edu, auth channel Received: from 108-254-203-201.lightspeed.hstntx.sbcglobal.net (108-254-203-201.lightspeed.hstntx.sbcglobal.net [108.254.203.201]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: alc) by mh1.mail.rice.edu (Postfix) with ESMTPSA id 035E94601E2; Sun, 8 Jun 2014 12:29:58 -0500 (CDT) Message-ID: <53949D96.3060409@rice.edu> Date: Sun, 08 Jun 2014 12:29:58 -0500 From: Alan Cox User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: "freebsd-arm@freebsd.org" , alc@freebsd.org Subject: Re: svn commit: r266850 - in head/sys/arm/xscale: i80321 i8134x ixp425 pxa References: <201405291656.s4TGudoD002868@svn.freebsd.org> <20140529171641.GA5246@ci0.org> <20140529173803.GA5294@ci0.org> <20140530063228.GD43976@funkthat.com> <5388ABF1.3030200@rice.edu> <20140601081153.GU43976@funkthat.com> <53935755.70908@rice.edu> <20140608003944.GK31367@funkthat.com> In-Reply-To: <20140608003944.GK31367@funkthat.com> X-Enigmail-Version: 1.6 Content-Type: multipart/mixed; boundary="------------010506080105090907030104" X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 kscore.is_bulkscore=0 kscore.compositescore=0 circleOfTrustscore=0 compositescore=0.713890987064109 urlsuspect_oldscore=0.713890987064109 suspectscore=10 recipient_domain_to_sender_totalscore=0 phishscore=0 bulkscore=0 kscore.is_spamscore=1 recipient_to_sender_totalscore=0 recipient_domain_to_sender_domain_totalscore=0 rbsscore=0.713890987064109 spamscore=0 recipient_to_sender_domain_totalscore=0 urlsuspectscore=0.9 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=7.0.1-1402240000 definitions=main-1406080251 X-BeenThere: freebsd-arm@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: "Porting FreeBSD to ARM processors." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 08 Jun 2014 17:30:08 -0000 This is a multi-part message in MIME format. --------------010506080105090907030104 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit On 06/07/2014 19:39, John-Mark Gurney wrote: > Alan Cox wrote this message on Sat, Jun 07, 2014 at 13:17 -0500: >> On 06/01/2014 03:11, John-Mark Gurney wrote: >>> Alan Cox wrote this message on Fri, May 30, 2014 at 11:04 -0500: >>>> On 05/30/2014 01:32, John-Mark Gurney wrote: >>>>> Olivier Houchard wrote this message on Thu, May 29, 2014 at 19:38 +0200: >>>>>> On Thu, May 29, 2014 at 10:19:18AM -0700, Adrian Chadd wrote: >>>>>>> On 29 May 2014 10:16, Olivier Houchard wrote: >>>>>>>> On Thu, May 29, 2014 at 10:14:53AM -0700, Adrian Chadd wrote: >>>>>>>>> Have you tested this on xscale hardware? >>>>>>>> Yeah, my two last commits were an attempt to get the AVILA kernel to boot >>>>>>>> again. >>>>>>> Woo! What can I provide to help you do this? :-) >>>>>>> >>>>>>> (Drinks? Food? Donations?) >>>>>>> >>>>>>> >>>>>> Drinks and food are always appreciated ;) >>>>>> It almost boots for me now, except a few userland programs gets SIGSEGV or >>>>>> SIGILL along the way, trying to figure out why. >>>>> Thanks for fixing ddb... I'm getting panic messages again... bad >>>>> news is that my panic is still around: >>>>> panic: vm_page_alloc: page 0xc07e73b0 is wired >>>>> >>>>> Though, interestingly, it looks like sparc64 has a similar panic: >>>>> https://www.freebsd.org/cgi/query-pr.cgi?pr=187080 >>>>> >>>>> kib, Alan, any clue to why this is happening? Any suggestions as to >>>>> help track it down? >>>> I'm afraid not. The dump below shows a perfectly normal, in-use page. >>>> If this page had actually been free prior to the vm_page_alloc() call, >>>> then other fields, like dirty, would have been different. In other >>>> words, this isn't just a problem with the wire count. >>>> >>>> What object is vm_page_alloc() being performed on? >>> Is this enough? Or do you need more? >>> >>> panic: vm_page_alloc: page 0xc07e73b0 is wired, obj: 0xc1500b40 >>> KDB: enter: panic >>> [ thread pid 781 tid 100051 ] >>> Stopped at kdb_enter+0x40: ldrb r15, [r15, r15, ror r15]! >>> db> show object/f 0xc1500b40 >>> Object 0xc1500b40: type=2, size=0xa, res=9, ref=0, flags=0x0 ruid -1 charge 0 >>> sref=0, backing_object(0)=(0)+0x0 >>> memory:=(off=0x0,page=0x8f0000),(off=0x1,page=0x8f1000),(off=0x2,page=0x8ee000),(off=0x3,page=0x8ef000),(off=0x4,page=0x8f3000),(off=0x5,page=0x8f4000) >>> ...(off=0x6,page=0x8fa000),(off=0x7,page=0x8fb000),(off=0x8,page=0x8fc000) >>> >>> If you need more, let me know what/how to get it, and I will... >>> >> >> Anyone who has seen the "wired page" panic, please try the attached >> patch. It introduces some new KASSERT()s that may help me to narrow >> down the problem. I haven't been able to trigger these KASSERT()s on >> amd64, but the symptoms that you guys are reporting are consistent with >> a bug that would trigger these KASSERT()s. > Ok, it triggered the xxx one: > Starting sendmail_msp_queue. > panic: vm_phys_free_contig: xxx > KDB: enter: panic > [ thread pid 782 tid 100051 ] > Stopped at kdb_enter+0x40: ldrb r15, [r15, r15, ror r15]! > db> bt > Tracing pid 782 tid 100051 td 0xc1470000 > db_trace_self() at db_trace_self > pc = 0xc0566ec8 lr = 0xc0566f54 (db_trace_thread+0x50) > sp = 0xcd830850 fp = 0xc03db694 > db_trace_thread() at db_trace_thread+0x50 > pc = 0xc0566f54 lr = 0xc022cd14 (db_command_init+0x620) > sp = 0xcd8308b0 fp = 0xc03db694 > db_command_init() at db_command_init+0x620 > pc = 0xc022cd14 lr = 0xc022c3ec (db_skip_to_eol+0x480) > sp = 0xcd8308c8 fp = 0xc03db694 > r4 = 0xc0683c30 r5 = 0x00000000 > db_skip_to_eol() at db_skip_to_eol+0x480 > pc = 0xc022c3ec lr = 0xc022c554 (db_command_loop+0x5c) > sp = 0xcd830968 fp = 0xc03db694 > r4 = 0xcd83097c r5 = 0xc0683efc > r6 = 0x00000000 r7 = 0x00000000 > r8 = 0x00000001 r10 = 0x600000d3 > db_command_loop() at db_command_loop+0x5c > pc = 0xc022c554 lr = 0xc022e99c (X_db_sym_numargs+0xec) > sp = 0xcd830970 fp = 0xc03db694 > X_db_sym_numargs() at X_db_sym_numargs+0xec > pc = 0xc022e99c lr = 0xc03db8c4 (kdb_trap+0x94) > sp = 0xcd830a88 fp = 0xc03db694 > r4 = 0x00000000 > kdb_trap() at kdb_trap+0x94 > pc = 0xc03db8c4 lr = 0xc0578eb0 (undefinedinstruction+0x2c8) > sp = 0xcd830aa8 fp = 0xc03db694 > r4 = 0x00000000 r5 = 0x00000000 > r6 = 0x00000000 r7 = 0xcd830b20 > r8 = 0xe7ffffff r10 = 0xe7ffffff > undefinedinstruction() at undefinedinstruction+0x2c8 > pc = 0xc0578eb0 lr = 0xc0568a0c (exception_exit) > sp = 0xcd830b20 fp = 0xc0613e70 > r4 = 0xffffffff r5 = 0xffff1004 > r6 = 0xc06d0ebc r7 = 0xcd830ba4 > r8 = 0xc1470000 r9 = 0x00000013 > r10 = 0x00000010 > exception_exit() at exception_exit > pc = 0xc0568a0c lr = 0xc03db68c (kdb_enter+0x38) > sp = 0xcd830b70 fp = 0xc0613e70 > r0 = 0x00000012 r1 = 0x60000013 > r2 = 0xc06df2ac r3 = 0xc06d0ee8 > r4 = 0xc05e5258 r5 = 0xc06155e8 > r6 = 0xc06d0ebc r7 = 0xcd830ba4 > r8 = 0xc1470000 r9 = 0x00000013 > r10 = 0x00000010 r12 = 0xc05e2518 > kdb_enter() at kdb_enter+0x44 > pc = 0xc03db698 lr = 0xc03aa094 (kern_reboot+0x948) > sp = 0xcd830b78 fp = 0xc0613e70 > r4 = 0x00000100 > kern_reboot() at kern_reboot+0x948 > pc = 0xc03aa094 lr = 0xc03aa164 (kassert_panic+0x68) > sp = 0xcd830b90 fp = 0xc0613e70 > r4 = 0xc06155e8 r5 = 0xc07e74a0 > r6 = 0xc07e6fa0 r7 = 0x00000004 > r8 = 0x00000010 > kassert_panic() at kassert_panic+0x68 > pc = 0xc03aa164 lr = 0xc055a0a8 (vm_phys_free_contig+0x8c) > sp = 0xcd830bb0 fp = 0xc0613e70 > r0 = 0xc06155e8 r1 = 0xc07e6d20 > r2 = 0xc06e6a70 r3 = 0x00000000 > r4 = 0xc07e73b0 > vm_phys_free_contig() at vm_phys_free_contig+0x8c > pc = 0xc055a0a8 lr = 0xc055ca70 (vm_reserv_startup+0x4bc) > sp = 0xcd830bd0 fp = 0xc0613e70 > r4 = 0xc08fb2cc r5 = 0x00000008 > r6 = 0x000000e8 r7 = 0xc08fb280 > r8 = 0x00000005 r10 = 0x00000001 > vm_reserv_startup() at vm_reserv_startup+0x4bc > pc = 0xc055ca70 lr = 0xc055cb40 (vm_reserv_startup+0x58c) > sp = 0xcd830be8 fp = 0xc0613e70 > r4 = 0xc08fb280 r5 = 0x00000000 > r6 = 0xc14b7280 r7 = 0x00000040 > r8 = 0x00000000 > vm_reserv_startup() at vm_reserv_startup+0x58c > pc = 0xc055cb40 lr = 0xc055ce08 (vm_reserv_reclaim_inactive+0x34) > sp = 0xcd830bf0 fp = 0xc0613e70 > r4 = 0xc06e6550 > vm_reserv_reclaim_inactive() at vm_reserv_reclaim_inactive+0x34 > pc = 0xc055ce08 lr = 0xc0554cb8 (vm_page_alloc+0x280) > sp = 0xcd830bf8 fp = 0xc0613e70 > vm_page_alloc() at vm_page_alloc+0x280 > pc = 0xc0554cb8 lr = 0xc0540eb0 (vm_fault_hold+0x60c) > sp = 0xcd830c30 fp = 0xcd830dac > r4 = 0xc14b7280 r5 = 0xc0618d00 > r6 = 0xcd830eb0 r7 = 0xc1470000 > r8 = 0xcd830e60 r9 = 0x00000000 > r10 = 0x00000000 > vm_fault_hold() at vm_fault_hold+0x60c > pc = 0xc0540eb0 lr = 0xc05426b8 (vm_fault+0x44) > sp = 0xcd830db0 fp = 0x00000002 > r4 = 0xc14c8a0c r5 = 0xc0618d00 > r6 = 0xcd830eb0 r7 = 0xc1470000 > r8 = 0xcd830e60 r9 = 0x00000001 > r10 = 0x00000000 > vm_fault() at vm_fault+0x44 > pc = 0xc05426b8 lr = 0xc05782d0 (data_abort_handler+0x35c) > sp = 0xcd830dc0 fp = 0x00000002 > data_abort_handler() at data_abort_handler+0x35c > pc = 0xc05782d0 lr = 0xc0568a0c (exception_exit) > sp = 0xcd830dc0 fp = 0x00000002 > data_abort_handler() at data_abort_handler+0x35c > pc = 0xc05782d0 lr = 0xc0568a0c (exception_exit) > sp = 0xcd830e60 fp = 0x20c43000 > r4 = 0xffffffff r5 = 0xffff1004 > r6 = 0x00000000 r7 = 0x20443740 > r8 = 0x0009b8e4 r9 = 0x00000001 > r10 = 0x00000004 > exception_exit() at exception_exit > pc = 0xc0568a0c lr = 0x204140d0 (0x204140d0) > sp = 0xcd830eb0 fp = 0x20c43000 > r0 = 0x00000000 r1 = 0x20c4302c > r2 = 0x00000004 r3 = 0x00000000 > r4 = 0x20446190 r5 = 0x20c4302c > r6 = 0x00000000 r7 = 0x20443740 > r8 = 0x0009b8e4 r9 = 0x00000001 > r10 = 0x00000004 r12 = 0x00000001 > Unable to unwind into user mode > > Hope this helps, let me know if you need anything else... > Please try the attached patch. It adds another KASSERT() loop. Depending on which KASSERT() fires, that will tell us whether to look deeper at this function or its caller for the source of the problem. Alan --------------010506080105090907030104 Content-Type: text/plain; charset=ISO-8859-15; name="arm_debug2.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="arm_debug2.patch" Index: vm/vm_phys.c =================================================================== --- vm/vm_phys.c (revision 267209) +++ vm/vm_phys.c (working copy) @@ -693,9 +693,16 @@ vm_phys_free_pages(vm_page_t m, int order) void vm_phys_free_contig(vm_page_t m, u_long npages) { + vm_page_t m_tmp; u_int n; int order; + for (m_tmp = m; m_tmp < &m[npages]; m_tmp++) + KASSERT(m_tmp->object == NULL || + (m_tmp->flags & PG_CACHED) != 0, + ("vm_phys_free_contig: start %p %td %u", + m, m_tmp - m, npages)); + /* * Avoid unnecessary coalescing by freeing the pages in the largest * possible power-of-two-sized subsets. @@ -714,6 +721,11 @@ vm_phys_free_contig(vm_page_t m, u_long npages) n = 1 << order; if (npages < n) break; + for (m_tmp = m; m_tmp < &m[n]; m_tmp++) + KASSERT(m_tmp->object == NULL || + (m_tmp->flags & PG_CACHED) != 0, + ("vm_phys_free_contig: xxx %p %td %u", + m, m_tmp - m, n)); vm_phys_free_pages(m, order); m += n; } @@ -721,6 +733,11 @@ vm_phys_free_contig(vm_page_t m, u_long npages) for (; npages > 0; npages -= n) { order = flsl(npages) - 1; n = 1 << order; + for (m_tmp = m; m_tmp < &m[n]; m_tmp++) + KASSERT(m_tmp->object == NULL || + (m_tmp->flags & PG_CACHED) != 0, + ("vm_phys_free_contig: yyy %p %td %u", + m, m_tmp - m, n)); vm_phys_free_pages(m, order); m += n; } --------------010506080105090907030104--