Date: Mon, 9 Jun 2014 14:08:20 -0600
From: Warner Losh <imp@bsdimp.com>
To: Alan Cox <alc@rice.edu>
Cc: alc@freebsd.org, "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>
Subject: Re: svn commit: r266850 - in head/sys/arm/xscale: i80321 i8134x ixp425 pxa
Message-ID: <6DA17B5C-1824-49BF-8192-432135D42C6E@bsdimp.com>
In-Reply-To: <9100CDFA-0C40-4BC8-AA9C-1DE37EEA6208@rice.edu>
References: <20140601081153.GU43976@funkthat.com> <53935755.70908@rice.edu> <20140608003944.GK31367@funkthat.com> <53949D96.3060409@rice.edu> <20140608235611.GP31367@funkthat.com> <53950BB9.3090808@rice.edu> <20140609042206.GQ31367@funkthat.com> <5395D312.5000302@rice.edu> <20140609163302.GS31367@funkthat.com> <5395E725.7020807@rice.edu> <20140609174431.GT31367@funkthat.com> <9100CDFA-0C40-4BC8-AA9C-1DE37EEA6208@rice.edu>
[-- Attachment #1 --]
On Jun 9, 2014, at 1:23 PM, Alan Cox <alc@rice.edu> wrote:

>
> On Jun 9, 2014, at 12:44 PM, John-Mark Gurney wrote:
>
>> Alan Cox wrote this message on Mon, Jun 09, 2014 at 11:56 -0500:
>>> On 06/09/2014 11:33, John-Mark Gurney wrote:
>>>> Alan Cox wrote this message on Mon, Jun 09, 2014 at 10:30 -0500:
>>>>> On 06/08/2014 23:22, John-Mark Gurney wrote:
>>>>>> Alan Cox wrote this message on Sun, Jun 08, 2014 at 20:19 -0500:
>>>>>>> On 06/08/2014 18:56, John-Mark Gurney wrote:
>>>>>>>> Alan Cox wrote this message on Sun, Jun 08, 2014 at 12:29 -0500:
>>>>>>>>> On 06/07/2014 19:39, John-Mark Gurney wrote:
>>>>>>>>>> Alan Cox wrote this message on Sat, Jun 07, 2014 at 13:17 -0500:
>>>>>>>>>>> On 06/01/2014 03:11, John-Mark Gurney wrote:
>>>>>>>>>>>> Alan Cox wrote this message on Fri, May 30, 2014 at 11:04 -0500:
>>>>>>>>>>>>> On 05/30/2014 01:32, John-Mark Gurney wrote:
>>>>>>>>>>>>>> Olivier Houchard wrote this message on Thu, May 29, 2014 at 19:38 +0200:
>>>>>>>>>>>>>>> On Thu, May 29, 2014 at 10:19:18AM -0700, Adrian Chadd wrote:
>>>>>>>>>>>>>>>> On 29 May 2014 10:16, Olivier Houchard <cognet@ci0.org> wrote:
>>>>>>>>>>>>>>>>> On Thu, May 29, 2014 at 10:14:53AM -0700, Adrian Chadd wrote:
>>>>>>>>>>>>>>>>>> Have you tested this on xscale hardware?
>>>>>>>>>>>>>>>>> Yeah, my two last commits were an attempt to get the AVILA kernel to boot
>>>>>>>>>>>>>>>>> again.
>>>>>>>>>>>>>>>> Woo! What can I provide to help you do this? :-)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (Drinks? Food? Donations?)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Drinks and food are always appreciated ;)
>>>>>>>>>>>>>>> It almost boots for me now, except a few userland programs gets SIGSEGV or
>>>>>>>>>>>>>>> SIGILL along the way, trying to figure out why.
>>>>>>>>>>>>>> Thanks for fixing ddb... I'm getting panic messages again... bad
>>>>>>>>>>>>>> news is that my panic is still around:
>>>>>>>>>>>>>> panic: vm_page_alloc: page 0xc07e73b0 is wired
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Though, interestingly, it looks like sparc64 has a similar panic:
>>>>>>>>>>>>>> https://www.freebsd.org/cgi/query-pr.cgi?pr=187080
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> kib, Alan, any clue to why this is happening? Any suggestions as to
>>>>>>>>>>>>>> help track it down?
>>>>>>>>>>>>> I'm afraid not. The dump below shows a perfectly normal, in-use page.
>>>>>>>>>>>>> If this page had actually been free prior to the vm_page_alloc() call,
>>>>>>>>>>>>> then other fields, like dirty, would have been different. In other
>>>>>>>>>>>>> words, this isn't just a problem with the wire count.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What object is vm_page_alloc() being performed on?
>>>>>>>>>>>> Is this enough? Or do you need more?
>>>>>>>>>>>>
>>>>>>>>>>>> panic: vm_page_alloc: page 0xc07e73b0 is wired, obj: 0xc1500b40
>>>>>>>>>>>> KDB: enter: panic
>>>>>>>>>>>> [ thread pid 781 tid 100051 ]
>>>>>>>>>>>> Stopped at kdb_enter+0x40: ldrb r15, [r15, r15, ror r15]!
>>>>>>>>>>>> db> show object/f 0xc1500b40
>>>>>>>>>>>> Object 0xc1500b40: type=2, size=0xa, res=9, ref=0, flags=0x0 ruid -1 charge 0
>>>>>>>>>>>> sref=0, backing_object(0)=(0)+0x0
>>>>>>>>>>>> memory:=(off=0x0,page=0x8f0000),(off=0x1,page=0x8f1000),(off=0x2,page=0x8ee000),(off=0x3,page=0x8ef000),(off=0x4,page=0x8f3000),(off=0x5,page=0x8f4000)
>>>>>>>>>>>> ...(off=0x6,page=0x8fa000),(off=0x7,page=0x8fb000),(off=0x8,page=0x8fc000)
>>>>>>>>>>>>
>>>>>>>>>>>> If you need more, let me know what/how to get it, and I will...
>>>>>>>>>>>>
>>>>>>>>>>> Anyone who has seen the "wired page" panic, please try the attached
>>>>>>>>>>> patch. It introduces some new KASSERT()s that may help me to narrow
>>>>>>>>>>> down the problem. I haven't been able to trigger these KASSERT()s on
>>>>>>>>>>> amd64, but the symptoms that you guys are reporting are consistent with
>>>>>>>>>>> a bug that would trigger these KASSERT()s.
>>>>>>>>>> Ok, it triggered the xxx one:
>>>>>>>>>> Starting sendmail_msp_queue.
>>>>>>>>>> panic: vm_phys_free_contig: xxx
>>>>>>>>>> KDB: enter: panic
>>>>>>>>>> [ thread pid 782 tid 100051 ]
>>>>>>>>>> Stopped at kdb_enter+0x40: ldrb r15, [r15, r15, ror r15]!
>>>>>>>>>> db> bt
>>>>>>>>>> Tracing pid 782 tid 100051 td 0xc1470000
>>>>>>>>>> db_trace_self() at db_trace_self
>>>>>>>>>>     pc = 0xc0566ec8 lr = 0xc0566f54 (db_trace_thread+0x50)
>>>>>>>>>>     sp = 0xcd830850 fp = 0xc03db694
>>>>>>>>>> db_trace_thread() at db_trace_thread+0x50
>>>>>>>>>>     pc = 0xc0566f54 lr = 0xc022cd14 (db_command_init+0x620)
>>>>>>>>>>     sp = 0xcd8308b0 fp = 0xc03db694
>>>>>>>>>> db_command_init() at db_command_init+0x620
>>>>>>>>>>     pc = 0xc022cd14 lr = 0xc022c3ec (db_skip_to_eol+0x480)
>>>>>>>>>>     sp = 0xcd8308c8 fp = 0xc03db694
>>>>>>>>>>     r4 = 0xc0683c30 r5 = 0x00000000
>>>>>>>>>> db_skip_to_eol() at db_skip_to_eol+0x480
>>>>>>>>>>     pc = 0xc022c3ec lr = 0xc022c554 (db_command_loop+0x5c)
>>>>>>>>>>     sp = 0xcd830968 fp = 0xc03db694
>>>>>>>>>>     r4 = 0xcd83097c r5 = 0xc0683efc
>>>>>>>>>>     r6 = 0x00000000 r7 = 0x00000000
>>>>>>>>>>     r8 = 0x00000001 r10 = 0x600000d3
>>>>>>>>>> db_command_loop() at db_command_loop+0x5c
>>>>>>>>>>     pc = 0xc022c554 lr = 0xc022e99c (X_db_sym_numargs+0xec)
>>>>>>>>>>     sp = 0xcd830970 fp = 0xc03db694
>>>>>>>>>> X_db_sym_numargs() at X_db_sym_numargs+0xec
>>>>>>>>>>     pc = 0xc022e99c lr = 0xc03db8c4 (kdb_trap+0x94)
>>>>>>>>>>     sp = 0xcd830a88 fp = 0xc03db694
>>>>>>>>>>     r4 = 0x00000000
>>>>>>>>>> kdb_trap() at kdb_trap+0x94
>>>>>>>>>>     pc = 0xc03db8c4 lr = 0xc0578eb0 (undefinedinstruction+0x2c8)
>>>>>>>>>>     sp = 0xcd830aa8 fp = 0xc03db694
>>>>>>>>>>     r4 = 0x00000000 r5 = 0x00000000
>>>>>>>>>>     r6 = 0x00000000 r7 = 0xcd830b20
>>>>>>>>>>     r8 = 0xe7ffffff r10 = 0xe7ffffff
>>>>>>>>>> undefinedinstruction() at undefinedinstruction+0x2c8
>>>>>>>>>>     pc = 0xc0578eb0 lr = 0xc0568a0c (exception_exit)
>>>>>>>>>>     sp = 0xcd830b20 fp = 0xc0613e70
>>>>>>>>>>     r4 = 0xffffffff r5 = 0xffff1004
>>>>>>>>>>     r6 = 0xc06d0ebc r7 = 0xcd830ba4
>>>>>>>>>>     r8 = 0xc1470000 r9 = 0x00000013
>>>>>>>>>>     r10 = 0x00000010
>>>>>>>>>> exception_exit() at exception_exit
>>>>>>>>>>     pc = 0xc0568a0c lr = 0xc03db68c (kdb_enter+0x38)
>>>>>>>>>>     sp = 0xcd830b70 fp = 0xc0613e70
>>>>>>>>>>     r0 = 0x00000012 r1 = 0x60000013
>>>>>>>>>>     r2 = 0xc06df2ac r3 = 0xc06d0ee8
>>>>>>>>>>     r4 = 0xc05e5258 r5 = 0xc06155e8
>>>>>>>>>>     r6 = 0xc06d0ebc r7 = 0xcd830ba4
>>>>>>>>>>     r8 = 0xc1470000 r9 = 0x00000013
>>>>>>>>>>     r10 = 0x00000010 r12 = 0xc05e2518
>>>>>>>>>> kdb_enter() at kdb_enter+0x44
>>>>>>>>>>     pc = 0xc03db698 lr = 0xc03aa094 (kern_reboot+0x948)
>>>>>>>>>>     sp = 0xcd830b78 fp = 0xc0613e70
>>>>>>>>>>     r4 = 0x00000100
>>>>>>>>>> kern_reboot() at kern_reboot+0x948
>>>>>>>>>>     pc = 0xc03aa094 lr = 0xc03aa164 (kassert_panic+0x68)
>>>>>>>>>>     sp = 0xcd830b90 fp = 0xc0613e70
>>>>>>>>>>     r4 = 0xc06155e8 r5 = 0xc07e74a0
>>>>>>>>>>     r6 = 0xc07e6fa0 r7 = 0x00000004
>>>>>>>>>>     r8 = 0x00000010
>>>>>>>>>> kassert_panic() at kassert_panic+0x68
>>>>>>>>>>     pc = 0xc03aa164 lr = 0xc055a0a8 (vm_phys_free_contig+0x8c)
>>>>>>>>>>     sp = 0xcd830bb0 fp = 0xc0613e70
>>>>>>>>>>     r0 = 0xc06155e8 r1 = 0xc07e6d20
>>>>>>>>>>     r2 = 0xc06e6a70 r3 = 0x00000000
>>>>>>>>>>     r4 = 0xc07e73b0
>>>>>>>>>> vm_phys_free_contig() at vm_phys_free_contig+0x8c
>>>>>>>>>>     pc = 0xc055a0a8 lr = 0xc055ca70 (vm_reserv_startup+0x4bc)
>>>>>>>>>>     sp = 0xcd830bd0 fp = 0xc0613e70
>>>>>>>>>>     r4 = 0xc08fb2cc r5 = 0x00000008
>>>>>>>>>>     r6 = 0x000000e8 r7 = 0xc08fb280
>>>>>>>>>>     r8 = 0x00000005 r10 = 0x00000001
>>>>>>>>>> vm_reserv_startup() at vm_reserv_startup+0x4bc
>>>>>>>>>>     pc = 0xc055ca70 lr = 0xc055cb40 (vm_reserv_startup+0x58c)
>>>>>>>>>>     sp = 0xcd830be8 fp = 0xc0613e70
>>>>>>>>>>     r4 = 0xc08fb280 r5 = 0x00000000
>>>>>>>>>>     r6 = 0xc14b7280 r7 = 0x00000040
>>>>>>>>>>     r8 = 0x00000000
>>>>>>>>>> vm_reserv_startup() at vm_reserv_startup+0x58c
>>>>>>>>>>     pc = 0xc055cb40 lr = 0xc055ce08 (vm_reserv_reclaim_inactive+0x34)
>>>>>>>>>>     sp = 0xcd830bf0 fp = 0xc0613e70
>>>>>>>>>>     r4 = 0xc06e6550
>>>>>>>>>> vm_reserv_reclaim_inactive() at vm_reserv_reclaim_inactive+0x34
>>>>>>>>>>     pc = 0xc055ce08 lr = 0xc0554cb8 (vm_page_alloc+0x280)
>>>>>>>>>>     sp = 0xcd830bf8 fp = 0xc0613e70
>>>>>>>>>> vm_page_alloc() at vm_page_alloc+0x280
>>>>>>>>>>     pc = 0xc0554cb8 lr = 0xc0540eb0 (vm_fault_hold+0x60c)
>>>>>>>>>>     sp = 0xcd830c30 fp = 0xcd830dac
>>>>>>>>>>     r4 = 0xc14b7280 r5 = 0xc0618d00
>>>>>>>>>>     r6 = 0xcd830eb0 r7 = 0xc1470000
>>>>>>>>>>     r8 = 0xcd830e60 r9 = 0x00000000
>>>>>>>>>>     r10 = 0x00000000
>>>>>>>>>> vm_fault_hold() at vm_fault_hold+0x60c
>>>>>>>>>>     pc = 0xc0540eb0 lr = 0xc05426b8 (vm_fault+0x44)
>>>>>>>>>>     sp = 0xcd830db0 fp = 0x00000002
>>>>>>>>>>     r4 = 0xc14c8a0c r5 = 0xc0618d00
>>>>>>>>>>     r6 = 0xcd830eb0 r7 = 0xc1470000
>>>>>>>>>>     r8 = 0xcd830e60 r9 = 0x00000001
>>>>>>>>>>     r10 = 0x00000000
>>>>>>>>>> vm_fault() at vm_fault+0x44
>>>>>>>>>>     pc = 0xc05426b8 lr = 0xc05782d0 (data_abort_handler+0x35c)
>>>>>>>>>>     sp = 0xcd830dc0 fp = 0x00000002
>>>>>>>>>> data_abort_handler() at data_abort_handler+0x35c
>>>>>>>>>>     pc = 0xc05782d0 lr = 0xc0568a0c (exception_exit)
>>>>>>>>>>     sp = 0xcd830dc0 fp = 0x00000002
>>>>>>>>>> data_abort_handler() at data_abort_handler+0x35c
>>>>>>>>>>     pc = 0xc05782d0 lr = 0xc0568a0c (exception_exit)
>>>>>>>>>>     sp = 0xcd830e60 fp = 0x20c43000
>>>>>>>>>>     r4 = 0xffffffff r5 = 0xffff1004
>>>>>>>>>>     r6 = 0x00000000 r7 = 0x20443740
>>>>>>>>>>     r8 = 0x0009b8e4 r9 = 0x00000001
>>>>>>>>>>     r10 = 0x00000004
>>>>>>>>>> exception_exit() at exception_exit
>>>>>>>>>>     pc = 0xc0568a0c lr = 0x204140d0 (0x204140d0)
>>>>>>>>>>     sp = 0xcd830eb0 fp = 0x20c43000
>>>>>>>>>>     r0 = 0x00000000 r1 = 0x20c4302c
>>>>>>>>>>     r2 = 0x00000004 r3 = 0x00000000
>>>>>>>>>>     r4 = 0x20446190 r5 = 0x20c4302c
>>>>>>>>>>     r6 = 0x00000000 r7 = 0x20443740
>>>>>>>>>>     r8 = 0x0009b8e4 r9 = 0x00000001
>>>>>>>>>>     r10 = 0x00000004 r12 = 0x00000001
>>>>>>>>>> Unable to unwind into user mode
>>>>>>>>>>
>>>>>>>>>> Hope this helps, let me know if you need anything else...
>>>>>>>>>>
>>>>>>>>> Please try the attached patch. It adds another KASSERT() loop.
>>>>>>>>>
>>>>>>>>> Depending on which KASSERT() fires, that will tell us whether to look
>>>>>>>>> deeper at this function or its caller for the source of the problem.
>>>>>>>> Ok, that panic is:
>>>>>>>> panic: vm_phys_free_contig: start 0xc07e6d20 21 24
>>>>>>>>
>>>>>>>> Let me know if you need any more info... oh, btw, the last %u needed
>>>>>>>> to be %lu since it was a u_long, not an unsigned...
>>>>>>>>
>>>>>>> Ok. Here is the next debug patch.
>>>>>> so, it's crashing in the same place:
>>>>>> panic: vm_phys_free_contig: start 0xc07e6d20 21 24
>>>>>>
>>>>>> so, I commented out this KASSERT, and now it panics with:
>>>>>> panic: vm_phys_free_contig: xxx 0xc07e6fa0 13 16
>>>>>>
>>>>>> so I commented out this KASSERT too, and it panics back w/ the original
>>>>>> panic.. So it didn't hit the new KASSERT in vm_reserv_break...
>>>>> Next patch...It should panic in vm_reserv_break this time and tell me if
>>>>> the reservation being broken belongs to the same object as the inuse
>>>>> page that is being inappropriately freed.
>>>> So, bad news... still panics with:
>>>> panic: vm_phys_free_contig: start 0xc07e6d20 21 24
>>>>
>>>> This panic seems to be consistent now, in that the start address is
>>>> always the same... Is there a way you could add various debugging
>>>> for this specific vm page to catch a stack trace (stack(9)) where it's
>>>> going wrong?
>>>>
>>>
>>> I made a mistake with the new KASSERT()s in vm_reserv_break(). Try this.
>>
>> No worried, the new patch panics:
>> panic: vm_reserv_break: 2 saved_object=0xc06e6378 x=253 m_tmp->object=0xc06e6378 (1)
>>
>
>
> Is your arm processor running in big-endian or little-endian mode?

Big Endian.

Warner

>
>
>> w/ a bt of:
>> [...]
>> vm_reserv_startup() at vm_reserv_startup+0x570
>>     pc = 0xc055cd94 lr = 0xc055cec8 (vm_reserv_startup+0x6a4)
>>     sp = 0xcd833be8 fp = 0xc06142d0
>>     r4 = 0xc08fb280 r5 = 0x00000000
>>     r6 = 0xc14b76e0 r7 = 0x00000000
>>     r8 = 0x00000000 r9 = 0x00000033
>>     r10 = 0x00000001
>> vm_reserv_startup() at vm_reserv_startup+0x6a4
>>     pc = 0xc055cec8 lr = 0xc055d190 (vm_reserv_reclaim_inactive+0x34)
>>     sp = 0xcd833bf0 fp = 0xc06142d0
>>     r4 = 0xc06e6550
>> vm_reserv_reclaim_inactive() at vm_reserv_reclaim_inactive+0x34
>>     pc = 0xc055d190 lr = 0xc0554eb0 (vm_page_alloc+0x280)
>>     sp = 0xcd833bf8 fp = 0xc06142d0
>> vm_page_alloc() at vm_page_alloc+0x280
>>     pc = 0xc0554eb0 lr = 0xc0540ebc (vm_fault_hold+0x60c)
>>     sp = 0xcd833c30 fp = 0xcd833dac
>>     r4 = 0xc14b76e0 r5 = 0xc0619288
>>     r6 = 0xcd833eb0 r7 = 0xc0f7ec80
>>     r8 = 0xcd833e60 r9 = 0x00000000
>>     r10 = 0x00000000
>> vm_fault_hold() at vm_fault_hold+0x60c
>>     pc = 0xc0540ebc lr = 0xc05426c4 (vm_fault+0x44)
>>     sp = 0xcd833db0 fp = 0x00000002
>>     r4 = 0xc14c66ec r5 = 0xc0619288
>>     r6 = 0xcd833eb0 r7 = 0xc0f7ec80
>>     r8 = 0xcd833e60 r9 = 0x00000001
>>     r10 = 0x00000000
>> vm_fault() at vm_fault+0x44
>>     pc = 0xc05426c4 lr = 0xc05786d0 (data_abort_handler+0x35c)
>>     sp = 0xcd833dc0 fp = 0x00000002
>> data_abort_handler() at data_abort_handler+0x35c
>>     pc = 0xc05786d0 lr = 0xc0568dc8 (exception_exit)
>>     sp = 0xcd833e60 fp = 0x00000000
>>     r4 = 0xffffffff r5 = 0xffff1004
>>     r6 = 0x001b7740 r7 = 0x00052ec4
>>     r8 = 0x00000000 r9 = 0x000cc4b0
>>     r10 = 0x00000000
>> exception_exit() at exception_exit
>>     pc = 0xc0568dc8 lr = 0x203f1208 (0x203f1208)
>>     sp = 0xcd833eb0 fp = 0x00000000
>>     r0 = 0x20c53e60 r1 = 0x00000000
>>     r2 = 0x000eeb40 r3 = 0x00000001
>>     r4 = 0x00000000 r5 = 0x000e9654
>>     r6 = 0x001b7740 r7 = 0x00052ec4
>>     r8 = 0x00000000 r9 = 0x000cc4b0
>>     r10 = 0x00000000 r12 = 0x200d26a4
>>
>> Let me know if you need any more information..
>>
>> Thanks for tracking this down.
>>
>> --
>> John-Mark Gurney Voice: +1 415 225 5579
>>
>> "All that I will do, has been done, All that I have, has not."
>>
>
> _______________________________________________
> freebsd-arm@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arm
> To unsubscribe, send any mail to "freebsd-arm-unsubscribe@freebsd.org"

[-- Attachment #2 --]
-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - https://gpgtools.org

iQIcBAEBCgAGBQJTlhQ1AAoJEGwc0Sh9sBEAUtEP/i+J+crpb9rzRBJjV4zbAEN9
HLlhKA48zf7SLSq987hCHrNLqd8yHq7Gi+pkX/0kdooo/QrXFU9nX7+aMgfY/07z
971h/k6YG9VXBkvkBedHbH6tvow1BOKSd4L7szxbsg0bwHvYAtqGKgqrDebXzXtr
mfVNqQZ9l4Cf7RoXhJ8roSrQvqbfCSwdYJH8LmgdCpUnDCbugnCRddZ5hBlf79no
WwygIisLzUqz+SQw0PvM/xQQBR6DqXOOaHCBaWM7omYWwCVbyBp3WZ8dmAXkJxB5
VUcD3X4bTbmde5a3WaCDDY/GtfhsXio6v9G1aP6pLpVwOUJAYfZroGQ70QOSKXOl
szqYCvgec69sPSOI3DYArgMVF7fM+y441hvBeyKfb0eYgC0eX0e1h8C7kQzHy0VZ
u3qwnpP5b6LQtBXcKxKniBV82q1mWCqyJJSbAbBLb+DJzUmoqnqGOHR5TonVl1lN
9r4TDFjbnRvQBaK+DJiwZH/AvisNBbAFb23mXyolAyT5w7hYBuobJuDEJSXamk2d
jNTIO5h4dSMDShSwKPw8lbL4WF7nk7P+ScB9TKPxxnhKgwuJCtxC1e35eJU4B4uP
hYaMUY1qUNg5ahmq9K8FF3cMyM6UUFPSrTp0fUrzvhaoke3Hs2GcCsH8NC5olBlx
lzdLY5g10diMIdXAMSue
=+NVF
-----END PGP SIGNATURE-----
