Date: Sat, 22 Sep 2012 13:45:13 -0500 From: Alan Cox <alc@rice.edu> To: Ian Lepore <freebsd@damnhippie.dyndns.org> Cc: "arm@freebsd.org" <arm@freebsd.org> Subject: Re: arm pmap locking Message-ID: <505E0739.8080802@rice.edu> In-Reply-To: <1348333663.5548.24.camel@revolution.hippie.lan> References: <1345315508.27688.260.camel@revolution.hippie.lan> <503D12AE.1050705@rice.edu> <1346350374.1140.525.camel@revolution.hippie.lan> <5045351F.6060201@rice.edu> <1346723041.1140.602.camel@revolution.hippie.lan> <504B85BE.3030101@rice.edu> <1347316458.1137.41.camel@revolution.hippie.lan> <504F8BAC.4040902@rice.edu> <20120915045040.GZ58312@funkthat.com> <5054D69B.40209@rice.edu> <20120917033308.GB58312@funkthat.com> <505DE03D.7050101@rice.edu> <1348333663.5548.24.camel@revolution.hippie.lan>
next in thread | previous in thread | raw e-mail | index | archive | help
On 09/22/2012 12:07, Ian Lepore wrote: > On Sat, 2012-09-22 at 10:58 -0500, Alan Cox wrote: >> On 09/16/2012 22:33, John-Mark Gurney wrote: >>> Alan Cox wrote this message on Sat, Sep 15, 2012 at 14:27 -0500: >>>> On 09/14/2012 23:50, John-Mark Gurney wrote: >>>>> I figure since you've been looking at arm's pmap, that I should report >>>>> to you a LOR that I observered recently on my armv6 board: >>>>> lock order reversal: >>>>> 1st 0xc1cf70b0 pmap (pmap) @ /usr/src.HEAD/sys/arm/arm/pmap-v6.c:673 >>>>> 2nd 0xc091e608 PV ENTRY (UMA zone) @ /usr/src.HEAD/sys/vm/uma_core.c:2084 >>>>> KDB: stack backtrace: >>>>> db_trace_self() at db_trace_self+0xc >>>>> scp=0xc05a294c rlv=0xc025b298 (X_db_sym_numargs+0x1bc) >>>>> rsp=0xc9d1a8fc rfp=0xc9d1aa18 >>>>> X_db_sym_numargs() at X_db_sym_numargs+0x194 >>>>> scp=0xc025b270 rlv=0xc0398340 (kdb_backtrace+0x3c) >>>>> rsp=0xc9d1aa1c rfp=0xc9d1aa2c >>>>> r4=0xc06bda44 >>>>> kdb_backtrace() at kdb_backtrace+0xc >>>>> scp=0xc0398310 rlv=0xc03ad3f8 (witness_display_spinlock+0x80) >>>>> rsp=0xc9d1aa30 rfp=0xc9d1aa44 >>>>> r4=0x00000001 >>>>> witness_display_spinlock() at witness_display_spinlock+0x5c >>>>> scp=0xc03ad3d4 rlv=0xc03ae6d8 (witness_checkorder+0x884) >>>>> rsp=0xc9d1aa48 rfp=0xc9d1aa98 >>>>> r5=0xc1cf70b0 r4=0xc06263d4 >>>>> witness_checkorder() at witness_checkorder+0xc >>>>> scp=0xc03ade60 rlv=0xc0355b9c (_mtx_lock_flags+0xcc) >>>>> rsp=0xc9d1aa9c rfp=0xc9d1aabc >>>>> r10=0xc1cf70b0 r9=0x00000000 >>>>> r8=0xc091d6e0 r7=0xc0620730 r6=0x00000824 r5=0x00000000 >>>>> r4=0xc091e608 >>>>> _mtx_lock_flags() at _mtx_lock_flags+0xc >>>>> scp=0xc0355adc rlv=0xc057d290 (uma_zalloc_arg+0x1a8) >>>>> rsp=0xc9d1aac0 rfp=0xc9d1aafc >>>>> r7=0xc091d748 r6=0x00000000 >>>>> r5=0xc08fc0b8 r4=0xc0620730 >>>>> uma_zalloc_arg() at uma_zalloc_arg+0xc >>>>> scp=0xc057d0f4 rlv=0xc05abb04 (pmap_growkernel+0xf20) >>>>> rsp=0xc9d1ab00 rfp=0xc9d1ab70 >>>>> r10=0xc1cf70b0 r9=0x00000000 >>>>> r8=0x00000000 r7=0x8122e032 r6=0x00000000 r5=0xc08fc0b8 >>>>> r4=0x00000001 >>>>> pmap_growkernel() at pmap_growkernel+0x3fc >>>>> scp=0xc05aafe0 rlv=0xc05ac14c (pmap_enter+0x70) >>>>> rsp=0xc9d1ab74 rfp=0xc9d1aba0 >>>>> r10=0x00000007 r9=0x00000000 >>>>> r8=0xc09885a8 r7=0xbffff000 r6=0xc08278c0 r5=0xc1cf70b0 >>>>> r4=0xc06261e0 >>>>> pmap_enter() at pmap_enter+0xc >>>>> scp=0xc05ac0e8 rlv=0xc057ff44 (vm_fault_hold+0x1638) >>>>> rsp=0xc9d1aba4 rfp=0xc9d1ad14 >>>>> r10=0xc9d1ade4 r8=0x00000000 >>>>> r7=0x00000002 r6=0x00000000 r5=0xc1cf7000 r4=0xc09885a8 >>>>> vm_fault_hold() at vm_fault_hold+0xc >>>>> scp=0xc057e918 rlv=0xc058054c (vm_fault+0x8c) >>>>> rsp=0xc9d1ad18 rfp=0xc9d1ad3c >>>>> r10=0xc9d1ade4 r9=0x00000002 >>>>> r8=0x00000000 r7=0x00000002 r6=0xbffff000 r5=0xc1cf7000 >>>>> r4=0xc1cf5000 >>>>> vm_fault() at vm_fault+0xc >>>>> scp=0xc05804cc rlv=0xc05b0ac8 (data_abort_handler+0x35c) >>>>> rsp=0xc9d1ad40 rfp=0xc9d1ade0 >>>>> r8=0xbffff000 r7=0xc1cf5000 >>>>> r6=0x00000000 r5=0xc0626844 r4=0xc1cf3088 >>>>> data_abort_handler() at data_abort_handler+0xc >>>>> scp=0xc05b0778 rlv=0xc05a4150 (address_exception_entry+0x50) >>>>> rsp=0xc9d1ade4 rfp=0xc9d1ae84 >>>>> r10=0xc0682bde r9=0xc1cf3000 >>>>> r8=0xc9d1aeac r7=0xc0682bde r6=0xc0682bd4 r5=0xc05f1648 >>>>> r4=0xbfffffff >>>>> exec_shell_imgact() at exec_shell_imgact+0x75c >>>>> scp=0xc031d348 rlv=0xc033b68c (fork_exit+0x94) >>>>> rsp=0xc9d1ae88 rfp=0xc9d1aea8 >>>>> r10=0x00000000 r9=0x00000000 >>>>> r8=0xc9d1aeac r7=0x00000000 r6=0xc031d33c r5=0xc1cf3000 >>>>> r4=0xc1cf5000 >>>>> fork_exit() at fork_exit+0xc >>>>> scp=0xc033b604 rlv=0xc05afe44 (fork_trampoline+0x14) >>>>> rsp=0xc9d1aeac rfp=0x00000000 >>>>> r8=0x00000000 r7=0x8ffbf8ef >>>>> r6=0xf5f7f4a9 r5=0x00000000 r4=0xc031d33c >>>>> >>>>> FreeBSD beaglebone 10.0-CURRENT FreeBSD 10.0-CURRENT #1 r240480: Thu Nov >>>>> 3 14:57:01 PDT 2011 >>>>> jmg@pcbsd-779:/usr/obj/arm.armv6/usr/src.HEAD/sys/helium arm >>>>> >>>>> Do you need any more information? >>>> I'm not sure what to make of this stack trace. In particular, >>>> pmap_enter() does not directly call pmap_growkernel(). In fact, the >>>> only caller to pmap_growkernel() in the entire kernel is >>>> vm_map_insert(), which doesn't appear in this stack trace. Moreover, >>>> this appears to be a legitimate page fault within the kernel address >>>> space. (The image activator touches pageable kernel memory.) However, >>>> such page faults should never need to grow the kernel page table. The >>>> necessary page table pages should have been allocated during >>>> initialization when the various kernel map submaps were created. The >>>> bottom line is that I don't have an immediate answer. I'm going to need >>>> to think about this. In general, I'm a bit uncomfortable with the way >>>> that the l2 and l2 table zones are created. >>> Still getting the LOR.. It always right after boot.. after trying >>> to mount root and warning about no time-of-day clock, but before setting >>> hostuuid.. >> I should be able to spend some time looking into this LOR today and/or >> tomorrow. >> >>>> In the meantime, my next patch is attached. This applies to both the >>>> original pmap and the new v6 pmap. Can some folks here please test >>>> this? Here is the description: >>> I just tried it on my armv6 BEAGLEBONE... It boots find to multiuser >>> mode... Ran a few programs and worked fine... This only tests the >>> pmap-v6.c patch, not the other pmap patch... >> Can someone here please try the following one-line patch to the v6 >> pmap? Knowing that this works will help me to rule out some possible >> explanations for the strange stack trace and LOR above. >> >> Alan > There has been a boot-time LOR involving the same locks, but with a > different backtrace, for a long time. In case it helps, the following > is from freebsd 8.2 (the latest I can test with right now), the pmap.c > in question is r205956 according to the __FBSDID() in the code. There > are actually two LORs below, the first involves the same locks as above, > the second one is different. > > Trying to mount root from ufs:/dev/mmcsd0s1a > warning: no time-of-day clock registered, system time will not be set accurately > lock order reversal: > 1st 0xc0a8a0b0 pmap (pmap) @ /usr/src/sys/arm/arm/pmap.c:971 > 2nd 0xc055b608 PV ENTRY (UMA zone) @ /usr/src/sys/vm/uma_core.c:2055 This is actually the correct lock ordering. So, something happened before this that incorrectly trained witness. Are you able to boot a kernel with a modified kern/subr_witness? If so, then you could explicitly set the lock order to be the above, correct ordering, and witness would report the actual case of LOR. > KDB: stack backtrace: > db_trace_thread() at db_trace_thread+0x10 > scp=0xc0215d64 rlv=0xc0215f68 (db_trace_self+0x1c) > rsp=0xc2e9d950 rfp=0xc2e9d95c > r10=0x00000000 r9=0xc0a4c4b8 > r8=0xc040aec4 r7=0xffffffff r6=0xc0a4c6c0 r5=0xc023a1fc > r4=0xc2e9d968 > db_trace_self() at db_trace_self+0x10 > scp=0xc0215f5c rlv=0xc001ee0c (db_trace_self_wrapper+0x30) > rsp=0xc2e9d960 rfp=0xc2e9da7c > db_trace_self_wrapper() at db_trace_self_wrapper+0x10 > scp=0xc001edec rlv=0xc00ef4f8 (kdb_backtrace+0x3c) > rsp=0xc2e9da80 rfp=0xc2e9da90 > r4=0xc02dbf44 > kdb_backtrace() at kdb_backtrace+0x10 > scp=0xc00ef4cc rlv=0xc0102c38 (_witness_debugger+0x2c) > rsp=0xc2e9da94 rfp=0xc2e9daa8 > r4=0x00000001 > _witness_debugger() at _witness_debugger+0x10 > scp=0xc0102c1c rlv=0xc0103458 (witness_checkorder+0x7e8) > rsp=0xc2e9daac rfp=0xc2e9daf8 > r5=0x00000000 r4=0xc055b608 > witness_checkorder() at witness_checkorder+0x10 > scp=0xc0102c80 rlv=0xc00b37fc (_mtx_lock_flags+0xa4) > rsp=0xc2e9dafc rfp=0xc2e9db20 > r10=0x00000000 r9=0xc043c5ac > r8=0x00000000 r7=0x00000807 r6=0xc026e694 r5=0x00000000 > r4=0xc055b608 > _mtx_lock_flags() at _mtx_lock_flags+0x10 > scp=0xc00b3768 rlv=0xc01f4ccc (uma_zalloc_arg+0x1a8) > rsp=0xc2e9db24 rfp=0xc2e9db60 > r8=0xc055a6e0 r7=0xc055a740 > r6=0xc0272f64 r5=0xc05370b8 r4=0xc026e694 > uma_zalloc_arg() at uma_zalloc_arg+0x10 > scp=0xc01f4b34 rlv=0xc021b4b8 (pmap_get_pv_entry+0x40) > rsp=0xc2e9db64 rfp=0xc2e9db70 > r10=0x00000000 r9=0xc043c5ac > r8=0x2078d55e r7=0xc0a8a0b0 r6=0xc0272f64 r5=0xc05370b8 > r4=0x00000000 > pmap_get_pv_entry() at pmap_get_pv_entry+0x10 > scp=0xc021b488 rlv=0xc0220184 (pmap_enter_locked+0xcfc) > rsp=0xc2e9db74 rfp=0xc2e9dbf0 > pmap_enter_locked() at pmap_enter_locked+0x10 > scp=0xc021f498 rlv=0xc02206e8 (pmap_enter+0x74) > rsp=0xc2e9dbf4 rfp=0xc2e9dc20 > r10=0x00000007 r9=0xc2e9dde4 > r8=0xc056de50 r7=0xbffff000 r6=0xc043a07c r5=0xc0a8a0b0 > r4=0xc0272f64 > pmap_enter() at pmap_enter+0x10 > scp=0xc0220684 rlv=0xc01f80c8 (vm_fault+0x1688) > rsp=0xc2e9dc24 rfp=0xc2e9dd3c > r10=0x00000002 r8=0xbffff000 > r7=0xc0a88000 r6=0x00000000 r5=0xc0273400 r4=0xc056de50 > vm_fault() at vm_fault+0x10 > scp=0xc01f6a50 rlv=0xc0224058 (data_abort_handler+0x360) > rsp=0xc2e9dd40 rfp=0xc2e9dde0 > r10=0x00000002 r9=0xc2e9dde4 > r8=0xbffff000 r7=0xc0a88000 r6=0x00000000 r5=0xc0273400 > r4=0xc0a86088 > data_abort_handler() at data_abort_handler+0x10 > scp=0xc0223d08 rlv=0xc021776c (exception_exit) > rsp=0xc2e9dde4 rfp=0xc2e9de84 > r10=0xc02b7cea r9=0xc0a86000 > r8=0xc2e9deac r7=0xc02b7cea r6=0xc02b7ce0 r5=0xffff1004 > r4=0xffffffff > start_init() at start_init+0x10 > scp=0xc0082244 rlv=0xc009c460 (fork_exit+0x84) > rsp=0xc2e9de88 rfp=0xc2e9dea8 > r10=0x00000000 r9=0x00000000 > r8=0xc2e9deac r7=0x00000000 r6=0xc0082234 r5=0xc0a86000 > r4=0xc0a88000 > fork_exit() at fork_exit+0x10 > scp=0xc009c3ec rlv=0xc022352c (fork_trampoline+0x14) > rsp=0xc2e9deac rfp=0x00000000 > r8=0x00000000 r7=0x6c6c6568 > r6=0x735f6365 r5=0x00000000 r4=0xc0082234 > lock order reversal: > 1st 0xc0a8a0b0 pmap (pmap) @ /usr/src/sys/arm/arm/pmap.c:971 > 2nd 0xc043a07c vm page queue mutex (vm page queue mutex) @ /usr/src/sys/arm/arm/pmap.c:1667 If I'm looking at the right sources, then I don't think that this LOR still exists in HEAD. > KDB: stack backtrace: > db_trace_thread() at db_trace_thread+0x10 > scp=0xc0215d64 rlv=0xc0215f68 (db_trace_self+0x1c) > rsp=0xc2e9d7e4 rfp=0xc2e9d7f0 > r10=0x00000000 r9=0xc0a4c4b8 > r8=0xc040b3ec r7=0xffffffff r6=0xc0a4c178 r5=0xc023a1fc > r4=0xc2e9d7fc > db_trace_self() at db_trace_self+0x10 > scp=0xc0215f5c rlv=0xc001ee0c (db_trace_self_wrapper+0x30) > rsp=0xc2e9d7f4 rfp=0xc2e9d910 > db_trace_self_wrapper() at db_trace_self_wrapper+0x10 > scp=0xc001edec rlv=0xc00ef4f8 (kdb_backtrace+0x3c) > rsp=0xc2e9d914 rfp=0xc2e9d924 > r4=0xc02dbf44 > kdb_backtrace() at kdb_backtrace+0x10 > scp=0xc00ef4cc rlv=0xc0102c38 (_witness_debugger+0x2c) > rsp=0xc2e9d928 rfp=0xc2e9d93c > r4=0x00000001 > _witness_debugger() at _witness_debugger+0x10 > scp=0xc0102c1c rlv=0xc0103458 (witness_checkorder+0x7e8) > rsp=0xc2e9d940 rfp=0xc2e9d98c > r5=0x00000000 r4=0xc043a07c > witness_checkorder() at witness_checkorder+0x10 > scp=0xc0102c80 rlv=0xc00b37fc (_mtx_lock_flags+0xa4) > rsp=0xc2e9d990 rfp=0xc2e9d9b4 > r10=0x00008000 r9=0x00000012 > r8=0x00000000 r7=0x00000683 r6=0xc0272f64 r5=0x00000000 > r4=0xc043a07c > _mtx_lock_flags() at _mtx_lock_flags+0x10 > scp=0xc00b3768 rlv=0xc021b618 (pmap_enter_pv+0x150) > rsp=0xc2e9d9b8 rfp=0xc2e9d9e0 > r8=0xc0272f64 r7=0x00000000 > r6=0xc0a8a0b0 r5=0xc056e120 r4=0xc0689054 > pmap_enter_pv() at pmap_enter_pv+0x10 > scp=0xc021b4d8 rlv=0xc0220260 (pmap_enter_locked+0xdd8) > rsp=0xc2e9d9e4 rfp=0xc2e9da60 > r10=0x00000000 r9=0x00000000 > r8=0x2079600e r7=0xc0a8a0b0 r6=0xc0272f64 r5=0xc0538e90 > r4=0x00000000 > pmap_enter_locked() at pmap_enter_locked+0x10 > scp=0xc021f498 rlv=0xc0220618 (pmap_enter_object+0x8c) > rsp=0xc2e9da64 rfp=0xc2e9da9c > r10=0x00008000 r9=0xc0a8a0b0 > r8=0xc056e120 r7=0x00000000 r6=0x00000099 r5=0xc056e120 > r4=0x0000000f > pmap_enter_object() at pmap_enter_object+0x10 > scp=0xc022059c rlv=0xc01fba6c (vm_map_pmap_enter+0x2a8) > rsp=0xc2e9daa0 rfp=0xc2e9dae4 > r10=0xc056e120 r9=0x00000001 > r8=0x00000000 r7=0x00000099 r6=0x00000000 r5=0x00000000 > r4=0x0000000f > vm_map_pmap_enter() at vm_map_pmap_enter+0x10 > scp=0xc01fb7d4 rlv=0xc01fc268 (vm_map_insert+0x474) > rsp=0xc2e9dae8 rfp=0xc2e9db3c > r10=0x00008000 r9=0x00000000 > r8=0x00000000 r7=0x0000040c r6=0xc0548550 r5=0xc0a8a000 > r4=0x00000000 > vm_map_insert() at vm_map_insert+0x10 > scp=0xc01fbe04 rlv=0xc00800c4 (elf32_map_insert+0x234) > rsp=0xc2e9db40 rfp=0xc2e9db88 > r10=0x000a1000 r9=0xfffff000 > r8=0x00098d64 r7=0x00000000 r6=0x00000000 r5=0x00000005 > r4=0xc02503d0 > elf32_map_insert() at elf32_map_insert+0x10 > scp=0xc007fea0 rlv=0xc0080210 (elf32_load_section+0x100) > rsp=0xc2e9db8c rfp=0xc2e9dbcc > r10=0x00000005 r9=0xfffff000 > r8=0x00098d64 r7=0xc0548550 r6=0x00000000 r5=0x00098d64 > r4=0x00008000 > elf32_load_section() at elf32_load_section+0x10 > scp=0xc0080120 rlv=0xc0081634 (exec_elf32_imgact+0x580) > rsp=0xc2e9dbd0 rfp=0xc2e9dc68 > r10=0xc02ceba8 r9=0xc2ea1034 > r8=0xc2e9dd5c r7=0xc2ea1000 r6=0x00000000 r5=0xc0548550 > r4=0x00008000 > exec_elf32_imgact() at exec_elf32_imgact+0x10 > scp=0xc00810c4 rlv=0xc009927c (kern_execve+0x2f8) > rsp=0xc2e9dc6c rfp=0xc2e9dde8 > r10=0xc0a88000 r9=0xc0a86000 > r8=0xc2e9ddf4 r7=0x00000000 r6=0x00000000 r5=0x00000000 > r4=0x00000001 > kern_execve() at kern_execve+0x10 > scp=0xc0098f94 rlv=0xc0099d4c (execve+0x4c) > rsp=0xc2e9ddec rfp=0xc2e9de2c > r10=0xc02b7cea r9=0xc0a86000 > r8=0xbffffffd r7=0xc02b7cea r6=0xc02b7ce0 r5=0xc2e9ddf4 > r4=0xc0a88000 > execve() at execve+0x10 > scp=0xc0099d10 rlv=0xc008244c (start_init+0x218) > rsp=0xc2e9de30 rfp=0xc2e9de84 > r5=0xbffffff2 r4=0xbfffffe4 > start_init() at start_init+0x10 > scp=0xc0082244 rlv=0xc009c460 (fork_exit+0x84) > rsp=0xc2e9de88 rfp=0xc2e9dea8 > r10=0x00000000 r9=0x00000000 > r8=0xc2e9deac r7=0x00000000 r6=0xc0082234 r5=0xc0a86000 > r4=0xc0a88000 > fork_exit() at fork_exit+0x10 > scp=0xc009c3ec rlv=0xc022352c (fork_trampoline+0x14) > rsp=0xc2e9deac rfp=0x00000000 > r8=0x00000000 r7=0x6c6c6568 > r6=0x735f6365 r5=0x00000000 r4=0xc0082234 > > -- Ian > > >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?505E0739.8080802>