From: Alan Cox
Date: Mon, 24 Sep 2012 01:19:56 -0500
To: Ian Lepore
Cc: "arm@freebsd.org", Alan Cox
Subject: Re: arm pmap locking
Message-ID: <505FFB8C.8050903@rice.edu>
In-Reply-To: <1348346711.15745.3.camel@revolution.hippie.lan>

Going back a few messages, I now see the root of the witness problems in
the stack trace below. Other architectures implement pmap_kextract() in
a lock-less fashion. Arm is implementing pmap_kextract() as
pmap_extract(pmap_kernel(), ...), which isn't lock-less. This locking
is problematic. As long as it exists, uma_dbg_alloc() will acquire the
UMA zone lock before the pmap lock, and elsewhere the pmap lock will be
acquired before the UMA zone lock.

On 09/22/2012 15:45, Ian Lepore wrote:
> On Sat, 2012-09-22 at 13:45 -0500, Alan Cox wrote:
>> On 09/22/2012 12:07, Ian Lepore wrote:
>>> There has been a boot-time LOR involving the same locks, but with a
>>> different backtrace, for a long time. In case it helps, the following
>>> is from freebsd 8.2 (the latest I can test with right now), the pmap.c
>>> in question is r205956 according to the __FBSDID() in the code. There
>>> are actually two LORs below, the first involves the same locks as above,
>>> the second one is different.
>>>
>>> Trying to mount root from ufs:/dev/mmcsd0s1a
>>> warning: no time-of-day clock registered, system time will not be set accurately
>>> lock order reversal:
>>> 1st 0xc0a8a0b0 pmap (pmap) @ /usr/src/sys/arm/arm/pmap.c:971
>>> 2nd 0xc055b608 PV ENTRY (UMA zone) @ /usr/src/sys/vm/uma_core.c:2055
>>
>> This is actually the correct lock ordering. So, something happened
>> before this that incorrectly trained witness. Are you able to boot a
>> kernel with a modified kern/subr_witness? If so, then you could
>> explicitly set the lock order to be the above, correct ordering, and
>> witness would report the actual case of LOR.
> Yep. Looks like the first one (reversed) is very very early in the
> init. After this one, there were no other LORs during the boot until
> the one that you said is fixed now in -current.
>
> lock order reversal:
> 1st 0xc055bd08 4096 (UMA zone) @ /usr/src/sys/vm/uma_core.c:2011
> 2nd 0xc043c198 pmap (pmap) @ /usr/src/sys/arm/arm/pmap.c:3734
> KDB: stack backtrace:
> db_trace_thread() at db_trace_thread+0x10
>     scp=0xc0215da0 rlv=0xc0215fa4 (db_trace_self+0x1c)
>     rsp=0xc04ddc28 rfp=0xc04ddc34
>     r10=0x00000000 r9=0xc0a4b0d0
>     r8=0xc040ba9c r7=0xffffffff r6=0xc0a4b068 r5=0xc023a23c
>     r4=0xc04ddc40
> db_trace_self() at db_trace_self+0x10
>     scp=0xc0215f98 rlv=0xc001ee0c (db_trace_self_wrapper+0x30)
>     rsp=0xc04ddc38 rfp=0xc04ddd54
> db_trace_self_wrapper() at db_trace_self_wrapper+0x10
>     scp=0xc001edec rlv=0xc00ef4f8 (kdb_backtrace+0x3c)
>     rsp=0xc04ddd58 rfp=0xc04ddd68
>     r4=0xc02dbfc4
> kdb_backtrace() at kdb_backtrace+0x10
>     scp=0xc00ef4cc rlv=0xc0102c48 (_witness_debugger+0x2c)
>     rsp=0xc04ddd6c rfp=0xc04ddd80
>     r4=0x00000001
> _witness_debugger() at _witness_debugger+0x10
>     scp=0xc0102c2c rlv=0xc0103468 (witness_checkorder+0x7e8)
>     rsp=0xc04ddd84 rfp=0xc04dddd0
>     r5=0x00000000 r4=0xc043c198
> witness_checkorder() at witness_checkorder+0x10
>     scp=0xc0102c90 rlv=0xc00b37fc (_mtx_lock_flags+0xa4)
>     rsp=0xc04dddd4 rfp=0xc04dddf8
>     r10=0x00000a10 r9=0x00000002
>     r8=0x00000000 r7=0x00000e96 r6=0xc0272fdc r5=0x00000000
>     r4=0xc043c198
> _mtx_lock_flags() at _mtx_lock_flags+0x10
>     scp=0xc00b3768 rlv=0xc021b2b8 (pmap_extract+0x2c)
>     rsp=0xc04dddfc rfp=0xc04dde18
>     r8=0xc05550a0 r7=0xc043c198
>     r6=0xc0a49000 r5=0xc043c198 r4=0x00000c0a
> pmap_extract() at pmap_extract+0x10
>     scp=0xc021b29c rlv=0xc01f572c (uma_dbg_getslab+0x28)
>     rsp=0xc04dde1c rfp=0xc04dde28
>     r7=0xc0555100 r6=0xc0a49000
>     r5=0x00000000 r4=0xc05550a0
> uma_dbg_getslab() at uma_dbg_getslab+0x10
>     scp=0xc01f5714 rlv=0xc01f596c (uma_dbg_alloc+0x24)
>     rsp=0xc04dde2c rfp=0xc04dde44
> uma_dbg_alloc() at uma_dbg_alloc+0x10
>     scp=0xc01f5958 rlv=0xc01f4c4c (uma_zalloc_arg+0xec)
>     rsp=0xc04dde48 rfp=0xc04dde84
>     r6=0xc05550a0 r5=0xc0a49000
>     r4=0xc026e718
> uma_zalloc_arg() at uma_zalloc_arg+0x10
>     scp=0xc01f4b70 rlv=0xc00aff80 (malloc+0xf8)
>     rsp=0xc04dde88 rfp=0xc04ddeb0
>     r10=0x00000a10 r9=0x00000002
>     r8=0x00000020 r7=0xc02bf588 r6=0xc05550a0 r5=0x00000102
>     r4=0x00000008
> malloc() at malloc+0x10
>     scp=0xc00afe98 rlv=0xc00b28c8 (mtx_pool_create+0x54)
>     rsp=0xc04ddeb4 rfp=0xc04dded0
>     r10=0x204fe888 r9=0x00000019
>     r8=0x20479a90 r7=0x00000000 r6=0xc0254a30 r5=0x00000080
>     r4=0xc0276a90
> mtx_pool_create() at mtx_pool_create+0x10
>     scp=0xc00b2884 rlv=0xc00b290c (mtx_pool_setup_dynamic+0x1c)
>     rsp=0xc04dded4 rfp=0xc04ddee0
>     r7=0x20000124 r6=0x00000004
>     r5=0x20000130 r4=0xc0276a90
> mtx_pool_setup_dynamic() at mtx_pool_setup_dynamic+0x10
>     scp=0xc00b2900 rlv=0xc0082654 (mi_startup+0xf8)
>     rsp=0xc04ddee4 rfp=0xc04ddef4
> mi_startup() at mi_startup+0x10
>     scp=0xc008256c rlv=0xc00001cc (virt_done+0x18)
>     rsp=0xc04ddef8 rfp=0x00000000
>     r4=0x2000020c
>
> -- Ian
>
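
To make the ordering conflict described at the top of this message concrete, here is a rough userspace sketch, not kernel code. It assumes a toy order checker in the spirit of witness; every name in it (sketch_lock(), alloc_path(), pmap_path(), count_reversals(), the lock_id enum) is hypothetical and does not exist in the FreeBSD tree. alloc_path() stands in for the uma_dbg_alloc() path, which takes the zone lock and then, only when the kextract is not lock-less, the pmap lock. pmap_path() stands in for pmap code that allocates a PV entry while holding the pmap lock.

```c
#include <stdbool.h>

/* Hypothetical lock identifiers for this sketch; not kernel symbols. */
enum lock_id { LOCK_UMA_ZONE, LOCK_PMAP, LOCK_COUNT };

/* order_seen[a][b] becomes true once b was acquired while a was held. */
static bool order_seen[LOCK_COUNT][LOCK_COUNT];
static bool held[LOCK_COUNT];
static int reversals;

static void sketch_reset(void)
{
    for (int a = 0; a < LOCK_COUNT; a++) {
        held[a] = false;
        for (int b = 0; b < LOCK_COUNT; b++)
            order_seen[a][b] = false;
    }
    reversals = 0;
}

/* Record the acquisition; count a reversal if the opposite order was seen. */
static void sketch_lock(enum lock_id l)
{
    for (int h = 0; h < LOCK_COUNT; h++) {
        if (!held[h] || h == (int)l)
            continue;
        if (order_seen[l][h])
            reversals++;
        order_seen[h][l] = true;
    }
    held[l] = true;
}

static void sketch_unlock(enum lock_id l)
{
    held[l] = false;
}

/* Allocation path: zone lock first, then (unless lock-less) the pmap lock. */
static void alloc_path(bool lockless_kextract)
{
    sketch_lock(LOCK_UMA_ZONE);
    if (!lockless_kextract) {
        sketch_lock(LOCK_PMAP);
        sketch_unlock(LOCK_PMAP);
    }
    sketch_unlock(LOCK_UMA_ZONE);
}

/* pmap path that allocates a PV entry: pmap lock first, then zone lock. */
static void pmap_path(void)
{
    sketch_lock(LOCK_PMAP);
    sketch_lock(LOCK_UMA_ZONE);
    sketch_unlock(LOCK_UMA_ZONE);
    sketch_unlock(LOCK_PMAP);
}

/* Run the early allocation path, then the pmap path; return reversals seen. */
int count_reversals(bool lockless_kextract)
{
    sketch_reset();
    alloc_path(lockless_kextract);
    pmap_path();
    return reversals;
}
```

In this sketch, count_reversals(false) reports one reversal, and it is reported on the pmap path even though that is the correct order, because the earlier allocation path recorded the opposite order first. With count_reversals(true) the allocation path never touches the pmap lock, so no reversal is recorded.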