From owner-freebsd-arm@freebsd.org Fri Sep 15 04:14:59 2017 Return-Path: Delivered-To: freebsd-arm@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B6F53E23522 for ; Fri, 15 Sep 2017 04:14:59 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-210-66.reflexion.net [208.70.210.66]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 74F3076C70 for ; Fri, 15 Sep 2017 04:14:58 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 17477 invoked from network); 15 Sep 2017 04:14:52 -0000 Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 15 Sep 2017 04:14:52 -0000 Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v8.40.2) with SMTP; Fri, 15 Sep 2017 00:14:52 -0400 (EDT) Received: (qmail 19639 invoked from network); 15 Sep 2017 04:14:52 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 15 Sep 2017 04:14:52 -0000 Received: from [192.168.1.109] (c-67-170-167-181.hsd1.or.comcast.net [67.170.167.181]) by iron2.pdx.net (Postfix) with ESMTPSA id 90F37EC91CC; Thu, 14 Sep 2017 21:14:51 -0700 (PDT) From: Mark Millard Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: FYI: Pine64+ 2GB (so A64) booting and non-debug vs. debug kernel: nondebug+INVARIANTS+INVARIANT_SUPPORT sufficient to boot Date: Thu, 14 Sep 2017 21:14:50 -0700 References: <1C18FF04-6772-4E9C-88C5-B8D5478C5809@dsl-only.net> <6D63486A-E933-4CC2-9A24-0688BE01A0DA@dsl-only.net> To: Emmanuel Vadot , freebsd-arm , freebsd-hackers In-Reply-To: <6D63486A-E933-4CC2-9A24-0688BE01A0DA@dsl-only.net> Message-Id: <8E15A747-3413-4537-9ECA-5EDAD1285351@dsl-only.net> X-Mailer: Apple Mail (2.3273) X-BeenThere: freebsd-arm@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Porting FreeBSD to ARM processors." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Sep 2017 04:14:59 -0000 [I've traced the failure back to the bad pointer value that is in place when it was put to use. I omit the prior details of earlier explorations that got me into this area.] Summary of the following investigations: When the witness or invariant failure during: taskqgroup_adjust_softirq(0)... happens it traces back to the condition: pcpu_find(cpu)->pc_curthread =3D=3D NULL in the code: ctd =3D pcpu_find(cpu)->pc_curthread; for cpu =3D=3D 1 in tdq_notify (but inlined). It then attempts to evaluate: ctd->td_priority which gets a data_abort but it is during when blocked_lock is the container lock for the thread. In the witness included case this leads to: panic: acquiring blockable sleep lock with spinlock or critical section = held (sleep mutex) pmap @ /usr/src/sys/arm64/arm64/pmap.c:4710 cpuid =3D 0 time =3D 13 but that is a later consequence of the earlier problem. I'm not sure why pcpu_find(cpu)->pc_curthread is still NULL but since the behavior is intermittent for debug kernel builds it suggests a race for an update that was initiated but not always finished in time. (I've had occasions of hours of reboots to try to get a failure but mostly only needing a few. I've not run into a long sequence of failures to boot for a debug kernel but have had some back-to-back ones.) Supporting detail that lead to the above: int pmap_fault(pmap_t pmap, uint64_t esr, uint64_t far) The far (fault address register) argument to pmap_fault is the=20 rd one (x2 below): ffff000000606954 stp x22, x21, [sp, #-48]! ffff000000606958 stp x20, x19, [sp, #16] ffff00000060695c stp x29, x30, [sp, #32] ffff000000606960 add x29, sp, #0x20 ffff000000606964 mov x20, x2 ffff000000606968 mov x22, x1 ffff00000060696c mov x21, x0 For the failing call sequence far ends up stored on the stack via the x20 save-to-stack in: ffff0000002f8c0c <__mtx_lock_flags> stp x24, x23, [sp, #-64]! ffff0000002f8c10 <__mtx_lock_flags+0x4> stp x22, x21, [sp, #16] ffff0000002f8c14 <__mtx_lock_flags+0x8> stp x20, x19, [sp, #32] ffff0000002f8c18 <__mtx_lock_flags+0xc> stp x29, x30, [sp, #48] ffff0000002f8c1c <__mtx_lock_flags+0x10> add x29, sp, #0x30 Stack segment with a little context: 0xffff000069850470: ffff0000698504b0 ffff0000002f8b80 0xffff000069850480: ffff000000c6a528 0 0xffff000069850490: 96000004 ffff000000c6a658 0xffff0000698504a0: 37e ffff000000c6a670 0xffff0000698504b0: ffff0000698504e0 ffff000000606998 So it appears: pmap_fault`esr =3D=3D 0x96000004 pmap_fault`pmap =3D=3D 0xffff000000c6a658 (vmspace0+0x130) pmap_fault`far =3D=3D 0x37e I'll note that 0x37e =3D 894 so it matches up with x8 =3D=3D 0x0 for the likes of: elr 0xffff00000033f0dc sched_switch+0x2bc . . . ssched_switch+0x2b8: ldrb w9, [x8, #894] matching: db> show reg . . . x8 0 . . . So apparently sched_switch tried to access 0x37e db> x/gx 0xffff000000606998 =20 pmap_fault+0x44: f100111f927e0ec8 Part of the back trace is (for the example debug kernel): kassert_panic() at witness_checkorder+0x160 pc =3D 0xffff0000003174e4 lr =3D 0xffff000000374990 sp =3D 0xffff0000698503f0 fp =3D 0xffff000069850470 witness_checkorder() at __mtx_lock_flags+0xa8 pc =3D 0xffff000000374990 lr =3D 0xffff0000002f8b7c sp =3D 0xffff000069850480 fp =3D 0xffff0000698504b0 __mtx_lock_flags() at pmap_fault+0x40 pc =3D 0xffff0000002f8b7c lr =3D 0xffff000000606994 sp =3D 0xffff0000698504c0 fp =3D 0xffff0000698504e0 =20 pmap_fault() at data_abort+0xb8 pc =3D 0xffff000000606994 lr =3D 0xffff000000608a9c sp =3D 0xffff0000698504f0 fp =3D 0xffff0000698505a0 data_abort() at do_el1h_sync+0xfc pc =3D 0xffff000000608a9c lr =3D 0xffff0000006088f0 sp =3D 0xffff0000698505b0 fp =3D 0xffff0000698505e0 do_el1h_sync() at handle_el1h_sync+0x74 pc =3D 0xffff0000006088f0 lr =3D 0xffff0000005f1874 sp =3D 0xffff0000698505f0 fp =3D 0xffff000069850700 handle_el1h_sync() at sched_switch+0x2a8 pc =3D 0xffff0000005f1874 lr =3D 0xffff00000033f0c8 sp =3D 0xffff000069850710 fp =3D 0xffff0000698507f0 sched_switch() at mi_switch+0x1b8 pc =3D 0xffff00000033f0c8 lr =3D 0xffff00000032161c sp =3D 0xffff000069850800 fp =3D 0xffff000069850820 mi_switch() at taskqgroup_binder+0x7c pc =3D 0xffff00000032161c lr =3D 0xffff00000035510c sp =3D 0xffff000069850830 fp =3D 0xffff000069850860 taskqgroup_binder() at gtaskqueue_run_locked+0x104 pc =3D 0xffff00000035510c lr =3D 0xffff000000354f74 sp =3D 0xffff000069850870 fp =3D 0xffff0000698508e0 =20 gtaskqueue_run_locked() at gtaskqueue_thread_loop+0x9c pc =3D 0xffff000000354f74 lr =3D 0xffff000000354d10 sp =3D 0xffff0000698508f0 fp =3D 0xffff000069850910 gtaskqueue_thread_loop() at fork_exit+0x7c pc =3D 0xffff000000354d10 lr =3D 0xffff0000002dbd3c sp =3D 0xffff000069850920 fp =3D 0xffff000069850950 fork_exit() at fork_trampoline+0x10 pc =3D 0xffff0000002dbd3c lr =3D 0xffff000000608664 sp =3D 0xffff000069850960 fp =3D 0x0000000000000000 Note: ffff00000033f0bc ldr w0, [x19, #1292] ffff00000033f0c0 ldrb w27, [x19, #894] ffff00000033f0c4 str w0, [sp, #12] ffff00000033f0c8 bl ffff000000359708 = ffff00000033f0cc ldr x8, [x0] ffff00000033f0d0 mov w11, w27 ffff00000033f0d4 adrp x27, ffff000000c85000 = ffff00000033f0d8 ldrb w9, [x8, #894] ffff00000033f0dc cmp w11, w9 ffff00000033f0e0 b.cs ffff00000033f150 = // b.hs, b.nlast This is code for the later part of what is shown below: static void tdq_notify(struct tdq *tdq, struct thread *td) { struct thread *ctd; int pri; int cpu; if (tdq->tdq_ipipending) return; cpu =3D td_get_sched(td)->ts_cpu; pri =3D td->td_priority; ctd =3D pcpu_find(cpu)->pc_curthread; if (!sched_shouldpreempt(pri, ctd->td_priority, 1)) return; . . . } (Where: sched_shouldpreempt has been inlined and some of it is interlaced.) The failing [x8, #894] is the attempt to access: ctd->td_priority In other words: ctd =3D=3D NULL resulted from the pcpu_find (i.e., x8 =3D=3D 0 ). As for how it got to be zero: db> show reg spsr 0x9600000440000085 x0 0xffff000000ac1000 __pcpu+0x200 . . . db> x/gx cpuid_to_pcpu,32 cpuid_to_pcpu: ffff000000ac0e00 ffff000000ac1000 cpuid_to_pcpu+0x10: ffff000000ac1200 ffff000000ac1400 cpuid_to_pcpu+0x20: 0 0 . . . (So cpu =3D=3D 1 .) db> x/gx 0xffff000000ac1000,8 __pcpu+0x200: 0 fffffd00005dda80 __pcpu+0x210: 0 0 __pcpu+0x220: 0 0 __pcpu+0x230: 100000000 ffff000000ac1200 Thus it seems that at the time for cpu=3D=3D1 : pcpu_find(cpu)->pc_curthread =3D=3D NULL (at __pcpu+0x200). =3D=3D=3D Mark Millard markmi at dsl-only.net