From owner-freebsd-hackers@freebsd.org Thu Sep 14 00:47:51 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7CC60E18877 for ; Thu, 14 Sep 2017 00:47:51 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-210-66.reflexion.net [208.70.210.66]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3F9B781AF4 for ; Thu, 14 Sep 2017 00:47:50 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 3235 invoked from network); 14 Sep 2017 00:53:11 -0000 Received: from unknown (HELO mail-cs-01.app.dca.reflexion.local) (10.81.19.1) by 0 (rfx-qmail) with SMTP; 14 Sep 2017 00:53:11 -0000 Received: by mail-cs-01.app.dca.reflexion.local (Reflexion email security v8.40.2) with SMTP; Wed, 13 Sep 2017 20:47:43 -0400 (EDT) Received: (qmail 13704 invoked from network); 14 Sep 2017 00:47:43 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 14 Sep 2017 00:47:43 -0000 Received: from [192.168.1.109] (c-67-170-167-181.hsd1.or.comcast.net [67.170.167.181]) by iron2.pdx.net (Postfix) with ESMTPSA id 8E8B7EC7C39; Wed, 13 Sep 2017 17:47:42 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: FYI: Pine64+ 2GB (so A64) booting and non-debug vs. debug kernel: nondebug+INVARIANTS+INVARIANT_SUPPORT sufficient to boot From: Mark Millard In-Reply-To: Date: Wed, 13 Sep 2017 17:47:41 -0700 Cc: freebsd-hackers Content-Transfer-Encoding: quoted-printable Message-Id: <6D63486A-E933-4CC2-9A24-0688BE01A0DA@dsl-only.net> References: <1C18FF04-6772-4E9C-88C5-B8D5478C5809@dsl-only.net> To: Emmanuel Vadot , freebsd-arm X-Mailer: Apple Mail (2.3273) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Sep 2017 00:47:51 -0000 [This time a debug kernel (including witness) and verbose booting. Also showing what has spin locks active (none) and what has critical sections mentioned on the back traces (critical_exit).] On 2017-Sep-12, at 11:16 PM, Mark Millard = wrote: > [Back to nooptions for INVARIANTS and INVARIANT_SUPPORT > but now verbose booting. taskqgroup_adjust_softirq(0)... > is the one to not get a "done." before failure.] >=20 > On 2017-Sep-12, at 7:19 PM, Mark Millard = wrote: >=20 >> I took my normal GENERIC-NODBG (that includes GENERIC) >> and changed INVARIANTS and INVARIANT_SUPPORT to have >> "options" status instead of "nooptions" status. The >> result boots (so far no counterexamples). (This is >> head -r323246 .) >>=20 >> So it appears that one or more INVARIANT tests are >> "fixing" the Pine64+ 2GB boot problem. I've no clue >> which. But other debug options are not required. >>=20 >> FYI. . . >>=20 >> # more /usr/src/sys/arm64/conf/GENERIC-NODBG = = =20 >> # >> # GENERIC -- Custom configuration for the arm64/aarch64 >> # >>=20 >> include "GENERIC" >>=20 >> ident GENERIC-NODBG >>=20 >> makeoptions DEBUG=3D-g # Build kernel with gdb(1) = debug symbols >>=20 >> options ALT_BREAK_TO_DEBUGGER >>=20 >> options KDB # Enable kernel debugger = support >>=20 >> # For minimum debugger support (stable branch) use: >> #options KDB_TRACE # Print a stack trace for a = panic >> options DDB # Enable the kernel debugger >>=20 >> # Extra stuff: >> #options VERBOSE_SYSINIT # Enable verbose sysinit = messages >> #options BOOTVERBOSE=3D1 >> #options BOOTHOWTO=3DRB_VERBOSE >> #options KTR >> #options KTR_MASK=3DKTR_TRAP >> ##options KTR_CPUMASK=3D0xF >> #options KTR_VERBOSE >>=20 >> # Disable any extra checking for. . . >> nooptions DEADLKRES # Enable the deadlock = resolver >> options INVARIANTS # Enable calls of extra = sanity checking >> options INVARIANT_SUPPORT # Extra sanity checks of = internal structures, required by INVARIANTS >> nooptions WITNESS # Enable checks to detect = deadlocks and cycles >> nooptions WITNESS_SKIPSPIN # Don't run witness on = spinlocks for speed >> nooptions DIAGNOSTIC >> nooptions MALLOC_DEBUG_MAXZONES # Separate malloc(9) zones >> nooptions BUF_TRACKING >> nooptions FULL_BUF_TRACKING >=20 > I've changed to have: >=20 > options VERBOSE_SYSINIT # Enable verbose sysinit = messages > options BOOTVERBOSE=3D1 > options BOOTHOWTO=3DRB_VERBOSE >=20 > and: >=20 > nooptions INVARIANTS # Enable calls of extra = sanity checking > nooptions INVARIANT_SUPPORT # Extra sanity checks of = internal structures, required by INVARIANTS >=20 > The tail of the verbose failing boot looks like: >=20 > . . . > vt_upgrade(&vt_consdev)... done. > subsystem b000000 > nfs_rootconf(0)... done. > fhanew_init(0)... done. > subsystem d000000 > proc0_post(0)... done. > subsystem d800000 > sctp_syscalls_init(0)... done. > selectinit(0)... done. > subsystem dffff9c > linker_preload_finish(0)... done. > subsystem e000000 > kick_init(0)... done. > kstack_cache_init(0)... done. > subsystem e400000 > vm_pageout_init(0)... done. > $x.1(&page_kp)... done. > subsystem e800000 > $x.1(&vm_kp)... done. > subsystem ea00000 > $x.1(&bufspace_kp)... done. > $x.1(&buf_kp)... done. > subsystem ec00000 > $x.1(&vnlru_kp)... done. > $x.1(&up_kp)... done. > subsystem ee00000 > acpi_acad_ac_only(0)... done. > nfsiod_setup(0)... done. > subsystem f000000 > release_aps(0)... Release APs > APs not started > done. > tmr_setup_user_access(0)... done. > intr_irq_shuffle(0)... done. > tqg_record_smp_started(0)... done. > netisr_start(0)... done. > cpuset_init(0)... done. > taskqgroup_adjust_if_config_tqg(0)... done. > identify_cpu_sysinit(0)... CPU 0: ARM Cortex-A53 r0p4 affinity: 0 > Instruction Set Attributes 0 =3D > Instruction Set Attributes 1 =3D <0> > Processor Features 0 =3D > Processor Features 1 =3D <0> > Memory Model Features 0 =3D <4k Granule,64k = Granule,MixedEndian,S/NS Mem,16bit ASID,1TB PA> > Memory Model Features 1 =3D <> > Debug Features 0 =3D <2 CTX Breakpoints,4 Watchpoints,6 = Breakpoints,PMUv3,Debug v8> > Debug Features 1 =3D <0> > Auxiliary Features 0 =3D <0> > Auxiliary Features 1 =3D <0> > CPU 1: (null) (null) r0p0 affinity: 0 > CPU 2: (null) (null) r0p0 affinity: 0 > CPU 3: (null) (null) r0p0 affinity: 0 > done. > taskqgroup_adjust_softirq(0)... x0: ffff000000a1c080 > x1: fffffd0001031a80 > x2: 3 > [ thread pid 0 tid 100055 ] > Stopped at thread_lock_flags_+0x298: ldr w4, [x3, #156] > db>=20 >=20 > taskqgroup_adjust_softirq seems to be from: >=20 > /usr/src/sys/kern/subr_gtaskqueue.c : >=20 > TASKQGROUP_DEFINE(softirq, mp_ncpus, 1); [The above was a non-debug kernel with verbose messages.] So a debug kernel with verbose boot messages: CPU 1: (null) (null) r0p0 affinity: 0 CPU 2: (null) (null) r0p0 affinity: 0 CPU 3: (null) (null) r0p0 affinity: 0 done. taskqgroup_adjust_softirq(0)... panic: acquiring blockable sleep lock = with spinlock or critical section held (sleep mutex) pmap @ = /usr/src/sys/arm64/arm64/pmap.c:4710 cpuid =3D 0 time =3D 13 Thus the non-debug kernel boot-failures stop during "taskqgroup_adjust_softirq(0)..." and that is also what the debug kernel reports via witness (or invariant testing if witness is disabled). Witness does catch the problem somewhat earlier than invariant in the code sequence (when the race happens). Without invariants (and without witness) the failure seems to happen reliably. For this witness context. . . db> show allpcpu Current CPU: 0 cpuid =3D 0 dynamic pcpu =3D 0x84af00 curthread =3D 0xfffffd0001415a80: pid 0 tid 100058 "softirq_1" curpcb =3D 0xffff000069850cb0 fpcurthread =3D none idlethread =3D 0xfffffd00005de000: tid 100003 "idle: cpu0" spin locks held: cpuid =3D 1 dynamic pcpu =3D 0x81324f00 curthread =3D none curpcb =3D 0 fpcurthread =3D none idlethread =3D 0xfffffd00005dda80: tid 100004 "idle: cpu1" spin locks held: cpuid =3D 2 dynamic pcpu =3D 0x81325f00 curthread =3D none curpcb =3D 0 fpcurthread =3D none idlethread =3D 0xfffffd00005dd540: tid 100005 "idle: cpu2" spin locks held: cpuid =3D 3 dynamic pcpu =3D 0x81326f00 curthread =3D none curpcb =3D 0 fpcurthread =3D none idlethread =3D 0xfffffd00005dd000: tid 100006 "idle: cpu3" spin locks held: So no spin locks held. As for critical sections. . . db> show allchains . . . (just ones mentioning "on a run queue"). . . chain 20: thread 100014 (pid 12, swi4: clock (0)) blocked on lock = 0xffff000000c5d8e0 (sleep mutex) "Giant" thread 100000 (pid 0, swapper) is on a run queue chain 21: thread 100002 (pid 1, kernel) blocked on lock 0xffff000000c5d8e0 (sleep = mutex) "Giant" thread 100000 (pid 0, swapper) is on a run queue . . . db> thread 100000 [ thread pid 0 tid 100000 ] 0 db> bt Tracing pid 0 tid 100000 td 0xffff000000acd580 sched_switch() at mi_switch+0x1b8 pc =3D 0xffff00000033f494 lr =3D 0xffff000000321754 sp =3D 0xffff0000000109f0 fp =3D 0xffff000000010a10 mi_switch() at critical_exit+0x84 pc =3D 0xffff000000321754 lr =3D 0xffff00000031e72c sp =3D 0xffff000000010a20 fp =3D 0xffff000000010a30 critical_exit() at spinlock_exit+0x10 pc =3D 0xffff00000031e72c lr =3D 0xffff0000005f83b4 sp =3D 0xffff000000010a40 fp =3D 0xffff000000010a50 spinlock_exit() at wakeup_one+0x30 pc =3D 0xffff0000005f83b4 lr =3D 0xffff00000032157c sp =3D 0xffff000000010a60 fp =3D 0xffff000000010a70 wakeup_one() at grouptaskqueue_enqueue+0xcc pc =3D 0xffff00000032157c lr =3D 0xffff0000003533ec sp =3D 0xffff000000010a80 fp =3D 0xffff000000010aa0 =20 grouptaskqueue_enqueue() at taskqgroup_adjust+0x83c pc =3D 0xffff0000003533ec lr =3D 0xffff00000035483c sp =3D 0xffff000000010ab0 fp =3D 0xffff000000010b40 taskqgroup_adjust() at mi_startup+0x254 pc =3D 0xffff00000035483c lr =3D 0xffff0000002b5704 sp =3D 0xffff000000010b50 fp =3D 0xffff000000010bb0 mi_startup() at virtdone+0x54 pc =3D 0xffff0000002b5704 lr =3D 0xffff000000001084 sp =3D 0xffff000000010bc0 fp =3D 0x0000000000000000 From: db> show threads . . . (just ones mentioning critical_exit). . . 100027 (0xfffffd000062e000) (stack 0xffff00006a58a000) 100033 = (0xfffffd0000796000) (stack 0xffff00006a5a9000) 100034 = (0xfffffd0000795a80) (stack 0xffff00006a5b6000) 100003 = (0xfffffd00005de000) (stack 0xffff000081baa000) sched_switch() at = mi_switch+0x1b8 pc =3D 0xffff00000033f494 lr =3D 0xffff000000321754 sp =3D 0xffff000081bada20 fp =3D 0xffff000081bada40 mi_switch() at critical_exit+0x84 pc =3D 0xffff000000321754 lr =3D 0xffff00000031e72c sp =3D 0xffff000081bada50 fp =3D 0xffff000081bada60 critical_exit() at cpu_idle+0x3c pc =3D 0xffff00000031e72c lr =3D 0xffff0000005f8308 sp =3D 0xffff000081bada70 fp =3D 0xffff000081bada80 cpu_idle() at sched_idletd+0xf4 pc =3D 0xffff0000005f8308 lr =3D 0xffff000000341b84 sp =3D 0xffff000081bada90 fp =3D 0xffff000081badb50 sched_idletd() at fork_exit+0x7c pc =3D 0xffff000000341b84 lr =3D 0xffff0000002dbe74 sp =3D 0xffff000081badb60 fp =3D 0xffff000081badb90 fork_exit() at fork_trampoline+0x10 pc =3D 0xffff0000002dbe74 lr =3D 0xffff000000608664 sp =3D 0xffff000081badba0 fp =3D 0x0000000000000000 . . . I did not find any other references to "critical". Only swapper listed on the run queue as far as critical_exit goes. The other critical_exit's were from cpu_idle. It appears to me as fairly likely that what witness and invariant reports sometimes is the same thing that is involved for the non-debug kernels run into (more) reliably: non-debug is likely hanging on the lock attempt while (it appears that) a critical section is still active. As near as I can tell some invariant logic makes the critical section vs. blockable lock conflict far less likely to happen: some form of race. Thus the invariant-only and full-debug kernels usually booting, but not always booting. (But I make no claim to be expert in these areas.) =3D=3D=3D Mark Millard markmi at dsl-only.net