From owner-freebsd-ppc@freebsd.org Thu Jun 6 06:30:00 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8A77715C58C5 for ; Thu, 6 Jun 2019 06:30:00 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic308-8.consmr.mail.gq1.yahoo.com (sonic308-8.consmr.mail.gq1.yahoo.com [98.137.68.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4E8B886F19 for ; Thu, 6 Jun 2019 06:29:59 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: _HCWmj0VM1lK4tka6q3trNkY9_ZWbGmZmrH5X5mRl2T0LtEGFMXPYwJVlHaSv_m oagpdDy10JyfYUYYI9O9pgzeRbyuCUCzJPLV1W_HYgfz9FSvAh31M6Aw9Vpr.CtmNXH9TTrg39gI EBofBJFW37yaFYOAImss.mHJgc4EkKLGZNAkl64F4g6otKRRjhF20BY5JD0N21ClYgs.yN0W565C NdYENFV9JtAfXzxOJNJrjCv3MbcSDgAAZbpqQGdbdV.iwPJLgi5FQuh7_w3OoJvMLWLdV.uT1KlI PGVHOgFny0hxKaTykP4FcbdUry0TSIVljhEmj713wdBAY27rX2zoecP7m3wMY0zm_WtvcONlR3mR McGTsxUHvsTW9X.FSYIT3kPKjbZBAdb_EejndRgIXNt.01rzsVqgZxyAlwGWOIc4SfioD8u3ObrG .8kVHqOECoSUrpJ2ZI4kc_Xpu4kJfrE_XksIATpE5zq2E3s5RIGAxIlPlvnsaE8jkQZpUvD.XNMT a3U7wi6HE4c._XxJtYyUhPbFDcf9XUrm75en_q8388pe1TduRbDvFMg7HQoc_sngZusO_tR27mJc 7ZefMQ9Ep_t1F9YPbD3vZi8ZS2Mg_XGnOMhSV2j54K2aq7qlRDiLAtpTdZCuIlVfAkuT3SBdNcI6 uofz57c.wcVtwOa_6IYcuB8aaJP7RWTDaLvKNB5C1q05EpQplIUMohgLlBGsPDBqQQeOh5z6du5i ui3WAo3501LhUBfHVe5ey.1Tdn21kT4oH5mrHHh4IvnOvMPpJcvSIx12SrNkzjO2PjgndX4YVK3J 19.JziXJDBLnf0BwsskV_C5BqBE4vG7T.nyhoDPRarYaNvU8Ql3L9mjuPTfSLFXXtKsdD1SRXuHn E5r94K1ITnVIq_ULQL4VEWCIugfeK_hP1xjp.QICjjmymPbSP5hG4FGdfsGvqp7YhkfcjCzrygO8 8yb5JZP6nltIvfgXVRjt00qsW3NBzj7YeHi2pdBU0W_oqduRuuyvPKQYZc_7_fKTG9kbOkgRVGsL Wz9RWbsgHrJeNCYaA39KiMrEb.lcCvpSIOpEG0SJ0Qw7H1urLcjvITf46oKwXtVUyAdRzOnPC9t0 V1pUZQ2Q2in_v0wmgmHg5Pt1L_EiR5eGP6DZ4.uj4twbqHZ.KRHO.mdhFJMJz8J4d71u5orBJTHH P2vzNv4IVth.w8S3e5nQbToGNbv50VLNYbF7ZtGC_02G_A80DDcFMXVKsfpy2xWxMugKPbF5gKg- - Received: from sonic.gate.mail.ne1.yahoo.com by sonic308.consmr.mail.gq1.yahoo.com with HTTP; Thu, 6 Jun 2019 06:29:52 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.115]) ([67.170.167.181]) by smtp413.mail.gq1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID fc3bd0df7b029434349f7020e6ae02b8; Thu, 06 Jun 2019 06:29:50 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\)) Subject: Re: crash of 32-bit powerpc -r347549 kernel built via system-clang-8 (crash is while trying to mount the root file system) [debug kernel case: code generation error] [I was wrong] From: Mark Millard In-Reply-To: Date: Wed, 5 Jun 2019 23:29:49 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: References: <45D010BF-7654-43A6-8FF4-CCDEEF4004F6@yahoo.com> <4354EA25-69C2-4CAB-8273-62457333BD30@yahoo.com> <995DA649-9390-420B-AC95-FFD17079CDA9@yahoo.com> To: FreeBSD PowerPC ML , FreeBSD Toolchain X-Mailer: Apple Mail (2.3445.104.11) X-Rspamd-Queue-Id: 4E8B886F19 X-Spamd-Bar: / X-Spamd-Result: default: False [0.38 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36647, ipnet:98.137.64.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-0.54)[-0.535,0]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_SPAM_SHORT(0.90)[0.905,0]; NEURAL_HAM_LONG(-0.38)[-0.379,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(0.90)[ip: (2.85), ipnet: 98.137.64.0/21(0.96), asn: 36647(0.77), country: US(-0.06)]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[32.68.137.98.list.dnswl.org : 127.0.5.0] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Jun 2019 06:30:00 -0000 [I misanalysed the code. Sorry for the noise.] On 2019-Jun-5, at 14:17, Mark Millard wrote: > [This is from my experiments with more modern toolchains than > normally/offocially used, specifically for 32-bit powerpc this > time.] >=20 > On 2019-Jun-5, at 01:35, Mark Millard wrote: >=20 >> On 2019-Jun-3, at 19:40, Mark Millard wrote: >>=20 >>> On 2019-Jun-3, at 17:24, Mark Millard wrote: >>>=20 >>>> I tried (cross) building a 32-bit powerpc kernel and world = (non-debug)=20 >>>> with system-clang (on amd64) and use of devel/powerpc64-binutils . = The >>>> installed kernel panics trying to mount the root file system. >>>>=20 >>>> FYI: Typed from picture of screen . . . >>>>=20 >>>> Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs = [rw,noatime]... >>>> panic: getnewbuf_empty: Locked buf 0xd2800000 on free queue. >>>> . . . >>>> 0xd6919080: at kdb_backtrace+0x64 >>>> 0xd69190e0: at vpanic+0x200 >>>> 0xd6919150: at panic+0x50 >>>> 0xd6919190: at getnewbuf+0x594 >>>> 0xd69191f0: at getblkx+0x540 >>>> 0xd69192a0: at breadn_flags+0x90 >>>> 0xd69192f0: at ffs_use_bread+0x9c >>>> 0xd6919330: at readsuper+0x68 >>>> 0xd6919370: at ffs_sbget+0xcc >>>> 0xd69193c0: at ffs_mount+0x18b8 >>>> 0xd69194f0: at vfs_domount+0xa74 >>>> 0xd69196a0: at vfs_donmount+0x944 >>>> 0xd6919700: at kernel_mount+0x64 >>>> 0xd6919740: at parse_mount+0x52c >>>> 0xd6919840: at vfs_mountroot+0x71c >>>> 0xd69199b0: at start_init+0x44 >>>> 0xd6919a10: at fork_exit_0xcc >>>> 0xd6919a40: at fork_trampoline+0xc >>>> KDB: enter panic >>>> [ thread pid 1 tid 100002 ] >>>> Stopped at kdb_enter+0x74: addi r3,r0,0x0 >>>>=20 >>>> This reproduces with each boot attempt. >>>>=20 >>>> Replacing the kernel with one built via gcc 4.2.1 and booting >>>> the result does not panic. >>>>=20 >>>>=20 >>>> FYI for the context of the panic call: >>>>=20 >>>> /usr/src/sys/kern/vfs_bio.c : >>>>=20 >>>> static struct buf * >>>> buf_alloc(struct bufdomain *bd) >>>> { >>>> struct buf *bp; >>>> int freebufs; >>>>=20 >>>> /* >>>> * We can only run out of bufs in the buf zone if the average = buf >>>> * is less than BKVASIZE. In this case the actual wait/block = will >>>> * come from buf_reycle() failing to flush one of these small = bufs. >>>> */ >>>> bp =3D NULL; >>>> freebufs =3D atomic_fetchadd_int(&bd->bd_freebuffers, -1); >>>> if (freebufs > 0) >>>> bp =3D uma_zalloc(buf_zone, M_NOWAIT); >>>> if (bp =3D=3D NULL) { >>>> atomic_add_int(&bd->bd_freebuffers, 1); >>>> bufspace_daemon_wakeup(bd); >>>> counter_u64_add(numbufallocfails, 1); >>>> return (NULL); >>>> } >>>> /* >>>> * Wake-up the bufspace daemon on transition below threshold. >>>> */ >>>> if (freebufs =3D=3D bd->bd_lofreebuffers) >>>> bufspace_daemon_wakeup(bd); >>>>=20 >>>> if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) !=3D 0) >>>> panic("getnewbuf_empty: Locked buf %p on free queue.", = bp); >>>=20 >>>=20 >>> I tried making a debug kernel build via system-clang-8. It >>> reports differently but still during getnewbuf being active >>> on the stack (again typed from a picture): >>>=20 >>> Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs [rw,noatime]... >>> . . . (ignore witness/diagnostic warnings) . . . >>> panic: bq_remove: Locked buf 0xd2a00000 not on a queue. >>> . . . >>> 0xd6b7bfd0: at kdb_backtrace+0x64 >>> 0xd6b7c030: at vpanic+0x200 >>> 0xd6b7c0a0: at panic+0x50 >>> 0xd6b7c0e0: at bq_remove+01e0 >>> 0xd6b7c100: at buf_import+0x8c >>> 0xd6b7c130: at uma_zalloc_arg+0x544 >>> 0xd6b7c190: at getnewbuf+0x380 >>> 0xd6b7c1f0: at getblkx+0x620 >>> 0xd6b7c290: at breadn_flags+0x90 >>> 0xd6b7c2e0: at ffs_use_bread+0xa8 >>> 0xd6b7c320: at readsuper+0x68 >>> 0xd6b7c360: at ffs_sbget+0xcc >>> 0xd6b7c3b0: at ffs_mount+0xefc >>> 0xd6b7c4e0: at vfs_domount+0xa754 >>> 0xd6b7c690: at vfs_donmount+0x78c >>> 0xd6b7c6f0: at kernel_mount+0x7c >>> 0xd6b7c730: at parse_mount+0x52c >>> 0xd6b7c830: at vfs_mountroot+0x660 >>> 0xd6b7c9a0: at start_init+0x4c >>> 0xd6b7ca10: at fork_exit_0xb0 >>> 0xd6b7ca40: at fork_trampoline+0xc >>>=20 >>> /usr/src/sys/kern/vfs_bio.c : >>>=20 >>> static void >>> bq_remove(struct bufqueue *bq, struct buf *bp) >>> { >>>=20 >>> CTR3(KTR_BUF, "bq_remove(%p) vp %p flags %X", >>> bp, bp->b_vp, bp->b_flags); >>> KASSERT(bp->b_qindex !=3D QUEUE_NONE, >>> ("bq_remove: buffer %p not on a queue.", bp)); >>> . . . >>>=20 >>> For reference: >>>=20 >>> static int >>> buf_import(void *arg, void **store, int cnt, int domain, int flags) >>> { >>> struct buf *bp; >>> int i; >>>=20 >>> BQ_LOCK(&bqempty); >>> for (i =3D 0; i < cnt; i++) { >>> bp =3D TAILQ_FIRST(&bqempty.bq_queue); >>> if (bp =3D=3D NULL) >>> break; >>> bq_remove(&bqempty, bp); >>> store[i] =3D bp; >>> } >>> BQ_UNLOCK(&bqempty); >>>=20 >>> return (i); >>> } >>>=20 >>>=20 >>=20 >> I tried building the debug kernel with KTR for KTR_BUF. >> Installing and booting the result did not panic. Manually >> forcing getting to ddb> soon enough and doing "show ktr" >> did show a bq_remove for 0xd2a00000 (and later activity). >>=20 >> =46rom the looks of the KTR_BUF CTRn's, this suggests to me >> that the access to bp->qindex in bq_remove is racy in >> some way vs. updates to the value. >=20 > The code produced by clang for the debug kernel, KTR > off in this case, for: >=20 > KASSERT(bp->b_qindex !=3D QUEUE_NONE, > ("bq_remove: buffer %p not on a queue.", bp)); >=20 > is wrong [the 84(r29) accesses bp->b_qindex]: >=20 > . . . > 00618aa8 lbz r5,84(r29) > 00618aac cmplwi r5,4 > 00618ab0 bgt- 00618c10 > . . . > 00618c10 lwz r3,-32364(r30) > 00618c14 crclr 4*cr1+eq > 00618c18 mr r4,r29 > 00618c1c bl 00541ca0 > . . . >=20 > Comparing against 4 does not match any part of > bq_remove. Comparison via gt would make sense for: Wrong. The 4 and gt use comes from inlining bufqueue(bp) in the following KASSERT. For reference (from the .i): bufqueue(struct buf *bp) { switch (bp->b_qindex) { case 0: case 4: return (((void *)0)); case 1: return (&bqempty); case 2: return (&bufdomain(bp)->bd_dirtyq); case 3: return (&bufdomain(bp)->bd_subq[bp->b_subqueue]); default: break; } panic("bufqueue(%p): Unhandled type %d\n", bp, bp->b_qindex); } The code generation put the first KASSERT's related panic in to case 0 above when bufqueue was inlined. > /usr/src/sys/sys/buf.h: uint8_t b_qindex; /* (Q) buffer = queue index */) >=20 > if the comparison was against zero. It should > have been: >=20 > /usr/src/sys/kern/vfs_bio.c:#define QUEUE_NONE 0 /* on no queue = */ >=20 >=20 > This is for a head -r347549 32-bit powerpc FreeBSD context, > built with system clang (an amd6->powerpc cross build using > devel/powerpc64-binutils ). =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)