Date:      Wed, 5 Jun 2019 14:17:39 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Subject:   Re: crash of 32-bit powerpc -r347549 kernel built via system-clang-8 (crash is while trying to mount the root file system) [debug kernel case: code generation error]
Message-ID:  <DD895640-0487-45F5-9D88-C0CD3CD7CF9D@yahoo.com>
In-Reply-To: <995DA649-9390-420B-AC95-FFD17079CDA9@yahoo.com>
References:  <45D010BF-7654-43A6-8FF4-CCDEEF4004F6@yahoo.com> <4354EA25-69C2-4CAB-8273-62457333BD30@yahoo.com> <995DA649-9390-420B-AC95-FFD17079CDA9@yahoo.com>

[This is from my experiments with more modern toolchains than
normally/officially used, specifically for 32-bit powerpc this
time.]

On 2019-Jun-5, at 01:35, Mark Millard <marklmi at yahoo.com> wrote:

> On 2019-Jun-3, at 19:40, Mark Millard <marklmi at yahoo.com> wrote:
>
>> On 2019-Jun-3, at 17:24, Mark Millard <marklmi at yahoo.com> wrote:
>>
>>> I tried (cross) building a 32-bit powerpc kernel and world (non-debug)
>>> with system-clang (on amd64) and use of devel/powerpc64-binutils . The
>>> installed kernel panics trying to mount the root file system.
>>>
>>> FYI: Typed from picture of screen . . .
>>>
>>> Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs [rw,noatime]...
>>> panic: getnewbuf_empty: Locked buf 0xd2800000 on free queue.
>>> . . .
>>> 0xd6919080: at kdb_backtrace+0x64
>>> 0xd69190e0: at vpanic+0x200
>>> 0xd6919150: at panic+0x50
>>> 0xd6919190: at getnewbuf+0x594
>>> 0xd69191f0: at getblkx+0x540
>>> 0xd69192a0: at breadn_flags+0x90
>>> 0xd69192f0: at ffs_use_bread+0x9c
>>> 0xd6919330: at readsuper+0x68
>>> 0xd6919370: at ffs_sbget+0xcc
>>> 0xd69193c0: at ffs_mount+0x18b8
>>> 0xd69194f0: at vfs_domount+0xa74
>>> 0xd69196a0: at vfs_donmount+0x944
>>> 0xd6919700: at kernel_mount+0x64
>>> 0xd6919740: at parse_mount+0x52c
>>> 0xd6919840: at vfs_mountroot+0x71c
>>> 0xd69199b0: at start_init+0x44
>>> 0xd6919a10: at fork_exit+0xcc
>>> 0xd6919a40: at fork_trampoline+0xc
>>> KDB: enter panic
>>> [ thread pid 1 tid 100002 ]
>>> Stopped at kdb_enter+0x74: addi r3,r0,0x0
>>>
>>> This reproduces with each boot attempt.
>>>
>>> Replacing the kernel with one built via gcc 4.2.1 and booting
>>> the result does not panic.
>>>
>>>
>>> FYI for the context of the panic call:
>>>
>>> /usr/src/sys/kern/vfs_bio.c :
>>>
>>> static struct buf *
>>> buf_alloc(struct bufdomain *bd)
>>> {
>>>      struct buf *bp;
>>>      int freebufs;
>>>
>>>      /*
>>>       * We can only run out of bufs in the buf zone if the average buf
>>>       * is less than BKVASIZE.  In this case the actual wait/block will
>>>       * come from buf_reycle() failing to flush one of these small bufs.
>>>       */
>>>      bp = NULL;
>>>      freebufs = atomic_fetchadd_int(&bd->bd_freebuffers, -1);
>>>      if (freebufs > 0)
>>>              bp = uma_zalloc(buf_zone, M_NOWAIT);
>>>      if (bp == NULL) {
>>>              atomic_add_int(&bd->bd_freebuffers, 1);
>>>              bufspace_daemon_wakeup(bd);
>>>              counter_u64_add(numbufallocfails, 1);
>>>              return (NULL);
>>>      }
>>>      /*
>>>       * Wake-up the bufspace daemon on transition below threshold.
>>>       */
>>>      if (freebufs == bd->bd_lofreebuffers)
>>>              bufspace_daemon_wakeup(bd);
>>>
>>>      if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) != 0)
>>>              panic("getnewbuf_empty: Locked buf %p on free queue.", bp);
>>
>>
>> I tried making a debug kernel build via system-clang-8. It
>> reports differently but still during getnewbuf being active
>> on the stack (again typed from a picture):
>>
>> Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs [rw,noatime]...
>> . . . (ignore witness/diagnostic warnings) . . .
>> panic: bq_remove: Locked buf 0xd2a00000 not on a queue.
>> . . .
>> 0xd6b7bfd0: at kdb_backtrace+0x64
>> 0xd6b7c030: at vpanic+0x200
>> 0xd6b7c0a0: at panic+0x50
>> 0xd6b7c0e0: at bq_remove+0x1e0
>> 0xd6b7c100: at buf_import+0x8c
>> 0xd6b7c130: at uma_zalloc_arg+0x544
>> 0xd6b7c190: at getnewbuf+0x380
>> 0xd6b7c1f0: at getblkx+0x620
>> 0xd6b7c290: at breadn_flags+0x90
>> 0xd6b7c2e0: at ffs_use_bread+0xa8
>> 0xd6b7c320: at readsuper+0x68
>> 0xd6b7c360: at ffs_sbget+0xcc
>> 0xd6b7c3b0: at ffs_mount+0xefc
>> 0xd6b7c4e0: at vfs_domount+0xa754
>> 0xd6b7c690: at vfs_donmount+0x78c
>> 0xd6b7c6f0: at kernel_mount+0x7c
>> 0xd6b7c730: at parse_mount+0x52c
>> 0xd6b7c830: at vfs_mountroot+0x660
>> 0xd6b7c9a0: at start_init+0x4c
>> 0xd6b7ca10: at fork_exit+0xb0
>> 0xd6b7ca40: at fork_trampoline+0xc
>>
>> /usr/src/sys/kern/vfs_bio.c :
>>
>> static void
>> bq_remove(struct bufqueue *bq, struct buf *bp)
>> {
>>
>>       CTR3(KTR_BUF, "bq_remove(%p) vp %p flags %X",
>>           bp, bp->b_vp, bp->b_flags);
>>       KASSERT(bp->b_qindex != QUEUE_NONE,
>>           ("bq_remove: buffer %p not on a queue.", bp));
>> . . .
>>
>> For reference:
>>
>> static int
>> buf_import(void *arg, void **store, int cnt, int domain, int flags)
>> {
>>       struct buf *bp;
>>       int i;
>>
>>       BQ_LOCK(&bqempty);
>>       for (i = 0; i < cnt; i++) {
>>               bp = TAILQ_FIRST(&bqempty.bq_queue);
>>               if (bp == NULL)
>>                       break;
>>               bq_remove(&bqempty, bp);
>>               store[i] = bp;
>>       }
>>       BQ_UNLOCK(&bqempty);
>>
>>       return (i);
>> }
>>
>>
>
> I tried building the debug kernel with KTR for KTR_BUF.
> Installing and booting the result did not panic. Manually
> forcing getting to ddb> soon enough and doing "show ktr"
> did show a bq_remove for 0xd2a00000 (and later activity).
>
> From the looks of the KTR_BUF CTRn's, this suggests to me
> that the access to bp->b_qindex in bq_remove is racy in
> some way vs. updates to the value.

The code produced by clang for the debug kernel, KTR
off in this case, for:

      KASSERT(bp->b_qindex != QUEUE_NONE,
          ("bq_remove: buffer %p not on a queue.", bp));

is wrong [the 84(r29) accesses bp->b_qindex]:

. . .
00618aa8 <bq_remove+0x34> lbz     r5,84(r29)
00618aac <bq_remove+0x38> cmplwi  r5,4
00618ab0 <bq_remove+0x3c> bgt-    00618c10 <bq_remove+0x19c>
. . .
00618c10 <bq_remove+0x19c> lwz     r3,-32364(r30)
00618c14 <bq_remove+0x1a0> crclr   4*cr1+eq
00618c18 <bq_remove+0x1a4> mr      r4,r29
00618c1c <bq_remove+0x1a8> bl      00541ca0 <panic>
. . .

Comparing against 4 does not match any part of the source
for bq_remove. Comparison via gt would make sense for:

/usr/src/sys/sys/buf.h: uint8_t         b_qindex;       /* (Q) buffer queue index */

if the comparison were against zero. It should
have been, based on:

/usr/src/sys/kern/vfs_bio.c:#define QUEUE_NONE  0       /* on no queue */


This is for a head -r347549 32-bit powerpc FreeBSD context,
built with system clang (an amd64->powerpc cross build using
devel/powerpc64-binutils ).



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



