Date:      Wed, 5 Jun 2019 01:35:28 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: crash of 32-bit powerpc -r347549 kernel built via system-clang-8 (crash is while trying to mount the root file system)
Message-ID:  <995DA649-9390-420B-AC95-FFD17079CDA9@yahoo.com>
In-Reply-To: <4354EA25-69C2-4CAB-8273-62457333BD30@yahoo.com>
References:  <45D010BF-7654-43A6-8FF4-CCDEEF4004F6@yahoo.com> <4354EA25-69C2-4CAB-8273-62457333BD30@yahoo.com>

On 2019-Jun-3, at 19:40, Mark Millard <marklmi at yahoo.com> wrote:

> On 2019-Jun-3, at 17:24, Mark Millard <marklmi at yahoo.com> wrote:
>
>> I tried (cross) building a 32-bit powerpc kernel and world (non-debug)
>> with system-clang (on amd64) and use of devel/powerpc64-binutils. The
>> installed kernel panics trying to mount the root file system.
>>
>> FYI: Typed from picture of screen . . .
>>
>> Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs [rw,noatime]...
>> panic: getnewbuf_empty: Locked buf 0xd2800000 on free queue.
>> . . .
>> 0xd6919080: at kdb_backtrace+0x64
>> 0xd69190e0: at vpanic+0x200
>> 0xd6919150: at panic+0x50
>> 0xd6919190: at getnewbuf+0x594
>> 0xd69191f0: at getblkx+0x540
>> 0xd69192a0: at breadn_flags+0x90
>> 0xd69192f0: at ffs_use_bread+0x9c
>> 0xd6919330: at readsuper+0x68
>> 0xd6919370: at ffs_sbget+0xcc
>> 0xd69193c0: at ffs_mount+0x18b8
>> 0xd69194f0: at vfs_domount+0xa74
>> 0xd69196a0: at vfs_donmount+0x944
>> 0xd6919700: at kernel_mount+0x64
>> 0xd6919740: at parse_mount+0x52c
>> 0xd6919840: at vfs_mountroot+0x71c
>> 0xd69199b0: at start_init+0x44
>> 0xd6919a10: at fork_exit+0xcc
>> 0xd6919a40: at fork_trampoline+0xc
>> KDB: enter panic
>> [ thread pid 1 tid 100002 ]
>> Stopped at kdb_enter+0x74: addi r3,r0,0x0
>>
>> This reproduces with each boot attempt.
>>
>> Replacing the kernel with one built via gcc 4.2.1 and booting
>> the result does not panic.
>>
>>
>> FYI for the context of the panic call:
>>
>> /usr/src/sys/kern/vfs_bio.c :
>>
>> static struct buf *
>> buf_alloc(struct bufdomain *bd)
>> {
>>       struct buf *bp;
>>       int freebufs;
>>
>>       /*
>>        * We can only run out of bufs in the buf zone if the average buf
>>        * is less than BKVASIZE.  In this case the actual wait/block will
>>        * come from buf_reycle() failing to flush one of these small bufs.
>>        */
>>       bp = NULL;
>>       freebufs = atomic_fetchadd_int(&bd->bd_freebuffers, -1);
>>       if (freebufs > 0)
>>               bp = uma_zalloc(buf_zone, M_NOWAIT);
>>       if (bp == NULL) {
>>               atomic_add_int(&bd->bd_freebuffers, 1);
>>               bufspace_daemon_wakeup(bd);
>>               counter_u64_add(numbufallocfails, 1);
>>               return (NULL);
>>       }
>>       /*
>>        * Wake-up the bufspace daemon on transition below threshold.
>>        */
>>       if (freebufs == bd->bd_lofreebuffers)
>>               bufspace_daemon_wakeup(bd);
>>
>>       if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) != 0)
>>               panic("getnewbuf_empty: Locked buf %p on free queue.", bp);
>
>
> I tried making a debug kernel build via system-clang-8. It
> reports a different panic, but still with getnewbuf active
> on the stack (again typed from a picture):
>
> Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs [rw,noatime]...
> . . . (ignore witness/diagnostic warnings) . . .
> panic: bq_remove: Locked buf 0xd2a00000 not on a queue.
> . . .
> 0xd6b7bfd0: at kdb_backtrace+0x64
> 0xd6b7c030: at vpanic+0x200
> 0xd6b7c0a0: at panic+0x50
> 0xd6b7c0e0: at bq_remove+0x1e0
> 0xd6b7c100: at buf_import+0x8c
> 0xd6b7c130: at uma_zalloc_arg+0x544
> 0xd6b7c190: at getnewbuf+0x380
> 0xd6b7c1f0: at getblkx+0x620
> 0xd6b7c290: at breadn_flags+0x90
> 0xd6b7c2e0: at ffs_use_bread+0xa8
> 0xd6b7c320: at readsuper+0x68
> 0xd6b7c360: at ffs_sbget+0xcc
> 0xd6b7c3b0: at ffs_mount+0xefc
> 0xd6b7c4e0: at vfs_domount+0xa754
> 0xd6b7c690: at vfs_donmount+0x78c
> 0xd6b7c6f0: at kernel_mount+0x7c
> 0xd6b7c730: at parse_mount+0x52c
> 0xd6b7c830: at vfs_mountroot+0x660
> 0xd6b7c9a0: at start_init+0x4c
> 0xd6b7ca10: at fork_exit+0xb0
> 0xd6b7ca40: at fork_trampoline+0xc
>
> /usr/src/sys/kern/vfs_bio.c :
>
> static void
> bq_remove(struct bufqueue *bq, struct buf *bp)
> {
>
>        CTR3(KTR_BUF, "bq_remove(%p) vp %p flags %X",
>            bp, bp->b_vp, bp->b_flags);
>        KASSERT(bp->b_qindex != QUEUE_NONE,
>            ("bq_remove: buffer %p not on a queue.", bp));
> . . .
>
> For reference:
>=20
> static int
> buf_import(void *arg, void **store, int cnt, int domain, int flags)
> {
>        struct buf *bp;
>        int i;
>
>        BQ_LOCK(&bqempty);
>        for (i = 0; i < cnt; i++) {
>                bp = TAILQ_FIRST(&bqempty.bq_queue);
>                if (bp == NULL)
>                        break;
>                bq_remove(&bqempty, bp);
>                store[i] = bp;
>        }
>        BQ_UNLOCK(&bqempty);
>
>        return (i);
> }
>
>

I tried building the debug kernel with KTR enabled for KTR_BUF.
Installing and booting the result did not panic. Manually
breaking into ddb> soon enough and doing "show ktr" did show
a bq_remove for 0xd2a00000 (and later activity).

From the looks of the KTR_BUF CTRn's, this suggests to me
that the access to bp->b_qindex in bq_remove is racy in
some way vs. updates to the value.
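
For readers following the reasoning, the state that the debug panic
reports can be modeled in user space: buf_import takes a buffer from
the head of the empty queue, yet the check in bq_remove finds that
buffer's b_qindex saying it is on no queue. The sketch below is a
single-threaded illustration only, not the kernel code. The names
struct model_buf, struct model_queue, model_insert, model_remove_head
and the MODEL_* result codes are hypothetical, and the QUEUE_* values
are stand-ins. It does not reproduce a race; it only shows the list
linkage and the index field disagreeing, which is how a stale or
otherwise wrong read of b_qindex would appear to the check.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's queue indices (values are assumptions). */
enum { QUEUE_NONE = 0, QUEUE_EMPTY = 1 };

/* Hypothetical result codes for the model. */
enum { MODEL_OK = 0, MODEL_EMPTY = 1, MODEL_INCONSISTENT = 2 };

struct model_buf {
	int qindex;			/* models bp->b_qindex */
	struct model_buf *next;		/* simple singly linked list */
};

struct model_queue {
	struct model_buf *head;		/* models the bqempty queue head */
};

/* Link a buffer at the head and record its queue in qindex. */
static void
model_insert(struct model_queue *q, struct model_buf *bp)
{
	bp->qindex = QUEUE_EMPTY;
	bp->next = q->head;
	q->head = bp;
}

/*
 * Models buf_import taking the first buffer and bq_remove checking it.
 * Returns MODEL_INCONSISTENT where the kernel's KASSERT would fire:
 * the buffer is linked on the queue, but qindex reads QUEUE_NONE.
 * In that case the queue is left untouched.
 */
static int
model_remove_head(struct model_queue *q)
{
	struct model_buf *bp = q->head;

	if (bp == NULL)
		return (MODEL_EMPTY);
	if (bp->qindex == QUEUE_NONE)
		return (MODEL_INCONSISTENT);
	q->head = bp->next;
	bp->next = NULL;
	bp->qindex = QUEUE_NONE;
	return (MODEL_OK);
}
```

Inserting a buffer and then forcing its qindex back to QUEUE_NONE
while it is still linked makes model_remove_head return
MODEL_INCONSISTENT, the analogue of the KASSERT in bq_remove firing.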

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



