Date:      Wed, 27 Nov 2019 12:45:59 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        freebsd-arm@freebsd.org
Subject:   Re: After more than 59 hr 20 min of poudriere based port building, the Rock64 (4 GiByte) got a data_abort with a panic message that mentioned "vm_fault failed" (cmd in dma_done was NULL, making cmd->data fail).
Message-ID:  <AEA74194-7652-40F1-A340-3DD59A250C3D@yahoo.com>
In-Reply-To: <F337577B-3ED5-4B72-AB02-2FB10FDB7600@yahoo.com>
References:  <F337577B-3ED5-4B72-AB02-2FB10FDB7600.ref@yahoo.com> <F337577B-3ED5-4B72-AB02-2FB10FDB7600@yahoo.com>

On 2019-Nov-27, at 09:31, Mark Millard <marklmi at yahoo.com> wrote:

> The failure was while dwmmc_intr was active on the bus. It looks
> like the vm_fault failed address matches the elr value, which is
> near the lr value and near the "pc =" value listed for dwmmc_intr.
> (Back trace shown later.)

I should have mentioned that the system was running a non-debug
build (with symbols).

Looks like "cmd" was zero (NULL) in:

766     static int
767     dma_done(struct dwmmc_softc *sc, struct mmc_command *cmd)
768     {
769             struct mmc_data *data;
771             data = cmd->data;
   0xffff00000078e51c <+648>:     ldr     x8, [x23, #40]

for the use of dma_done in dwmmc_intr that is shown below:

. . .
        cmd = sc->curcmd;
. . .
        /* Ack interrupts */
        WRITE4(sc, SDMMC_RINTSTS, reg);

        if (sc->use_pio) {
                if (reg & (SDMMC_INTMASK_RXDR|SDMMC_INTMASK_DTO)) {
                        pio_read(sc, cmd);
                }
                if (reg & (SDMMC_INTMASK_TXDR|SDMMC_INTMASK_DTO)) {
                        pio_write(sc, cmd);
                }
        } else {
                /* Now handle DMA interrupts */
                reg = READ4(sc, SDMMC_IDSTS);
                if (reg) {
                        dprintf("dma intr 0x%08x\n", reg);
                        if (reg & (SDMMC_IDINTEN_TI | SDMMC_IDINTEN_RI)) {
                                WRITE4(sc, SDMMC_IDSTS, (SDMMC_IDINTEN_TI |
                                                         SDMMC_IDINTEN_RI));
                                WRITE4(sc, SDMMC_IDSTS, SDMMC_IDINTEN_NI);
                                dma_done(sc, cmd);
                        }
                }
        }
. . .

Unfortunately, I did not get a dump.

> This is a head -r355027 based context.
>
> This does not look easy to reproduce.
>
> I had poudriere running 4 jobs, each allowed to use 4 processes,
> so the bulk of the time the load average was between 8 and 17.
>
> The last top update (of my extended top) showed top never saw
> significant swap usage:
>
> Swap: 4608M Total, 22M Used, 4586M Free, 32M MaxObsUsed
>
> ("MaxObs" is short for "Maximum Observed".)
>
> It also showed (line wrapped by me):
>
> Mem: 196M Active, 1078M Inact, 4272K Laundry, 650M Wired, 264M Buf,
> 2035M Free, 2517M MaxObsActive, 805M MaxObsWired, 3219M MaxObs(Act+Wir)
>
> It showed as running:
>
> /usr/local/sbin/pkg-static create -r /wrkdirs/usr/ports/devel/llvm90/work/stage . . .
> (earlier llvm80 had completed fine)
>
> and 3 processes of the form:
>
> cpdup -i0 -x ref0?
>
> Those 3 seem to be for the 3 "Building"s listed below:
>
> [59:20:56] [02] [00:14:53] Finished devel/qt5-linguist | qt5-linguist-5.13.2: Success
> [59:20:57] [02] [00:00:00] Building deskutils/lumina-archiver | lumina-archiver-1.5.0
> [59:20:57] [03] [00:00:00] Building deskutils/lumina-calculator | lumina-calculator-1.5.0
> [59:20:57] [04] [00:00:00] Building x11/lumina-core | lumina-core-1.5.0
>
>
> The serial console's report was:
>=20
> Fatal data abort:
>  x0: fffffd0000b45b00
>  x1: ffff000040588000
>  x2:               8c
>  x3:              100
>  x4: ffff00004035caa0
>  x5: ffff00004035c7b0
>  x6:                0
>  x7:                1
>  x8: ffff000000758ebc
>  x9: ffff000000a33100
> x10: fffffd0000a28678
> x11:                0
> x12:         9633b10b
> x13:             2af8
> x14:             2777
> x15:             2af8
> x16:               38
> x17:               38
> x18: ffff00004035c870
> x19: fffffd0000a28600
> x20:               8c
> x21: fffffd0000b45e58
> x22: ffff000000a4b000
> x23:                0
> x24: fffffd0000b45e10
> x25: fffffd0000b89514
> x26: fffffd0000b8f180
> x27: fffffd0000b45e00
> x28: ffff000000a4bd98
> x29: ffff00004035c8b0
>  sp: ffff00004035c870
>  lr: ffff00000078e518
> elr: ffff00000078e51c
> spsr:              145
> far:               28
> esr:         96000005
> panic: vm_fault failed: ffff00000078e51c
> cpuid = 2
> time = 1574872496
> KDB: stack backtrace:
> db_trace_self() at db_trace_self_wrapper+0x28
>         pc = 0xffff00000075ba9c  lr = 0xffff0000001066a8
>         sp = 0xffff00004035c270  fp = 0xffff00004035c480
>
> db_trace_self_wrapper() at vpanic+0x18c
>         pc = 0xffff0000001066a8  lr = 0xffff00000041903c
>         sp = 0xffff00004035c490  fp = 0xffff00004035c530
>
> vpanic() at panic+0x44
>         pc = 0xffff00000041903c  lr = 0xffff000000418eac
>         sp = 0xffff00004035c540  fp = 0xffff00004035c5c0
>
> panic() at data_abort+0x1e0
>         pc = 0xffff000000418eac  lr = 0xffff000000777d94
>         sp = 0xffff00004035c5d0  fp = 0xffff00004035c680
>
> data_abort() at do_el1h_sync+0x144
>         pc = 0xffff000000777d94  lr = 0xffff000000776fb0
>         sp = 0xffff00004035c690  fp = 0xffff00004035c6c0
>
> do_el1h_sync() at handle_el1h_sync+0x78
>         pc = 0xffff000000776fb0  lr = 0xffff00000075e078
>         sp = 0xffff00004035c6d0  fp = 0xffff00004035c7e0
>
> handle_el1h_sync() at dwmmc_intr+0x280
>         pc = 0xffff00000075e078  lr = 0xffff00000078e514
>         sp = 0xffff00004035c7f0  fp = 0xffff00004035c8b0
>
> dwmmc_intr() at ithread_loop+0x1f4
>         pc = 0xffff00000078e514  lr = 0xffff0000003db604
>         sp = 0xffff00004035c8c0  fp = 0xffff00004035c940
>
> ithread_loop() at fork_exit+0x90
>         pc = 0xffff0000003db604  lr = 0xffff0000003d7be4
>         sp = 0xffff00004035c950  fp = 0xffff00004035c980
>
> fork_exit() at fork_trampoline+0x10
>         pc = 0xffff0000003d7be4  lr = 0xffff000000776cec
>         sp = 0xffff00004035c990  fp = 0x0000000000000000
>
> KDB: enter: panic
> [ thread pid 12 tid 100038 ]
> Stopped at      dwmmc_intr+0x288:       ldr     x8, [x23, #40]
> db>

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



