Date:      Sat, 11 Dec 2021 16:19:15 -0800
From:      Mark Millard via freebsd-arm <freebsd-arm@freebsd.org>
To:        Free BSD <freebsd-arm@freebsd.org>
Cc:        Kornel Dulęba <mindal@semihalf.com>, Emmanuel Vadot <manu@bidouilliste.com>
Subject:   Re: Rock64 configuration fails to boot for main 22c4ab6cb015 but worked for main 06bd74e1e39c (Nov 21): e.MMC mishandled?
Message-ID:  <7717F6CF-0239-4DC0-B23F-B9D5F75C0A8D@yahoo.com>
In-Reply-To: <CCB7E706-E866-4141-AB8F-BE7065376EAA@yahoo.com>
References:  <243CBFC7-DFB5-4F8B-A8A3-CFF78455148D.ref@yahoo.com> <243CBFC7-DFB5-4F8B-A8A3-CFF78455148D@yahoo.com> <20211209081930.7970b6995a8f7c5f7466227d@bidouilliste.com> <053617FD-AA34-4A3F-853A-4D2E44F8254B@yahoo.com> <43901D57-9C39-4FAC-A2BE-CCE642791705@yahoo.com> <CAKpxNiwxvs7-%2BsNa1mX8rAUy_Bs4FdE1%2Bamf5hZXB9CehEJdwQ@mail.gmail.com> <8DAA50A1-3CF0-4AFA-9977-58FE15D4F171@yahoo.com> <CAKpxNiyzKF_JgMFEPK00jU=%2B9_qUq3Vg9KzSos8oCXNs2%2BPYyw@mail.gmail.com> <21B0478B-340F-4BB2-9189-B5A6AE458134@yahoo.com> <CCB7E706-E866-4141-AB8F-BE7065376EAA@yahoo.com>

[I've cut out the history: just presenting some new evidence.]

First, a little context from getting to the db> prompt.

db> ps
  pid  ppid  pgrp   uid  state   wmesg   wchan               cmd
   18     0     0     0  DL      syncer  0xffff000000eca5a8  [syncer]
   17     0     0     0  DL      vlruwt  0xffffa00007d2ea60  [vnlru]
   16     0     0     0  DL      (threaded)                  [bufdaemon]
100089                   D       qsleep  0xffff000000ec9478  [bufdaemon]
100092                   D       -       0xffff000000c11100  [bufspacedaemon-0]
100093                   D       -       0xffff000000c21680  [bufspacedaemon-1]
    9     0     0     0  DL      psleep  0xffff000000ef0650  [vmdaemon]
    8     0     0     0  DL      (threaded)                  [pagedaemon]
100087                   D       psleep  0xffff000000ee2b38  [dom0]
100094                   D       launds  0xffff000000ee2b44  [laundry: dom0]
100095                   D       umarcl  0xffff0000007b38d8  [uma]
    7     0     0     0  DL      mmcsd d 0xffffa00007b72e00  [mmcsd0boot1: mmc/sd]
    6     0     0     0  DL      mmcsd d 0xffffa00007b71300  [mmcsd0boot0: mmc/sd]
    5     0     0     0  DL      mmcreq  0xffff00009b5d0710  [mmcsd0: mmc/sd card]
    4     0     0     0  DL      -       0xffff000000ccc020  [rand_harvestq]
   15     0     0     0  DL      (threaded)                  [usb]
. . .

and "mmcreq" is from the while loop in:

static int
mmc_wait_for_req(struct mmc_softc *sc, struct mmc_request *req)
{

        req->done = mmc_wakeup;
        req->done_data = sc;
        if (__predict_false(mmc_debug > 1)) {
                device_printf(sc->dev, "REQUEST: CMD%d arg %#x flags %#x",
                    req->cmd->opcode, req->cmd->arg, req->cmd->flags);
                if (req->cmd->data) {
                        printf(" data %d\n", (int)req->cmd->data->len);
                } else
                        printf("\n");
        }
        MMCBR_REQUEST(device_get_parent(sc->dev), sc->dev, req);
        MMC_LOCK(sc);
        while ((req->flags & MMC_REQ_DONE) == 0)
                msleep(req, &sc->sc_mtx, 0, "mmcreq", 0);
        MMC_UNLOCK(sc);
        if (__predict_false(mmc_debug > 2 || (mmc_debug > 0 &&
            req->cmd->error != MMC_ERR_NONE)))
                device_printf(sc->dev, "CMD%d RESULT: %d\n",
                    req->cmd->opcode, req->cmd->error);
        return (0);
}

So it appears that the error report:

mmcsd0: Error indicated: 4 Failed

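ends up associated with (req->flags & MMC_REQ_DONE) == 0 staying
true in the above code: an unbounded wait loop entered under
MMC_LOCK(sc). (msleep drops sc_mtx while asleep and retakes it on
wakeup, so the lock is not held across the sleep itself.)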
The "4" in the error report seems to be from:

#define MMC_ERR_FAILED  4
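
For orientation, here is a small standalone sketch of how a numeric
code could map to the text in that report line. Only the pairing of 4
with "Failed" is confirmed by the boot log in this message; the other
table entries are my assumption about the neighboring MMC_ERR_* values
and should be checked against sys/dev/mmc/mmcreg.h. The demo_* names
are hypothetical and are not the driver's own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Index == error code. Only index 4 ("Failed") is confirmed by the
 * boot log; the remaining strings are assumptions for illustration.
 */
static const char *const demo_mmc_errmsg[] = {
    "None", "Timeout", "Bad CRC", "Fifo", "Failed", "Invalid", "NO MEMORY"
};

/* Map an error code to its text, or "Unknown" when out of range. */
const char *
demo_mmc_err_name(int err)
{
    size_t n = sizeof(demo_mmc_errmsg) / sizeof(demo_mmc_errmsg[0]);

    if (err < 0 || (size_t)err >= n)
        return ("Unknown");
    return (demo_mmc_errmsg[err]);
}

/*
 * Build a report line in the shape seen in the boot log.
 * Returns what snprintf returns (length that would be written).
 */
int
demo_format_report(char *buf, size_t len, const char *dev, int err)
{
    assert(buf != NULL && len > 0);
    return (snprintf(buf, len, "%s: Error indicated: %d %s",
        dev, err, demo_mmc_err_name(err)));
}
```

Calling demo_format_report(buf, sizeof(buf), "mmcsd0", 4) fills buf
with the same text as the report line quoted above.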

It looks like error handling has a problem of its own: after the
failure the thread gets stuck in that loop (no panic, no progress).

That seems to be separate from why the MMC_ERR_FAILED was generated
in the first place. So: 2 problems, not just one. This may be a good
opportunity to tackle the looping problem, since there is now a known
example failure to look at.
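
To make the looping problem concrete, here is a userspace sketch of a
bounded version of that wait. It is not FreeBSD kernel code and not a
proposed patch: the demo_* names, the callback standing in for one
timed sleep, and the retry budget are all hypothetical. The one fact
it leans on is that msleep(9) takes a timeout as its last argument
(0 means none, as in the code above) and returns EWOULDBLOCK when that
timeout expires. Whether a timeout is the right fix for mmc is a
separate question.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_REQ_DONE 0x1u  /* stands in for MMC_REQ_DONE; value is illustrative */

struct demo_req {
    unsigned flags;  /* DEMO_REQ_DONE is set by the completion path */
    int error;       /* status left by the controller, e.g. 4 */
};

/*
 * Hypothetical stand-in for one timed sleep on the request:
 * returns 0 if woken up, EWOULDBLOCK if the timeout expired.
 */
typedef int (*demo_sleep_fn)(struct demo_req *req);

/*
 * Wait for DEMO_REQ_DONE, giving up after max_timeouts timed-out
 * sleeps. Returns 0 once the request is done, ETIMEDOUT otherwise.
 */
int
demo_wait_for_req(struct demo_req *req, demo_sleep_fn sleep_fn,
    int max_timeouts)
{
    int timeouts = 0;

    assert(req != NULL && sleep_fn != NULL && max_timeouts > 0);
    while ((req->flags & DEMO_REQ_DONE) == 0) {
        if (sleep_fn(req) == EWOULDBLOCK && ++timeouts >= max_timeouts)
            return (ETIMEDOUT);
    }
    return (0);
}

/* Models the stuck case: the completion wakeup never arrives. */
int
demo_sleep_never_woken(struct demo_req *req)
{
    (void)req;
    return (EWOULDBLOCK);
}

/* Models the normal case: the request completes (with error 4) during the sleep. */
int
demo_sleep_completes(struct demo_req *req)
{
    req->error = 4;
    req->flags |= DEMO_REQ_DONE;
    return (0);
}
```

With demo_sleep_never_woken the wait returns ETIMEDOUT once the budget
is used up instead of hanging; with demo_sleep_completes it returns 0
and the caller can still look at req->error.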



Just for reference, I tried "boot -v" with debug.verbose_sysinit=1 in place,
just to capture and report the tail of the output for the boot failure.

. . .
subsystem f000000
   release_aps(0)... Release APs...done
done.
   intr_irq_shuffle(0)... Trying to mount root from ufs:/dev/gpt/Rock64root []...
done.
   netisr_start(0)... done.
   taskqgroup_bind_softirq(0)... done.
GEOM: new disk mmcsd0
GEOM: new disk mmcsd0boot0
GEOM: new disk mmcsd0boot1
   smp_after_idle_runnable(0)... done.
   taskqgroup_bind_if_config_tqg(0)... done.
   taskqgroup_bind_if_io_tqg(0)... done.
   tmr_setup_user_access(0)... done.
subsystem f000001
mmcsd0: Error indicated: 4 Failed
   epoch_init_smp(0)... done.
subsystem f100000
   racctd_init(0)... done.
subsystem fffffff
   start_periodic_resettodr(0)... done.
   oktousecallout(0)... done.
   clknode_finish(0)... Unresolved linked clock found: hdmi_phy
Unresolved linked clock found: usb480m_phy
done.
   regulator_constraint(0)... done.
   regulator_shutdown(0)... regulator: shutting down unused regulators
regulator: shutting down vcc_sd... busy
done.
uhub0: 1 port with 1 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub3: 1 port with 1 removable, self powered
uhub1: 1 port with 1 removable, self powered
ugen4.2: <Samsung PSSD T7 Touch> at usbus4
umass0 on uhub2
umass0: <Samsung PSSD T7 Touch, class 0/0, rev 3.20/1.00, addr 1> on usbus4
umass0:  SCSI over Bulk-Only; quirks = 0x0000
umass0:0:0: Attached to scbus0
pass0 at umass-sim0 bus 0 scbus0 target 0 lun 0
pass0: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
pass0: Serial Number REPLACED
pass0: 400.000MB/s transfers
da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
da0: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
da0: Serial Number REPLACED
da0: 400.000MB/s transfers
da0: 953869MB (1953525168 512 byte sectors)
da0: quirks=0x2<NO_6_BYTE>
da0: Delete methods: <NONE(*),ZERO>
random: unblocking device.

No more output after that.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



