Date: Sat, 11 Dec 2021 16:19:15 -0800
From: Mark Millard via freebsd-arm <freebsd-arm@freebsd.org>
To: Free BSD <freebsd-arm@freebsd.org>
Cc: Kornel Dulęba <mindal@semihalf.com>, Emmanuel Vadot <manu@bidouilliste.com>
Subject: Re: Rock64 configuration fails to boot for main 22c4ab6cb015 but worked for main 06bd74e1e39c (Nov 21): e.MMC mishandled?
Message-ID: <7717F6CF-0239-4DC0-B23F-B9D5F75C0A8D@yahoo.com>
In-Reply-To: <CCB7E706-E866-4141-AB8F-BE7065376EAA@yahoo.com>
References: <243CBFC7-DFB5-4F8B-A8A3-CFF78455148D.ref@yahoo.com>
 <243CBFC7-DFB5-4F8B-A8A3-CFF78455148D@yahoo.com>
 <20211209081930.7970b6995a8f7c5f7466227d@bidouilliste.com>
 <053617FD-AA34-4A3F-853A-4D2E44F8254B@yahoo.com>
 <43901D57-9C39-4FAC-A2BE-CCE642791705@yahoo.com>
 <CAKpxNiwxvs7-%2BsNa1mX8rAUy_Bs4FdE1%2Bamf5hZXB9CehEJdwQ@mail.gmail.com>
 <8DAA50A1-3CF0-4AFA-9977-58FE15D4F171@yahoo.com>
 <CAKpxNiyzKF_JgMFEPK00jU=%2B9_qUq3Vg9KzSos8oCXNs2%2BPYyw@mail.gmail.com>
 <21B0478B-340F-4BB2-9189-B5A6AE458134@yahoo.com>
 <CCB7E706-E866-4141-AB8F-BE7065376EAA@yahoo.com>
[I've cut out the history: just presenting some new evidence.]

First, a little context from getting to the db> prompt.

db> ps
  pid  ppid  pgrp   uid  state   wmesg   wchan               cmd
   18     0     0     0  DL      syncer  0xffff000000eca5a8  [syncer]
   17     0     0     0  DL      vlruwt  0xffffa00007d2ea60  [vnlru]
   16     0     0     0  DL      (threaded)                  [bufdaemon]
100089                   D       qsleep  0xffff000000ec9478  [bufdaemon]
100092                   D       -       0xffff000000c11100  [bufspacedaemon-0]
100093                   D       -       0xffff000000c21680  [bufspacedaemon-1]
    9     0     0     0  DL      psleep  0xffff000000ef0650  [vmdaemon]
    8     0     0     0  DL      (threaded)                  [pagedaemon]
100087                   D       psleep  0xffff000000ee2b38  [dom0]
100094                   D       launds  0xffff000000ee2b44  [laundry: dom0]
100095                   D       umarcl  0xffff0000007b38d8  [uma]
    7     0     0     0  DL      mmcsd d 0xffffa00007b72e00  [mmcsd0boot1: mmc/sd]
    6     0     0     0  DL      mmcsd d 0xffffa00007b71300  [mmcsd0boot0: mmc/sd]
    5     0     0     0  DL      mmcreq  0xffff00009b5d0710  [mmcsd0: mmc/sd card]
    4     0     0     0  DL      -       0xffff000000ccc020  [rand_harvestq]
   15     0     0     0  DL      (threaded)                  [usb]
. . .

and "mmcreq" is from the while loop in:

static int
mmc_wait_for_req(struct mmc_softc *sc, struct mmc_request *req)
{

        req->done = mmc_wakeup;
        req->done_data = sc;
        if (__predict_false(mmc_debug > 1)) {
                device_printf(sc->dev, "REQUEST: CMD%d arg %#x flags %#x",
                    req->cmd->opcode, req->cmd->arg, req->cmd->flags);
                if (req->cmd->data) {
                        printf(" data %d\n", (int)req->cmd->data->len);
                } else
                        printf("\n");
        }
        MMCBR_REQUEST(device_get_parent(sc->dev), sc->dev, req);
        MMC_LOCK(sc);
        while ((req->flags & MMC_REQ_DONE) == 0)
                msleep(req, &sc->sc_mtx, 0, "mmcreq", 0);
        MMC_UNLOCK(sc);
        if (__predict_false(mmc_debug > 2 || (mmc_debug > 0 &&
            req->cmd->error != MMC_ERR_NONE)))
                device_printf(sc->dev, "CMD%d RESULT: %d\n",
                    req->cmd->opcode, req->cmd->error);
        return (0);
}

So it appears that the error report:

mmcsd0: Error indicated: 4 Failed

ends up associated with (req->flags & MMC_REQ_DONE) == 0
staying true in the above code: an unbounded loop with
MMC_LOCK(sc) active.
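
[Editorial illustration, not part of the original message.] The wait loop above passes 0 as the last (timeout) argument of msleep(9), which means "sleep until woken": if the completion callback never sets MMC_REQ_DONE, the thread never returns. A minimal sketch of the bounded alternative follows, written as a userland pthreads analogy so it can be compiled and run outside the kernel. It is not the FreeBSD mmc(4) code and not a proposed patch; `struct request`, `REQ_DONE`, `request_init`, `request_complete` and `request_wait` are all hypothetical names introduced only for this example.

```c
/* Userland analogy only; every name below is hypothetical. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

#define REQ_DONE 0x1            /* stands in for MMC_REQ_DONE */

struct request {
        pthread_mutex_t mtx;    /* stands in for sc->sc_mtx */
        pthread_cond_t  cv;     /* stands in for the sleep channel */
        int             flags;
};

static void
request_init(struct request *req)
{
        int rc;

        rc = pthread_mutex_init(&req->mtx, NULL);
        assert(rc == 0);
        rc = pthread_cond_init(&req->cv, NULL);
        assert(rc == 0);
        req->flags = 0;
}

/* Completion side: mark the request done and wake any waiter. */
static void
request_complete(struct request *req)
{
        pthread_mutex_lock(&req->mtx);
        req->flags |= REQ_DONE;
        pthread_cond_broadcast(&req->cv);
        pthread_mutex_unlock(&req->mtx);
}

/*
 * Wait for completion, but give up after timeout_ms milliseconds.
 * Returns 0 if the request completed, ETIMEDOUT otherwise.
 */
static int
request_wait(struct request *req, long timeout_ms)
{
        struct timespec deadline;
        int error = 0;

        /* Absolute deadline; default condvars use CLOCK_REALTIME. */
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += timeout_ms / 1000;
        deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
        if (deadline.tv_nsec >= 1000000000L) {
                deadline.tv_sec++;
                deadline.tv_nsec -= 1000000000L;
        }

        pthread_mutex_lock(&req->mtx);
        /* Loop handles spurious wakeups; the timeout bounds the wait. */
        while ((req->flags & REQ_DONE) == 0 && error == 0)
                error = pthread_cond_timedwait(&req->cv, &req->mtx,
                    &deadline);
        /* A completion that raced with the timeout still counts. */
        if ((req->flags & REQ_DONE) != 0)
                error = 0;
        pthread_mutex_unlock(&req->mtx);
        return (error);
}
```

A caller would run `request_init(&r)`, hand the request to whatever completes it, and treat a non-zero return from `request_wait(&r, ms)` as a failed request instead of waiting forever. In the kernel the analogous knob is the final argument of msleep(9), which takes a timeout in ticks and makes msleep return EWOULDBLOCK on expiry; whether a timeout is the right fix for this driver, and what value would be appropriate, is not something this sketch tries to answer.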

The "4" in the error report seems to be from:

#define MMC_ERR_FAILED  4

It looks like there are some problems with handling
errors, problems such that it gets stuck looping (no
panic, no progress). That seems to be separate from
why the MMC_ERR_FAILED was generated in the first
place. So: 2 problems, not just one.

Thus it may be a good context for tackling the
looping problem with a known example failure to look
at.

Just for reference, I tried "boot -v" with
debug.verbose_sysinit=1 in place, just to
capture and report the tail of the output for
the boot failure.

. . .
subsystem f000000
   release_aps(0)... Release APs...done
done.
   intr_irq_shuffle(0)... Trying to mount root from ufs:/dev/gpt/Rock64root []...
done.
   netisr_start(0)... done.
   taskqgroup_bind_softirq(0)... done.
GEOM: new disk mmcsd0
GEOM: new disk mmcsd0boot0
GEOM: new disk mmcsd0boot1
   smp_after_idle_runnable(0)... done.
   taskqgroup_bind_if_config_tqg(0)... done.
   taskqgroup_bind_if_io_tqg(0)... done.
   tmr_setup_user_access(0)... done.
subsystem f000001
mmcsd0: Error indicated: 4 Failed
   epoch_init_smp(0)... done.
subsystem f100000
   racctd_init(0)... done.
subsystem fffffff
   start_periodic_resettodr(0)... done.
   oktousecallout(0)... done.
   clknode_finish(0)... Unresolved linked clock found: hdmi_phy
Unresolved linked clock found: usb480m_phy
done.
   regulator_constraint(0)... done.
   regulator_shutdown(0)... regulator: shutting down unused regulators
regulator: shutting down vcc_sd... busy
done.
uhub0: 1 port with 1 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub3: 1 port with 1 removable, self powered
uhub1: 1 port with 1 removable, self powered
ugen4.2: <Samsung PSSD T7 Touch> at usbus4
umass0 on uhub2
umass0: <Samsung PSSD T7 Touch, class 0/0, rev 3.20/1.00, addr 1> on usbus4
umass0:  SCSI over Bulk-Only; quirks = 0x0000
umass0:0:0: Attached to scbus0
pass0 at umass-sim0 bus 0 scbus0 target 0 lun 0
pass0: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
pass0: Serial Number REPLACED
pass0: 400.000MB/s transfers
da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
da0: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
da0: Serial Number REPLACED
da0: 400.000MB/s transfers
da0: 953869MB (1953525168 512 byte sectors)
da0: quirks=0x2<NO_6_BYTE>
da0: Delete methods: <NONE(*),ZERO>
random: unblocking device.

No more output after that.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)