Date: Sat, 19 Sep 2020 12:18:50 -0700 From: Mark Millard <marklmi@yahoo.com> To: Hans Petter Selasky <hps@selasky.org> Cc: freebsd-arm <freebsd-arm@freebsd.org> Subject: Re: Comment #135 for bugzilla 237666 : a USB3-handling problem with a investigatory fix for a cortex-a72 context Message-ID: <565258A0-BEE1-48F8-9851-E6C7CF7ADAE7@yahoo.com> In-Reply-To: <723E6915-94F5-417C-B4AF-EEEBFBDF6162@yahoo.com> References: <6E618C3D-12DF-429E-A249-5BAB90FC6B15.ref@yahoo.com> <6E618C3D-12DF-429E-A249-5BAB90FC6B15@yahoo.com> <866bd652-5a8b-7500-b1e0-7528c035048b@selasky.org> <F9E8A6A0-619B-4C6F-8485-032B8358B780@yahoo.com> <578faf26-6042-91ec-c639-2bca17105fae@selasky.org> <723E6915-94F5-417C-B4AF-EEEBFBDF6162@yahoo.com>
next in thread | previous in thread | raw e-mail | index | archive | help
In the below I've cut out text without always indicating where so that code looks complete that is complete. I've also included some text from another reply of yours that fills in some specifics of what you were after. On 2020-Sep-19, at 04:46, Hans Petter Selasky <hps at selasky.org> = wrote: >=20 >>>> . . . >>>>=20 >>>> Your finding indicate a problem in usb_pc_cpu_flush() and >>>>=20 >>>> bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_PREWRITE); >>>>=20 >>>> Try to put the dsb only after dmamap_sync. >>>>=20 >>>> void >>>> usb_pc_cpu_flush(struct usb_page_cache *pc) >>>> { >>>> if (pc->page_offset_end =3D=3D pc->page_offset_buf) { >>>> /* nothing has been loaded into this page cache! */ >>>> return; >>>> } >>>> bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_PREWRITE); >>>> } >>>> . . . >>> After adding a couple of more DSB ST instructions and rebuilding >>> (cross build), installing, and booting, I started a self-hosted, >>> from-scratch, buildworld buildlkernel and intend to installkernel >>> installworld and reboot if it appears to go well. (Better than >>> just a boot test for if things seem at least sufficient, even if >>> overkill.) But it will be hours before buildworld build kernel is >>> done (-j3). (I'll sleep during much of that time.) The buildworld buildkernel, installkernel installworld, reboot sequence seems to have gone fine. >>> . . . >>> phwr =3D buf_res.buffer; >>> addr =3D buf_res.physaddr; >>> addr +=3D (uintptr_t)&((struct xhci_hw_root = *)0)->hwr_events[0]; >>> /* reset hardware root structure */ >>> memset(phwr, 0, sizeof(*phwr)); >>> phwr->hwr_ring_seg[0].qwEvrsTablePtr =3D = htole64(addr); >>> phwr->hwr_ring_seg[0].dwEvrsTableSize =3D = htole32(XHCI_MAX_EVENTS); >>> DPRINTF("ERDP(0)=3D0x%016llx\n", (unsigned long = long)addr); >>> #ifdef __aarch64__ >>> __asm __volatile("dsb st" : : : "memory"); >>> #endif >>> XWRITE4(sc, runt, XHCI_ERDP_LO(0), (uint32_t)addr); >>> XWRITE4(sc, runt, XHCI_ERDP_HI(0), (uint32_t)(addr >> 32)); >>> addr =3D buf_res.physaddr; >>> DPRINTF("ERSTBA(0)=3D0x%016llx\n", (unsigned long = long)addr); >>> XWRITE4(sc, runt, XHCI_ERSTBA_LO(0), (uint32_t)addr); >>> XWRITE4(sc, runt, XHCI_ERSTBA_HI(0), (uint32_t)(addr >> 32)); >> . . . >> Probably same bug in the USB invalidate function. >> On 2020-09-19 11:31, Mark Millard wrote: >>> #ifdef __aarch64__ >>> __asm __volatile("dsb ld" : : : "memory"); >>> #endif >>> temp =3D le32toh(phwr->hwr_events[i].dwTrb3); >>> =20 >>=20 >>=20 >> If you look a few lines up you'll find the: >>=20 >> usb_pc_cpu_invalidate(&sc->sc_hw.root_pc); >>=20 >> That's where this instruction belongs. >> Try to patch those generic flush/invalidate functions instead. I've = been very careful about those. >=20 > Sure. >=20 >>> temp =3D le32toh(phwr->hwr_events[i].dwTrb3); >>> k =3D (temp & XHCI_TRB_3_CYCLE_BIT) ? 1 : 0; >>> if (j !=3D k) >>> break; >>> always took the break because temp was always zero, k was always = zero, >>> and j was 1. I expect this is because the wrong qwEvrsTablePtr and >>> dwEvrsTableSize content was in use by the xhci (fixed by the DSB = ST). >>> . . . >>=20 >> It smells like a generic problem in the busdma synchronize code. >>=20 >>> (Pe means "executing processor".) >>> I expect the removal of that specific DSB ST to result in >>> xhci_interrupt_poll's code again taking the break every time, >>> just based on event ring behavior alone. >>=20 >> Again, try to patch both >> usb_pc_cpu_flush() >> and >> usb_pc_cpu_invalidate() >=20 > Sure. >=20 Unfortunately, it fails the same old way as far as what is seen on the console. So the steps used were: I reverted /usr/src/sys/dev/usb/controller/xhci.c . Then changed things so that: # svnlite diff /usr/src/sys/dev/usb/usb_busdma.c Index: /usr/src/sys/dev/usb/usb_busdma.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /usr/src/sys/dev/usb/usb_busdma.c (revision 363590) +++ /usr/src/sys/dev/usb/usb_busdma.c (working copy) @@ -737,6 +737,9 @@ */ bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_POSTREAD); bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_PREREAD); +#ifdef __aarch64__ +__asm __volatile("dsb ld" : : : "memory"); +#endif } =20 = /*------------------------------------------------------------------------= * @@ -750,6 +753,9 @@ return; } bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_PREWRITE); +#ifdef __aarch64__ +__asm __volatile("dsb st" : : : "memory"); +#endif } =20 = /*------------------------------------------------------------------------= * As I expected, the resulting build fails to boot the same old way: a sequence messages that repeats indefinitely and involves "xhci0: Resetting controller" and no root file system mount happens. So I put back the just last change that had previously lead to the successful booting (given the prior DSB ST / DSB LD additions). I did so exactly as I originally had it: # svnlite diff /usr/src/sys/dev/usb/controller/xhci.c Index: /usr/src/sys/dev/usb/controller/xhci.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /usr/src/sys/dev/usb/controller/xhci.c (revision 363590) +++ /usr/src/sys/dev/usb/controller/xhci.c (working copy) @@ -406,6 +406,9 @@ =20 addr =3D buf_res.physaddr; =20 +#ifdef __aarch64__ +__asm __volatile("dsb st" : : : "memory"); +#endif XWRITE4(sc, oper, XHCI_DCBAAP_LO, (uint32_t)addr); XWRITE4(sc, oper, XHCI_DCBAAP_HI, (uint32_t)(addr >> 32)); XWRITE4(sc, oper, XHCI_DCBAAP_LO, (uint32_t)addr); The result still fails. So more is required. I'll do some exploring to try to find a combination that has fewer DSB ST / DSB LD changes in usb/controller/xhci.c than my original A72 success, leaving the usb/usb_busdma.c patch in place. It may take a while for me to have a successful result to report. Side note: The from-scratch buildworld buildkernel reported: World built in 37469 seconds, ncpu: 4, make -j3 Kernel(s) GENERIC-NODBG built in 2474 seconds, ncpu: 4, make -j3 which totals to somewhat over 1hr 50min faster than the last time I did such a -j3 from-scratch rebuild on the same RPi4B: World built in 44034 seconds, ncpu: 4, make -j3 Kernel(s) GENERIC-NODBG built in 2895 seconds, ncpu: 4, make -j3 Both were based on a head -r363590 variation rebuilding itself but the newer build is based on everything being -mcpu=3Dcortex-a72 based already in the running system. (Same over_voltage=3D6 and arm_freq=3D2000 setting in use both times.) (I can cross build, which does not take long.) =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?565258A0-BEE1-48F8-9851-E6C7CF7ADAE7>