Date: Mon, 12 Feb 2024 16:36:28 -0800 From: Mark Millard <marklmi@yahoo.com> To: John Baldwin <jhb@freebsd.org> Cc: FreeBSD ARM List <freebsd-arm@freebsd.org>, Current FreeBSD <freebsd-current@freebsd.org>, Warner Losh <imp@bsdimp.com> Subject: Re: Recent commits reject RPi4B booting: pcib0 vs. pcib1 "rman_manage_region: <pcib1 memory window> request" leads to panic Message-ID: <3A145420-399D-4EBD-9FF4-18924908AB1D@yahoo.com> In-Reply-To: <4C279710-5F88-4295-B1A4-7C395F3587E5@yahoo.com> References: <76AB969F-5BC5-4116-8AF4-3ED2CABEBBA5.ref@yahoo.com> <76AB969F-5BC5-4116-8AF4-3ED2CABEBBA5@yahoo.com> <f4326cda-c602-439f-ab03-f66b3bca5ff2@FreeBSD.org> <1F704317-FDB8-4BDA-8A67-61CF48794DFE@yahoo.com> <9AFDF067-96E4-4E67-90D2-F40DAF3F138F@yahoo.com> <4C279710-5F88-4295-B1A4-7C395F3587E5@yahoo.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Feb 12, 2024, at 16:10, Mark Millard <marklmi@yahoo.com> wrote: > On Feb 12, 2024, at 12:00, Mark Millard <marklmi@yahoo.com> wrote: >=20 >> [Gack: I was looking at the wrong vintage of source code, predating >> your changes: wrong system used.] >>=20 >>=20 >> On Feb 12, 2024, at 10:41, Mark Millard <marklmi@yahoo.com> wrote: >>=20 >>> On Feb 12, 2024, at 09:32, John Baldwin <jhb@freebsd.org> wrote: >>>=20 >>>> On 2/9/24 8:13 PM, Mark Millard wrote: >>>>> Summary: >>>>> pcib0: <BCM2838-compatible PCI-express controller> mem = 0x7d500000-0x7d50930f irq 80,81 on simplebus2 >>>>> pcib0: parsing FDT for ECAM0: >>>>> pcib0: PCI addr: 0xc0000000, CPU addr: 0x600000000, Size: = 0x40000000 >>>>> . . . >>>>> rman_manage_region: <pcib1 memory window> request: start = 0x600000000, end 0x6000fffff >>>>> panic: Failed to add resource to rman >>>>=20 >>>> Hmmm, I suspect this is due to the way that bus_translate_resource = works which is >>>> fundamentally broken. It rewrites the start address of a resource = in-situ instead >>>> of keeping downstream resources separate from the upstream = resources. For example, >>>> I don't see how you could ever release a resource in this design = without completely >>>> screwing up your rman. That is, I expect trying to detach a PCI = device behind a >>>> translating bridge that uses the current approach should corrupt = the allocated >>>> resource ranges in an rman long before my changes. >>>>=20 >>>> That said, that doesn't really explain the panic. Hmm, the panic = might be because >>>> for PCI bridge windows the driver now passes RF_ACTIVE and the = bus_translate_resource >>>> hack only kicks in the activate_resource method of = pci_host_generic.c. >>>>=20 >>>>> Detail: >>>>> . . . >>>>> pcib0: <BCM2838-compatible PCI-express controller> mem = 0x7d500000-0x7d50930f irq 80,81 on simplebus2 >>>>> pcib0: parsing FDT for ECAM0: >>>>> pcib0: PCI addr: 0xc0000000, CPU addr: 0x600000000, Size: = 0x40000000 >>>>=20 >>>> This indicates this is a translating bus. >>>>=20 >>>>> pcib1: <PCI-PCI bridge> irq 91 at device 0.0 on pci0 >>>>> rman_manage_region: <pcib1 bus numbers> request: start 0x1, end = 0x1 >>>>> pcib0: rman_reserve_resource: start=3D0xc0000000, end=3D0xc00fffff, = count=3D0x100000 >>>>> rman_reserve_resource_bound: <PCIe Memory> request: [0xc0000000, = 0xc00fffff], length 0x100000, flags 102, device pcib1 >>>>> rman_reserve_resource_bound: trying 0xffffffff = <0xc0000000,0xfffff> >>>>> considering [0xc0000000, 0xffffffff] >>>>> truncated region: [0xc0000000, 0xc00fffff]; size 0x100000 = (requested 0x100000) >>>>> candidate region: [0xc0000000, 0xc00fffff], size 0x100000 >>>>> allocating from the beginning >>>>> rman_manage_region: <pcib1 memory window> request: start = 0x600000000, end 0x6000fffff >=20 > What you later typed does not match: >=20 > 0x600000000 > 0x6000fffff >=20 > You later typed: >=20 > 0x60000000 > 0x600fffffff >=20 > This seems to have lead to some confusion from using the > wrong figure(s). >=20 >>>> The fact that we are trying to reserve the CPU addresses in the = rman is because >>>> bus_translate_resource rewrote the start address in the resource = after it was allocated. >>>>=20 >>>> That said, I can't see why rman_manage_region would actually fail. = At this point the >>>> rman is empty (this is the first call to rman_manage_region for = "pcib1 memory window"), >>>> so only the check that should be failing are the checks against = rm_start and >>>> rm_end. For the memory window, rm_start is always 0, and rm_end is = always >>>> 0xffffffff, so both the old (0xc00000000 - 0xc00fffff) and new = (0x60000000 - 0x600fffffff) >>>> ranges are within those bounds. >=20 > No: >=20 > 0xffffffff >=20 > .vs (actual): >=20 > 0x600000000 > 0x6000fffff >=20 > Both the actuals are larger then the 0xffffffff figure you list above. >=20 > I've no clue if the 0xffffffff might have its own typos. >=20 >>>> I would instead expect to see some other issue later >>>> on where we fail to allocate a resource for a child BAR, but I = wouldn't expect >>>> rman_manage_region to fail. >>>>=20 >>>> Logging the return value from rman_manage_region would be the first = step I think >>>> to see which error value it is returning. >>>=20 >>> Looking at the code in sys/kern/subr_rman.c for rman_manage_region I = see >>> the (mostly) return rv related logic only has ENONMEM (explicit = return) and >>> EBUSY as non-0 possibilities (no other returns). >>=20 >> The modern code also has EINVAL via an explicit return. >>=20 >>> All the rv references and >>> all the returns are shown below: >>>=20 >>> int rv =3D 0; >>> . . . >>=20 >> The modern code also has here: >>=20 >> DPRINTF(("rman_manage_region: <%s> request: start %#jx, end = %#jx\n", >> rm->rm_descr, start, end)); >> if (start < rm->rm_start || end > rm->rm_end) >> return EINVAL; >>=20 >> Adding rm->rm_start and rm->rm_end to the message might be >> appropriate --and would also be enough additional information >> to know if EINVAL would be returned. >=20 > I added such code and built, installed, and tried booting > a separate kernel. The result was: >=20 > rman_manage_region: <pcib1 memory window> range 0..0xffffffff : = request: start 0x600000000, end 0x6000fffff > panic: Failed to add resource to rman >=20 > So: >=20 > 0 > vs. start: > 0x600000000 >=20 > and: >=20 > 0xffffffff > vs. end: > 0x6000fffff >=20 > The 0x600000000..0x6000fffff range is not a range of RAM addresses > [too large for that for the 8 GiByte RPi4B] and, so, is well above > the 32 bit boundary too. >=20 > That is the only line referencing "memory window". A 32 bit initial = range > for some reason. For reference, the source change in question was: >=20 > - DPRINTF(("rman_manage_region: <%s> request: start %#jx, end = %#jx\n", > - rm->rm_descr, start, end)); > + DPRINTF(("rman_manage_region: <%s> range %#jx..%#jx : request: = start %#jx, end %#jx\n", > + rm->rm_descr, rm->rm_start, rm->rm_end, start, end)); > if (start < rm->rm_start || end > rm->rm_end) > return EINVAL; >=20 > I'll show a larger boot output context at the bottom of the message > for reference. >=20 >>> r =3D int_alloc_resource(M_NOWAIT); >>> if (r =3D=3D NULL) >>> return ENOMEM; >>> . . . >>> /* Check for any overlap with the current region. */ >>> if (r->r_start <=3D s->r_end && r->r_end >=3D = s->r_start) { >>> rv =3D EBUSY; >>> goto out; >>> } >>>=20 >>> /* Check for any overlap with the next region. */ >>> t =3D TAILQ_NEXT(s, r_link); >>> if (t && r->r_start <=3D t->r_end && r->r_end >=3D = t->r_start) { >>> rv =3D EBUSY; >>> goto out; >>> } >>> . . . >>> out: >>> mtx_unlock(rm->rm_mtx); >>> return rv; >>>=20 >>> int_alloc_resource failure would be failure (NULL) from the: >>>=20 >>> struct resource_i *r; >>>=20 >>> r =3D malloc(sizeof *r, M_RMAN, malloc_flag | M_ZERO); >>>=20 >>> (associated with the M_NOWAIT argument). >>>=20 >>> The malloc failure would likely go in a very different direction. >>>=20 >>>=20 >>>=20 >>> Side note: looks like the EBUSY cases leak what r references. >>=20 >> Still true in the newer code. >>=20 >>>> Probably I should fix pci_host_generic.c to handle translation = properly however. >>>> I can work on a patch for that. >>>=20 >=20 > For reference: >=20 > . . . > pcib0: <BCM2838-compatible PCI-express controller> mem = 0x7d500000-0x7d50930f irq 80,81 on simplebus2 > pcib0: parsing FDT for ECAM0: > pcib0: PCI addr: 0xc0000000, CPU addr: 0x600000000, Size: 0x40000000 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0 > pcib0: Bus is not cache-coherent > rman_reserve_resource_bound: <I/O memory addresses> request: = [0xfd500000, 0xfd50930f], length 0x9310, flags 100, device pcib0 > rman_reserve_resource_bound: trying 0x1fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x1fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x31bfffff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x33298fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x39bf1fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x39c02fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x39c06fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x39c07fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x39c08fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x39c2afff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x39c36fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x39c37fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x3b03ffff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x3b04ffff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x3b2fffff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x3ee61fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x3ee63fff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0x3fffffff <0xfd500000,0x930f> > rman_reserve_resource_bound: tried 0xfbffffff <0xfd500000,0x930f> > considering [0xfc000000, 0xfd5d1fff] > truncated region: [0xfd500000, 0xfd50930f]; size 0x9310 (requested = 0x9310) > candidate region: [0xfd500000, 0xfd50930f], size 0x9310 > splitting region in three parts: [0xfc000000, 0xfd4fffff]; = [0xfd500000, 0xfd50930f]; [0xfd509310, 0xfd5d1fff] > rman_manage_region: <PCIe Memory> range 0..0xffffffffffffffff : = request: start 0xc0000000, end 0xffffffff > pcib0: hardware identifies as revision 0x304. > pcib0: note: reported link speed is 5.0 GT/s. > rman_reserve_resource_bound: <Interrupts> request: [0x51, 0x51], = length 0x1, flags 0, device pcib0 > rman_reserve_resource_bound: trying 0 <0x51,0> > rman_reserve_resource_bound: tried 0 <0x51,0> > rman_reserve_resource_bound: tried 0x1 <0x51,0> > rman_reserve_resource_bound: tried 0x2 <0x51,0> > rman_reserve_resource_bound: tried 0x3 <0x51,0> > rman_reserve_resource_bound: tried 0x4 <0x51,0> > rman_reserve_resource_bound: tried 0x5 <0x51,0> > rman_reserve_resource_bound: tried 0x6 <0x51,0> > rman_reserve_resource_bound: tried 0x7 <0x51,0> > rman_reserve_resource_bound: tried 0xc <0x51,0> > rman_reserve_resource_bound: tried 0xd <0x51,0> > rman_reserve_resource_bound: tried 0xe <0x51,0> > rman_reserve_resource_bound: tried 0xf <0x51,0> > rman_reserve_resource_bound: tried 0x10 <0x51,0> > rman_reserve_resource_bound: tried 0x11 <0x51,0> > rman_reserve_resource_bound: tried 0x12 <0x51,0> > rman_reserve_resource_bound: tried 0x17 <0x51,0> > rman_reserve_resource_bound: tried 0x18 <0x51,0> > rman_reserve_resource_bound: tried 0x1a <0x51,0> > rman_reserve_resource_bound: tried 0x1b <0x51,0> > rman_reserve_resource_bound: tried 0x22 <0x51,0> > rman_reserve_resource_bound: tried 0x23 <0x51,0> > rman_reserve_resource_bound: tried 0x24 <0x51,0> > rman_reserve_resource_bound: tried 0x25 <0x51,0> > rman_reserve_resource_bound: tried 0x26 <0x51,0> > rman_reserve_resource_bound: tried 0x27 <0x51,0> > rman_reserve_resource_bound: tried 0x28 <0x51,0> > rman_reserve_resource_bound: tried 0x29 <0x51,0> > rman_reserve_resource_bound: tried 0x4e <0x51,0> > rman_reserve_resource_bound: tried 0x4f <0x51,0> > considering [0x50, 0xffffffffffffffff] > truncated region: [0x51, 0x51]; size 0x1 (requested 0x1) > candidate region: [0x51, 0x51], size 0x1 > splitting region in three parts: [0x50, 0x50]; [0x51, 0x51]; [0x52, = 0xffffffffffffffff] > pci0: <OFW PCI bus> on pcib0 > rman_manage_region: <PCI domain 0 bus numbers> range 0..0xff : = request: start 0, end 0xff > rman_reserve_resource_bound: <PCI domain 0 bus numbers> request: [0, = 0], length 0x1, flags 0, device pci0 > rman_reserve_resource_bound: trying 0xff <0,0> > considering [0, 0xff] > truncated region: [0, 0]; size 0x1 (requested 0x1) > candidate region: [0, 0], size 0x1 > allocating from the beginning > pci0: domain=3D0, physical bus=3D0 > found-> vendor=3D0x14e4, dev=3D0x2711, revid=3D0x00 > domain=3D0, bus=3D0, slot=3D0, func=3D0 > class=3D06-04-00, hdrtype=3D0x01, mfdev=3D0 > cmdreg=3D0x0000, statreg=3D0x0010, cachelnsz=3D0 (dwords) > lattimer=3D0x00 (0 ns), mingnt=3D0x00 (0 ns), maxlat=3D0x00 (0 = ns) > intpin=3Da, irq=3D0 > powerspec 3 supports D0 D3 current D0 > secbus=3D1, subbus=3D1 > rman_reserve_resource_bound: <PCI domain 0 bus numbers> request: [0x1, = 0x1], length 0x1, flags 0, device (null) > rman_reserve_resource_bound: trying 0 <0x1,0> > rman_reserve_resource_bound: tried 0 <0x1,0> > considering [0x1, 0xff] > truncated region: [0x1, 0x1]; size 0x1 (requested 0x1) > candidate region: [0x1, 0x1], size 0x1 > allocating from the beginning > pcib1: <PCI-PCI bridge> irq 91 at device 0.0 on pci0 > rman_manage_region: <pcib1 bus numbers> range 0..0xff : request: start = 0x1, end 0x1 > pcib0: rman_reserve_resource: start=3D0xc0000000, end=3D0xc00fffff, = count=3D0x100000 > rman_reserve_resource_bound: <PCIe Memory> request: [0xc0000000, = 0xc00fffff], length 0x100000, flags 102, device pcib1 > rman_reserve_resource_bound: trying 0xffffffff <0xc0000000,0xfffff> > considering [0xc0000000, 0xffffffff] > truncated region: [0xc0000000, 0xc00fffff]; size 0x100000 (requested = 0x100000) > candidate region: [0xc0000000, 0xc00fffff], size 0x100000 > allocating from the beginning > rman_manage_region: <pcib1 memory window> range 0..0xffffffff : = request: start 0x600000000, end 0x6000fffff > panic: Failed to add resource to rman > cpuid =3D 0 > time =3D 1 > KDB: stack backtrace: > db_trace_self() at db_trace_self > db_trace_self_wrapper() at db_trace_self_wrapper+0x38 > vpanic() at vpanic+0x1a4 > panic() at panic+0x48 > pcib_add_window_resources() at pcib_add_window_resources+0xf4 > pcib_alloc_window() at pcib_alloc_window+0xfc > pcib_attach_common() at pcib_attach_common+0xa18 > pcib_attach() at pcib_attach+0x14 > device_attach() at device_attach+0x3fc > device_probe_and_attach() at device_probe_and_attach+0x80 > bus_generic_attach() at bus_generic_attach+0x1c > pci_attach() at pci_attach+0xec > device_attach() at device_attach+0x3fc > device_probe_and_attach() at device_probe_and_attach+0x80 > bus_generic_attach() at bus_generic_attach+0x1c > device_attach() at device_attach+0x3fc > device_probe_and_attach() at device_probe_and_attach+0x80 > bus_generic_new_pass() at bus_generic_new_pass+0x100 > bus_generic_new_pass() at bus_generic_new_pass+0xb0 > bus_generic_new_pass() at bus_generic_new_pass+0xb0 > bus_generic_new_pass() at bus_generic_new_pass+0xb0 > bus_set_pass() at bus_set_pass+0x50 > mi_startup() at mi_startup+0x1e0 > virtdone() at virtdone+0x68 > KDB: enter: panic > [ thread pid 0 tid 100000 ] > Stopped at kdb_enter+0x4c: str xzr, [x19, #1280] > db>=20 >=20 FYI: bcm2711-peripherals.pdf "Figure 1. BCM2711 Address Maps" documents the PCIe address space range as (their "_" notation): 0x6_0000_0000..0x7_FFFF_FFFF for both of the address maps: 'Full 35-bit Address Map' 'ARM view of the Address Map in "Low Peripheral" mode' The Low Peripheral mode is what is in use. So the 0x6_0000_0000..0x6_000f_ffff range looks to be a valid ARM system address space for the PCIe to map to. =3D=3D=3D Mark Millard marklmi at yahoo.com
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3A145420-399D-4EBD-9FF4-18924908AB1D>