Date: Mon, 28 Sep 2020 19:04:48 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Robert Crowston <crowston@protonmail.com>, freebsd-arm <freebsd-arm@freebsd.org>
Subject: Re: RPi4B's DMA11 (DMA4 engine example) vs. xHCI/pcie
Message-ID: <0FE382AB-8DE3-4467-9CB0-E8582AC70EA2@yahoo.com>
In-Reply-To: <8C6DE44F-6CE2-4C74-8748-3BBFB54AE183@yahoo.com>
References: <8C6DE44F-6CE2-4C74-8748-3BBFB54AE183@yahoo.com>

On 2020-Sep-28, at 18:29, Mark Millard <marklmi@yahoo.com> wrote:

> [Be warned that the material is not familiar so I may need
> educating. This is based on the example context that I
> happen to have around.]
>
> In the u-boot fdt print / output there are 2 distinct sets of dma channel
> information, 1 for soc and 1 for scb, where the dma_tag values for the two
> sets should be distinct as far as I can tell:
>
> U-Boot> fdt address 0x7ef1000
> U-Boot> fdt print /
> / {
> . . .
>         soc {
>                 dma@7e007000 {
>                         compatible = "brcm,bcm2835-dma";
>                         reg = <0x7e007000 0x00000b00>;
>                         interrupts = * 0x0000000007ef645c [0x00000084];
>                         interrupt-names = "dma0", "dma1", "dma2", "dma3", "dma4", "dma5", "dma6", "dma7", "dma8", "dma9", "dma10";
>                         #dma-cells = <0x00000001>;
>                         brcm,dma-channel-mask = <0x000001f5>;
>                         phandle = <0x0000000b>;
>                 };
>
>         scb {
> . . .
>                 dma@7e007b00 {
>                         compatible = "brcm,bcm2711-dma";
>                         reg = <0x00000000 0x7e007b00 0x00000000 0x00000400>;
>                         interrupts = <0x00000000 0x00000059 0x00000004 0x00000000 0x0000005a 0x00000004 0x00000000 0x0000005b 0x00000004 0x00000000 0x0000005c 0x00000004>;
>                         interrupt-names = "dma11", "dma12", "dma13", "dma14";
>                         #dma-cells = <0x00000001>;
>                         brcm,dma-channel-mask = <0x00007000>;
>                         phandle = <0x0000003d>;
>                 };
> . . .
>
> So, 0 through 10 need the soc criteria (mix of DMA and DMA LITE engine criteria)
> and 11 through 14 need the scb criteria (DMA4 engine criteria). (I'm ignoring
> the dma-channel-mask values at this point.)
>
>
> I'll note here that the code has:
>
> #define BCM_DMA_CH_MAX 12
>
> for use in code like:
>
>         /* setup initial settings */
>         for (i = 0; i < BCM_DMA_CH_MAX; i++) {
>                 ch = &sc->sc_dma_ch[i];
>
>                 bzero(ch, sizeof(struct bcm_dma_ch));
>                 ch->ch = i;
>                 ch->flags = BCM_DMA_CH_UNMAP;
>
>                 if ((bcm_dma_channel_mask & (1 << i)) == 0)
>                         continue;
> . . .
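
As a quick check of which channel indices would pass the mask test in the quoted loop, here is a minimal standalone sketch. It assumes the two brcm,dma-channel-mask values from the fdt output are each tested on their own against the same BCM_DMA_CH_MAX bound; how the driver actually obtains bcm_dma_channel_mask is not shown in the quoted code, and channels_enabled() is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

#define BCM_DMA_CH_MAX 12	/* loop bound from the quoted driver code */

/* Mask values copied from the fdt output; testing each one
 * separately is an assumption made for this sketch. */
#define SOC_DMA_CHANNEL_MASK 0x000001f5u
#define SCB_DMA_CHANNEL_MASK 0x00007000u

/*
 * Hypothetical helper: return a bitmap of the channel indices that
 * survive the "(mask & (1 << i)) == 0 -> continue" test when the
 * loop runs over 0 .. BCM_DMA_CH_MAX-1.
 */
static uint32_t
channels_enabled(uint32_t mask)
{
	uint32_t enabled = 0;
	int i;

	assert(BCM_DMA_CH_MAX <= 32);
	for (i = 0; i < BCM_DMA_CH_MAX; i++) {
		if ((mask & (UINT32_C(1) << i)) == 0)
			continue;
		enabled |= UINT32_C(1) << i;
	}
	return (enabled);
}
```

Under those assumptions the soc mask lets indices 0, 2, 4, 5, 6, 7 and 8 through, while the scb mask (bits 12 through 14) lets nothing through for indices below 12, so index 11 is skipped.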
>
> It looks to me like, of the scb/DMA4 engines, only "dma11" is covered
> by such loops and that the "brcm,dma-channel-mask = <0x00007000>"
> means that dma11 will not be used.
>
> So: No scb/DMA4 engine will be used??? (That could explain the
> 1 GiByte limit?)
>
>
> rpi_DATA_2711_1p0.pdf reports that soc/0-10 have 2 types (0-6 vs. 7-10
> as it turns out) as well as the scb/DMA4-engines (11-14):
>
> QUOTE (with omitted marked by ". . .")
> . . .
> The BCM2711 DMA Controller provides a total of 16 DMA channels. Four of these are DMA Lite channels (with reduced performance and features), and four of them are DMA4 channels (with increased performance and a wider address range).
> . . .
> 4.5. DMA LITE Engines
>
> Several of the DMA engines are of the LITE design. This is a reduced specification engine designed to save space. The engine behaves in the same way as a normal DMA engine except for the following differences:
> . . .
> • The DMA length register is now 16 bits, limiting the maximum transferable length to 65536 bytes.
> . . .
> 4.6. DMA4 Engines
>
> Several of the DMA engines are of the DMA4 design. These have higher performance due to their uncoupled read/write design and can access up to 40 address bits. Unlike the other DMA engines they are also capable of performing write bursts. Note that they directly access the full 35-bit address bus of the BCM2711 and so bypass the paging registers of the DMA and DMA Lite engines.
>
> DMA channel 11 is additionally able to access the PCIe interface.
> END QUOTE
>
> The register map indicates (with some extra notes added):
>
> 0-6: DMA
> 7-10: DMA LITE (65536 bytes limit, for example)
> 11-14: DMA4 (11 is special relative to "PCIe interface")
> ("DMA Channel 15 is exclusively used by the VPU.")
>
> Yet what I see in the head -r365932 code is:
>
> #define BCM_DMA_CH_MAX 12
> . . .
> struct bcm_dma_softc {
>         device_t                sc_dev;
>         struct mtx              sc_mtx;
>         struct resource *       sc_mem;
>         struct resource *       sc_irq[BCM_DMA_CH_MAX];
>         void *                  sc_intrhand[BCM_DMA_CH_MAX];
>         struct bcm_dma_ch       sc_dma_ch[BCM_DMA_CH_MAX];
>         bus_dma_tag_t           sc_dma_tag;
> };
> . . .
>         err = bus_dma_tag_create(bus_get_dma_tag(dev),
>             1, 0, BUS_SPACE_MAXADDR_32BIT,
>             BUS_SPACE_MAXADDR, NULL, NULL,
>             sizeof(struct bcm_dma_cb), 1,
>             sizeof(struct bcm_dma_cb),
>             BUS_DMA_ALLOCNOW, NULL, NULL,
>             &sc->sc_dma_tag);
>
> As an example: does that deal with the likes of DMA LITE (so 7-10) "limiting
> the maximum transferable length to 65536 bytes"?
>
> As another example: Does it deal with the DMA4 (11-14) distinctions (if
> such were in use anyway)?
>
> For reference from the fdt print / :
>
> / {
> . . .
>         #address-cells = <0x00000002>;
>         #size-cells = <0x00000001>;
> . . .
>         soc {
>                 compatible = "simple-bus";
>                 #address-cells = <0x00000001>;
>                 #size-cells = <0x00000001>;
> . . .
>                 dma-ranges = <0xc0000000 0x00000000 0x00000000 0x40000000>;
> . . .
>                 firmware {
>                         compatible = "raspberrypi,bcm2835-firmware", "simple-bus";
>                         mboxes = <0x0000001c>;
>                         dma-ranges;
> . . .
>         emmc2bus {
>                 compatible = "simple-bus";
>                 #address-cells = <0x00000002>;
>                 #size-cells = <0x00000001>;
> . . .
>                 dma-ranges = <0x00000000 0xc0000000 0x00000000 0x00000000 0x40000000>;
> . . .
>         scb {
>                 compatible = "simple-bus";
>                 #address-cells = <0x00000002>;
>                 #size-cells = <0x00000002>;
> . . .
>                 dma-ranges = <0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0xfc000000 0x00000001 0x00000000 0x00000001 0x00000000 0x00000001 0x00000000>;
> . . .
>                 pcie@7d500000 {
>                         compatible = "brcm,bcm2711-pcie";
> . . .
>                         #address-cells = <0x00000003>;
> . . .
>                         #size-cells = <0x00000002>;
> . . .
>                         dma-ranges = <0x02000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0xc0000000>;
> . . .
>         v3dbus {
>                 compatible = "simple-bus";
>                 #address-cells = <0x00000001>;
>                 #size-cells = <0x00000002>;
> . . .
>                 dma-ranges = <0x00000000 0x00000000 0x00000000 0x00000004 0x00000000>;
> . . .

rpi_DATA_2711_1p0.pdf reports:
(I ignore 2D DMA transfer mode here.)

For DMA engines 0-6:
XLENGTH has bits 29:0
bits 31:30 are write as 0, read as do not care.
That would put maxsegsz as 2**30 == 1,073,741,824
which matches a 1 GiByte space.

For DMA LITE engines 7-10:
XLENGTH has bits 15:0
bits 31:16 are write as 0, read as do not care.
That would put maxsegsz as 2**16 == 65,536.

For DMA4 engines 11-14:
XLENGTH has bits 29:0
bits 31:30 are write as 0, read as do not care.
That would put maxsegsz as 2**30 == 1,073,741,824
which is smaller than the 3 GiByte space associated with xHCI.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
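
The XLENGTH arithmetic above can be written down as a small standalone sketch. The channel-to-engine grouping and the field widths are the ones read from the datasheet in this message, the 2**N convention for the limit follows the figures used above, and every name here (engine_for_channel(), xlength_bits(), max_segment_size()) is hypothetical rather than anything from the FreeBSD driver.

```c
#include <assert.h>
#include <stdint.h>

/* Engine types per the register map quoted in this message. */
enum engine { ENGINE_DMA, ENGINE_DMA_LITE, ENGINE_DMA4, ENGINE_NONE };

/* Hypothetical helper: map a channel number to its engine type. */
static enum engine
engine_for_channel(unsigned ch)
{
	if (ch <= 6)
		return (ENGINE_DMA);		/* 0-6 */
	if (ch <= 10)
		return (ENGINE_DMA_LITE);	/* 7-10 */
	if (ch <= 14)
		return (ENGINE_DMA4);		/* 11-14 */
	return (ENGINE_NONE);			/* 15 is VPU-only */
}

/* Width of the XLENGTH field, as read from the datasheet above. */
static unsigned
xlength_bits(enum engine e)
{
	switch (e) {
	case ENGINE_DMA:
		return (30);	/* bits 29:0 */
	case ENGINE_DMA_LITE:
		return (16);	/* bits 15:0 */
	case ENGINE_DMA4:
		return (30);	/* bits 29:0 */
	default:
		return (0);
	}
}

/* Candidate maxsegsz for a channel, using 2**bits as in the text;
 * returns 0 for a channel with no usable engine. */
static uint64_t
max_segment_size(unsigned ch)
{
	unsigned bits = xlength_bits(engine_for_channel(ch));

	assert(bits < 64);
	return (bits == 0 ? 0 : UINT64_C(1) << bits);
}
```

For example, max_segment_size(7) gives 65536 and max_segment_size(11) gives 1073741824, which is below the 0xc0000000 (3 GiByte) size in the pcie dma-ranges.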