Date:      Mon, 28 Sep 2020 19:04:48 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Robert Crowston <crowston@protonmail.com>, freebsd-arm <freebsd-arm@freebsd.org>
Subject:   Re: RPi4B's DMA11 (DMA4 engine example) vs. xHCI/pcie
Message-ID:  <0FE382AB-8DE3-4467-9CB0-E8582AC70EA2@yahoo.com>
In-Reply-To: <8C6DE44F-6CE2-4C74-8748-3BBFB54AE183@yahoo.com>
References:  <8C6DE44F-6CE2-4C74-8748-3BBFB54AE183@yahoo.com>



On 2020-Sep-28, at 18:29, Mark Millard <marklmi@yahoo.com> wrote:

> [Be warned that the material is not familiar so I may need
> educating. This is based on the example context that I
> happen to have around.]
>
> In the u-boot fdt print / output there are 2 distinct sets of dma channel
> information, 1 for soc and 1 for scb, where the dma_tag values for the two
> sets should be distinct as far as I can tell:
>
> U-Boot> fdt address 0x7ef1000
> U-Boot> fdt print /
> / {
> . . .
>        soc {
>                dma@7e007000 {
>                        compatible = "brcm,bcm2835-dma";
>                        reg = <0x7e007000 0x00000b00>;
>                        interrupts = * 0x0000000007ef645c [0x00000084];
>                        interrupt-names = "dma0", "dma1", "dma2", "dma3", "dma4", "dma5", "dma6", "dma7", "dma8", "dma9", "dma10";
>                        #dma-cells = <0x00000001>;
>                        brcm,dma-channel-mask = <0x000001f5>;
>                        phandle = <0x0000000b>;
>                };
>
>        scb {
> . . .
>                dma@7e007b00 {
>                        compatible = "brcm,bcm2711-dma";
>                        reg = <0x00000000 0x7e007b00 0x00000000 0x00000400>;
>                        interrupts = <0x00000000 0x00000059 0x00000004 0x00000000 0x0000005a 0x00000004 0x00000000 0x0000005b 0x00000004 0x00000000 0x0000005c 0x00000004>;
>                        interrupt-names = "dma11", "dma12", "dma13", "dma14";
>                        #dma-cells = <0x00000001>;
>                        brcm,dma-channel-mask = <0x00007000>;
>                        phandle = <0x0000003d>;
>                };
> . . .
>
> So, 0 through 10 need the soc criteria (mix of DMA and DMA LITE engine criteria)
> and 11 through 14 need the scb criteria (DMA4 engine criteria). (I'm ignoring
> the dma-channel-mask values at this point.)
>
>
> I'll note here that the code has:
>
> #define	BCM_DMA_CH_MAX		12
>
> for use in code like:
>
>        /* setup initial settings */
>        for (i = 0; i < BCM_DMA_CH_MAX; i++) {
>                ch = &sc->sc_dma_ch[i];
>
>                bzero(ch, sizeof(struct bcm_dma_ch));
>                ch->ch = i;
>                ch->flags = BCM_DMA_CH_UNMAP;
>
>                if ((bcm_dma_channel_mask & (1 << i)) == 0)
>                        continue;
> . . .
>
> It looks to me like "dma11" is the only scb/DMA4 engine covered
> by such loops and that the "brcm,dma-channel-mask = <0x00007000>"
> means that dma11 will not be used.
>
> So: No scb/DMA4 engine will be used??? (That could explain the
> 1 GiByte limit?)
>
>
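
The mask reasoning above can be checked with a small standalone sketch. It is not driver code: usable_channels() is a hypothetical helper that repeats only the quoted mask test, and it assumes the scb node's mask would be applied with the same BCM_DMA_CH_MAX loop bound, which the quoted excerpt does not show.

```c
#include <assert.h>
#include <stdint.h>

#define BCM_DMA_CH_MAX	12	/* loop bound quoted from the driver */

/*
 * Hypothetical helper: count the channels that a loop over
 * 0 .. ch_max-1 would keep, using the same test as the quoted code,
 * (mask & (1 << i)) != 0.
 */
int
usable_channels(uint32_t mask, int ch_max)
{
	int i, n = 0;

	assert(ch_max >= 0 && ch_max <= 32);
	for (i = 0; i < ch_max; i++) {
		if ((mask & (UINT32_C(1) << i)) != 0)
			n++;
	}
	return (n);
}
```

With the two masks from the fdt output, usable_channels(0x000001f5, BCM_DMA_CH_MAX) counts channels 0, 2, 4, 5, 6, 7 and 8, while usable_channels(0x00007000, BCM_DMA_CH_MAX) finds none: bits 12 through 14 are set, but the loop stops at 11 and bit 11 is clear.
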
> rpi_DATA_2711_1p0.pdf reports that soc/0-10 have 2 types (0-6 vs. 7-10
> as it turns out) as well as the scb/DMA4-engines (11-14):
>
> QUOTE (with omitted marked by ". . .")
> . . .
> The BCM2711 DMA Controller provides a total of 16 DMA channels. Four of
> these are DMA Lite channels (with reduced performance and features), and
> four of them are DMA4 channels (with increased performance and a wider
> address range).
> . . .
> 4.5. DMA LITE Engines
>
> Several of the DMA engines are of the LITE design. This is a reduced
> specification engine designed to save space. The engine behaves in the
> same way as a normal DMA engine except for the following differences:
> . . .
> 	• The DMA length register is now 16 bits, limiting the maximum
> 	  transferable length to 65536 bytes.
> . . .
> 4.6. DMA4 Engines
>
> Several of the DMA engines are of the DMA4 design. These have higher
> performance due to their uncoupled read/write design and can access up
> to 40 address bits. Unlike the other DMA engines they are also capable
> of performing write bursts. Note that they directly access the full
> 35-bit address bus of the BCM2711 and so bypass the paging registers of
> the DMA and DMA Lite engines.
>
> DMA channel 11 is additionally able to access the PCIe interface.
> END QUOTE
>
> The register map indicates (with some extra notes added):
>
> 0-6:   DMA
> 7-10:  DMA LITE (65536 bytes limit, for example)
> 11-14: DMA4 (11 is special relative to "PCIe interface")
> ("DMA Channel 15 is exclusively used by the VPU.")
>
> Yet what I see in the head -r365932 code is:
>
> #define	BCM_DMA_CH_MAX		12
> . . .
> struct bcm_dma_softc {
>        device_t                sc_dev;
>        struct mtx              sc_mtx;
>        struct resource *       sc_mem;
>        struct resource *       sc_irq[BCM_DMA_CH_MAX];
>        void *                  sc_intrhand[BCM_DMA_CH_MAX];
>        struct bcm_dma_ch       sc_dma_ch[BCM_DMA_CH_MAX];
>        bus_dma_tag_t           sc_dma_tag;
> };
> . . .
>        err = bus_dma_tag_create(bus_get_dma_tag(dev),
>            1, 0, BUS_SPACE_MAXADDR_32BIT,
>            BUS_SPACE_MAXADDR, NULL, NULL,
>            sizeof(struct bcm_dma_cb), 1,
>            sizeof(struct bcm_dma_cb),
>            BUS_DMA_ALLOCNOW, NULL, NULL,
>            &sc->sc_dma_tag);
>
> As an example: does that deal with the likes of DMA LITE (so 7-10) "limiting
> the maximum transferable length to 65536 bytes"?
>
> As another example: Does it deal with the DMA4 (11-14) distinctions (if
> such were in use anyway)?
>
> For reference from the fdt print / :
>
> / {
> . . .
>        #address-cells = <0x00000002>;
>        #size-cells = <0x00000001>;
> . . .
>        soc {
>                compatible = "simple-bus";
>                #address-cells = <0x00000001>;
>                #size-cells = <0x00000001>;
> . . .
>                dma-ranges = <0xc0000000 0x00000000 0x00000000 0x40000000>;
> . . .
>                firmware {
>                        compatible = "raspberrypi,bcm2835-firmware", "simple-bus";
>                        mboxes = <0x0000001c>;
>                        dma-ranges;
> . . .
>        emmc2bus {
>                compatible = "simple-bus";
>                #address-cells = <0x00000002>;
>                #size-cells = <0x00000001>;
> . . .
>                dma-ranges = <0x00000000 0xc0000000 0x00000000 0x00000000 0x40000000>;
> . . .
>        scb {
>                compatible = "simple-bus";
>                #address-cells = <0x00000002>;
>                #size-cells = <0x00000002>;
> . . .
>                dma-ranges = <0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0xfc000000 0x00000001 0x00000000 0x00000001 0x00000000 0x00000001 0x00000000>;
> . . .
>                pcie@7d500000 {
>                        compatible = "brcm,bcm2711-pcie";
> . . .
>                        #address-cells = <0x00000003>;
> . . .
>                        #size-cells = <0x00000002>;
> . . .
>                        dma-ranges = <0x02000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0xc0000000>;
> . . .
>        v3dbus {
>                compatible = "simple-bus";
>                #address-cells = <0x00000001>;
>                #size-cells = <0x00000002>;
> . . .
>                dma-ranges = <0x00000000 0x00000000 0x00000000 0x00000004 0x00000000>;
> . . .
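
The dma-ranges entries quoted above are easier to compare once split into (child address, parent address, size). The following is a minimal sketch of that split, assuming the usual devicetree layout: child address cells from the node's own #address-cells, parent address cells from the parent's #address-cells, then the node's #size-cells. The struct, the function names and the sample arrays are made up for illustration and are not taken from any driver; the cell values are copied from the fdt print output.

```c
#include <assert.h>
#include <stdint.h>

/* One decoded dma-ranges entry: child bus address, parent address, length. */
struct dma_range {
	uint64_t child;
	uint64_t parent;
	uint64_t size;
};

/* Cell values copied from the soc, scb and pcie nodes quoted above. */
const uint32_t soc_ranges[] = { 0xc0000000, 0x00000000, 0x00000000, 0x40000000 };
const uint32_t scb_ranges[] = {
	0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0xfc000000,
	0x00000001, 0x00000000, 0x00000001, 0x00000000, 0x00000001, 0x00000000
};
const uint32_t pcie_ranges[] = {
	0x02000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
	0x00000000, 0xc0000000
};

/*
 * Fold 32-bit cells (most significant cell first) into one 64-bit value.
 * With 3 cells (the PCI case) the first cell is treated as flags, so only
 * the last two cells are folded.
 */
uint64_t
cells_to_u64(const uint32_t *cells, int ncells)
{
	uint64_t v = 0;
	int i;

	assert(ncells >= 1 && ncells <= 3);
	for (i = (ncells == 3) ? 1 : 0; i < ncells; i++)
		v = (v << 32) | cells[i];
	return (v);
}

/* Decode the single entry that starts at prop[0]. */
struct dma_range
decode_dma_range(const uint32_t *prop, int child_cells, int parent_cells,
    int size_cells)
{
	struct dma_range r;

	r.child = cells_to_u64(prop, child_cells);
	r.parent = cells_to_u64(prop + child_cells, parent_cells);
	r.size = cells_to_u64(prop + child_cells + parent_cells, size_cells);
	return (r);
}
```

Decoded this way, decode_dma_range(soc_ranges, 1, 2, 1) gives child 0xc0000000, parent 0 and a 0x40000000 (1 GiByte) size, while decode_dma_range(pcie_ranges, 3, 2, 2) gives a 0xc0000000 (3 GiByte) size. The second scb entry starts at scb_ranges + 6.
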

rpi_DATA_2711_1p0.pdf reports:
(I ignore 2D DMA transfer mode here.)

For DMA engines 0-6: XLENGTH has bits 29:0;
bits 31:30 are write as 0, read as do not care.
That would put maxsegsz as 2**30 == 1,073,741,824
which matches a 1 GiByte space.

For DMA LITE engines 7-10: XLENGTH has bits 15:0;
bits 31:16 are write as 0, read as do not care.
That would put maxsegsz as 2**16 == 65,536.

For DMA4 engines 11-14: XLENGTH has bits 29:0;
bits 31:30 are write as 0, read as do not care.
That would put maxsegsz as 2**30 == 1,073,741,824
which is smaller than the 3 GiByte space associated
with xHCI.
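
As a sanity check on that arithmetic, here is a minimal sketch that turns an XLENGTH field width into the maxsegsz figure used above and counts how many such segments a window would need. Both function names are hypothetical. It follows the 2**width convention used in this message; strictly, an n-bit field holds values up to 2**n - 1, so treat the results as upper bounds.

```c
#include <assert.h>
#include <stdint.h>

/* Segment size implied by an XLENGTH field that is "bits" wide (2**bits). */
uint64_t
xlength_max_bytes(unsigned int bits)
{
	assert(bits < 64);
	return (UINT64_C(1) << bits);
}

/* Number of segments of at most maxsegsz bytes needed to cover total bytes. */
uint64_t
segments_needed(uint64_t total, uint64_t maxsegsz)
{
	assert(maxsegsz != 0);
	return ((total + maxsegsz - 1) / maxsegsz);
}
```

For the 3 GiByte (0xc0000000) window, segments_needed(0xc0000000, xlength_max_bytes(30)) is 3 for the DMA and DMA4 engines, and segments_needed(0xc0000000, xlength_max_bytes(16)) is 49152 for the DMA LITE engines.
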


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?0FE382AB-8DE3-4467-9CB0-E8582AC70EA2>