Date:      Fri, 2 Dec 2022 15:34:24 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        freebsd-arm <freebsd-arm@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Does the RPi* DMA code in the kernel handle the distinctions between the 3 kinds of DMA engines?
Message-ID:  <18DF2CDD-3BC2-4100-9B8E-3BFD08F1F119@yahoo.com>
References:  <18DF2CDD-3BC2-4100-9B8E-3BFD08F1F119.ref@yahoo.com>

Does the RPi* DMA code in the kernel handle the distinctions between
the 3 kinds of DMA engines?

The engine types are named:

DMA engine	(30-bit DMA addressing: 1 GiByte, 256-bit burst space)

DMA lite engine (only 128 bit burst space,
		no YLENGTH, TDMODE, S_STRIDE, D_STRIDE registers,
		only 16 bits for DMA length (?_TXFR_LEN XLENGTH),
		no SRC_IGNORE or DEST_IGNORE modes,
		only about half the bandwidth of type DMA,
		uses ENABLE Register 31:28 "PAGELITE" instead of
		27:24 "PAGE" to control the "1G SDRAM ram page" used)

DMA4 engine	(not the same as dma4 of specific engines dma0..dma15;
		different register map offset pattern after ??_CB,
		more register map offsets than the other DMA types,
		40 address bits with 40:32 in ??_SRCI/??_DESTI 7:0,
		??_CB bits 31:0 are for Control Block Address 36:5,
		higher performance via "uncoupled read/write design",
		write burst capable,
		directly accesses the BCM2711 35-bit address bus
		so bypasses the paging registers that are used
		for types DMA and DMA lite)
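
As a rough illustration of the DMA4 address handling summarized above, here is
a minimal sketch in C. The helper names are hypothetical (they are not from the
FreeBSD source), and the field layout is only what the summary states: the
upper address bits go in ??_SRCI/??_DESTI bits 7:0, and ??_CB bits 31:0 hold
Control Block Address bits 36:5.

```c
#include <stdint.h>

/* Hypothetical helpers; field layout taken from the summary above. */

/* Low 32 bits of a source or destination address. */
static inline uint32_t
dma4_addr_lo(uint64_t addr)
{
	return ((uint32_t)(addr & 0xffffffffu));
}

/* Address bits above bit 31, as placed in ??_SRCI / ??_DESTI bits 7:0. */
static inline uint32_t
dma4_addr_hi(uint64_t addr)
{
	return ((uint32_t)((addr >> 32) & 0xffu));
}

/*
 * ??_CB bits 31:0 hold control block address bits 36:5, so the block
 * must be 32-byte aligned and must fit in 37 bits.
 * Returns 0 on success, -1 if the address cannot be encoded.
 */
static inline int
dma4_cb_encode(uint64_t cb_addr, uint32_t *reg)
{
	if ((cb_addr & 0x1fu) != 0 || (cb_addr >> 37) != 0)
		return (-1);
	*reg = (uint32_t)(cb_addr >> 5);
	return (0);
}
```

The other two engine types take a plain 32-bit control block address, so code
shared between engine types would need a per-type encoding step of this kind.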

The specific instances of engines (channels) have types:

dma0 ..dma6 : Always type DMA (so far?)
dma7 ..dma10: Always type DMA lite (so far?)
dma11..dma14: Not the same for BCM2711:
		For BCM2711, type DMA4
		Otherwise, type DMA lite (so far?)
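
The channel-to-type mapping above could be expressed as a small lookup. This
is only a sketch: the enum and function names are hypothetical, and the
"so far?" caveats above apply to it as well.

```c
#include <stdbool.h>

/* Hypothetical names; the mapping follows the table above. */
enum dma_engine_type {
	DMA_TYPE_FULL,		/* "DMA engine" */
	DMA_TYPE_LITE,		/* "DMA lite engine" */
	DMA_TYPE_DMA4,		/* "DMA4 engine" (BCM2711 only) */
	DMA_TYPE_UNKNOWN	/* not covered by the table, e.g. dma15 */
};

static enum dma_engine_type
dma_channel_type(unsigned int ch, bool is_bcm2711)
{
	if (ch <= 6)
		return (DMA_TYPE_FULL);
	if (ch <= 10)
		return (DMA_TYPE_LITE);
	if (ch <= 14)
		return (is_bcm2711 ? DMA_TYPE_DMA4 : DMA_TYPE_LITE);
	return (DMA_TYPE_UNKNOWN);
}
```

A driver that wanted to respect the differences could consult something like
this before programming 2D mode, long transfers, or the DMA4 register layout.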

BCM2711 specific note:
dma11 "is additionally able to access the PCIe interface".

For reference, the live device tree for the BCM2711 has
(examples from an 8 GiByte RPi4B):

		dma@7e007000 {
			compatible = "brcm,bcm2835-dma";
			reg = <0x7e007000 0x00000b00>;
			interrupts = <0x00000000 0x00000050 0x00000004
				      0x00000000 0x00000051 0x00000004
				      0x00000000 0x00000052 0x00000004
				      0x00000000 0x00000053 0x00000004
				      0x00000000 0x00000054 0x00000004
				      0x00000000 0x00000055 0x00000004
				      0x00000000 0x00000056 0x00000004
				      0x00000000 0x00000057 0x00000004
				      0x00000000 0x00000057 0x00000004
				      0x00000000 0x00000058 0x00000004
				      0x00000000 0x00000058 0x00000004>;
			interrupt-names = "dma0", "dma1", "dma2", "dma3", "dma4", "dma5",
					  "dma6", "dma7", "dma8", "dma9", "dma10";
			#dma-cells = <0x00000001>;
			brcm,dma-channel-mask = <0x000007f5>;
			phandle = <0x0000000b>;
		};
. . .
		dma@7e007b00 {
			compatible = "brcm,bcm2711-dma";
			reg = <0x00000000 0x7e007b00 0x00000000 0x00000400>;
			interrupts = <0x00000000 0x00000059 0x00000004
				      0x00000000 0x0000005a 0x00000004
				      0x00000000 0x0000005b 0x00000004
				      0x00000000 0x0000005c 0x00000004>;
			interrupt-names = "dma11", "dma12", "dma13", "dma14";
			#dma-cells = <0x00000001>;
			brcm,dma-channel-mask = <0x00003000>;
			phandle = <0x00000040>;
		};

Note: dma15 "is exclusively used by the VPU" and I ignore it here.

I ask, in part, because of:

#define    BCM_DMA_CH_MAX          12

that is used via the likes of:

struct bcm_dma_ch sc_dma_ch[BCM_DMA_CH_MAX];

and:

for (i = 0; i < BCM_DMA_CH_MAX; i++) {

But the BCM2711 only has 0..10 defined for brcm,bcm2835-dma
in its live device tree, although brcm,dma-channel-mask
allows avoiding what is missing. 11..14 are defined in
brcm,bcm2711-dma instead --but the code makes no use of it,
given the brcm,bcm2835-dma's brcm,dma-channel-mask.

Also, the brcm,bcm2835-dma's brcm,dma-channel-mask includes
examples of both "DMA engine" and "DMA lite engine", so some
distinctions still need to be made.

So far, I've not been able to identify code/data making any
distinctions for the likes of dma7..dma14 beyond what
brcm,dma-channel-mask completely avoids.
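
To make the mask values concrete, here is a minimal sketch that splits a
brcm,dma-channel-mask value into the channel ranges listed earlier. The names
are hypothetical, and it assumes bit N of the mask refers to channel dmaN in
both device tree nodes.

```c
#include <stdint.h>

/* Channel ranges from the list earlier in this message. */
#define DMA_FULL_CHANNELS	0x007fu	/* dma0..dma6: type DMA */
#define DMA_LITE_CHANNELS	0x0780u	/* dma7..dma10: type DMA lite */
#define DMA_UPPER_CHANNELS	0x7800u	/* dma11..dma14: DMA4 on BCM2711 */

/* Hypothetical summary of how many usable channels fall in each range. */
struct dma_mask_summary {
	unsigned int full;
	unsigned int lite;
	unsigned int upper;
};

/* Count the set bits in a mask. */
static unsigned int
count_channels(uint32_t mask)
{
	unsigned int n = 0;

	while (mask != 0) {
		n += mask & 1u;
		mask >>= 1;
	}
	return (n);
}

static struct dma_mask_summary
summarize_channel_mask(uint32_t mask)
{
	struct dma_mask_summary s;

	s.full = count_channels(mask & DMA_FULL_CHANNELS);
	s.lite = count_channels(mask & DMA_LITE_CHANNELS);
	s.upper = count_channels(mask & DMA_UPPER_CHANNELS);
	return (s);
}
```

For the brcm,bcm2835-dma mask 0x7f5 this gives five type DMA channels (dma0,
dma2, dma4, dma5, dma6) and four DMA lite channels (dma7..dma10). For the
brcm,bcm2711-dma mask 0x3000 it gives two channels (dma12, dma13) in the
dma11..dma14 range.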


Note:	So far as I can tell, the PCIe bus-mastering that is
	associated with the XHCI in the BCM2711 is separate
	from the above.

	The "B0T" BCM2711 parts have a 3 GiByte limitation just
	for this PCIe bus-mastering(/XHCI) and the "C0T" BCM2711
	parts no longer are limited to a subset of the available
	RAM for PCIe bus-mastering(/XHCI). Examples from 8 GiByte
	RPi4B's:

		dma-ranges = <0x02000000 0x00000004 0x00000000 0x00000000 0x00000000 0x00000000 0xc0000000>;
	vs.
		dma-ranges = <0x02000000 0x00000004 0x00000000 0x00000000 0x00000000 0x00000002 0x00000000>;
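
The two PCIe dma-ranges entries above can be decoded with a short sketch like
the following. The cell layout is an assumption inferred from the 7-cell
entries: 3 cells for the PCI-side address (flags, address high, address low),
2 cells for the parent address, and 2 cells for the size. The struct and
function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical decoded form of one 7-cell PCI dma-ranges entry. */
struct pci_dma_range {
	uint32_t flags;		/* first cell of the PCI-side address */
	uint64_t pci_addr;	/* address as seen by the PCIe bus master */
	uint64_t cpu_addr;	/* parent (system) address */
	uint64_t size;		/* length of the DMA window in bytes */
};

/* Join two 32-bit device tree cells into one 64-bit value. */
static uint64_t
cells_to_u64(uint32_t hi, uint32_t lo)
{
	return (((uint64_t)hi << 32) | lo);
}

/* Assumes the 3 + 2 + 2 cell layout described above. */
static struct pci_dma_range
decode_pci_dma_range(const uint32_t c[7])
{
	struct pci_dma_range r;

	r.flags = c[0];
	r.pci_addr = cells_to_u64(c[1], c[2]);
	r.cpu_addr = cells_to_u64(c[3], c[4]);
	r.size = cells_to_u64(c[5], c[6]);
	return (r);
}
```

Under that assumption the first entry has a size of 0xc0000000 (3 GiByte) and
the second a size of 0x200000000 (8 GiByte), both starting at parent address 0
and PCI-side address 0x400000000.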

	So, XHCI related bounce buffering could be avoided on
	"C0T" parts.

	For BCM2711 "B0T" vs. "C0T" there is also emmc2bus with:

		dma-ranges = <0x00000000 0xc0000000 0x00000000 0x00000000 0x40000000>;
		(so: matching the soc dma-ranges, other than the # of
		cells for holding the 1st addr)
	vs.
		dma-ranges = <0x00000000 0x00000000 0x00000000 0x00000000 0xfc000000>;
		(so not matching the soc dma-ranges)

	emmc2bus contains:

		emmc2@7e340000 {
			compatible =3D "brcm,bcm2711-emmc2";
			. . .

	which looks to mean that the dma-ranges for brcm,bcm2711-emmc2
	changed to not match the soc dma-ranges. I've not noticed any
	code/data that would track this change.

	I do not know the implications of the difference for
	what FreeBSD's code does --or if FreeBSD ever tries to
	use the brcm,bcm2711-emmc2 .

===
Mark Millard
marklmi at yahoo.com



