Date:      Tue, 6 Oct 2020 21:50:36 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Robert Crowston <crowston@protonmail.com>, freebsd-arm <freebsd-arm@freebsd.org>
Subject:   Re: A basis for a possible update to the pcie based xhci support? It survived huge-file duplicate-then-diff testing so far.
Message-ID:  <0A440C23-7D21-4515-B872-1C64FF80A873@yahoo.com>
In-Reply-To: <31C1F4F8-6727-4EBE-9D20-39F5B2DA89A5@yahoo.com>
References:  <31C1F4F8-6727-4EBE-9D20-39F5B2DA89A5@yahoo.com>



On 2020-Oct-6, at 21:43, Mark Millard <marklmi@yahoo.com> wrote:

> Note: based on a head -r363932 context, not more recent.
>
> First off, a note about lowaddr values. What sysctl showed
> me were the likes of (prior to the changes that this note
> is about):
>
> . . .
> hw.busdma.zone2.lowaddr: 0x3c000fff
> . . .
> hw.busdma.zone1.lowaddr: 0x3fffffff
> . . .
> hw.busdma.zone0.lowaddr: 0xffffffff
> . . .
>
> So I've guessed that lowaddr should identify the
> end page of the possibly-use-it region, not the
> first do-not-use-it page.


> If wrong, at most it
> should avoid bouncing one page that it could
> avoid.

That was a wonderfully messed up sentence.
Trying again:

"If wrong, at most it would bounce one page that it
could avoid bouncing."

> But, if correct, it would bounce a page that it
> should bounce, instead of failing to do so.
>
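
To make the off-by-one concern concrete, here is a small userland
sketch of the two readings of lowaddr. It is not the kernel's busdma
code: the function names and the 4096-byte page size are assumptions
made only for illustration. Under the "last usable address" reading a
page bounces when any of its bytes lies above lowaddr; under the
"first unusable address" reading it bounces when any byte is at or
above lowaddr.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_BYTES 4096u	/* assumed page size */

/* Reading 1: lowaddr is the last address that may be used directly. */
bool
page_bounces_inclusive(uint64_t page_base, uint64_t lowaddr)
{
	uint64_t last_byte = page_base + SKETCH_PAGE_BYTES - 1;

	return (last_byte > lowaddr);
}

/* Reading 2: lowaddr is the first address that may NOT be used. */
bool
page_bounces_exclusive(uint64_t page_base, uint64_t lowaddr)
{
	uint64_t last_byte = page_base + SKETCH_PAGE_BYTES - 1;

	return (last_byte >= lowaddr);
}
```

With lowaddr set to 0xc0000000-1, the two readings disagree only for
the page at 0xbffff000. That is the one page that would be bounced
without need if the second reading turned out to be the right one.
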
> Otherwise what I've done is put back some of your old
> bcm2838_pci.c code and removed the sc->sc_bus.dma_bits
> adjustment from the bcm2838_xhci.c code. Be warned
> that I copied the likes of REG_VALUE_4GB_WINDOW and
> REG_VALUE_4GB_CONFIG without understanding the values
> or encoding. (I'm not pcie knowledgeable.)
>
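
One possible reading of those two copied constants comes from the
macros that the diff below removes: REG_VALUE_DMA_WINDOW_LOW was
(MAX_MEMORY_LOG2 - 0xf), and REG_VALUE_DMA_WINDOW_CONFIG was that same
value shifted left by 0x1b and or'd with DMA_WINDOW_ENABLE (0x3000).
The sketch below just redoes that arithmetic in userland. The function
names are made up for illustration, and the idea that the field
encodes log2 of the window size is inferred only from the removed
macro names, not from any Broadcom documentation.

```c
#include <stdint.h>

/* Values taken from the removed lines of the quoted diff. */
#define SKETCH_DMA_WINDOW_ENABLE	0x3000u
#define SKETCH_SIZE_BIAS		0xfu	/* from "MAX_MEMORY_LOG2 - 0xf" */
#define SKETCH_SIZE_SHIFT		0x1bu	/* from "<< 0x1b" */

/* Value the old code would write to REG_DMA_WINDOW_LOW. */
uint32_t
sketch_window_low(uint32_t mem_log2)
{
	return (mem_log2 - SKETCH_SIZE_BIAS);
}

/* Value the old code would write to REG_DMA_CONFIG. */
uint32_t
sketch_window_config(uint32_t mem_log2)
{
	return ((sketch_window_low(mem_log2) << SKETCH_SIZE_SHIFT) |
	    SKETCH_DMA_WINDOW_ENABLE);
}
```

Under that reading, a log2 size of 0x20 (2^32 bytes, 4 GiBytes) yields
0x11 and 0x88003000, matching REG_VALUE_4GB_WINDOW and
REG_VALUE_4GB_CONFIG, while the removed MAX_MEMORY_LOG2 of 0x21 yields
0x12 and 0x90003000.
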
> # svnlite diff /usr/src/sys/arm/broadcom/
> Index: /usr/src/sys/arm/broadcom/bcm2835/bcm2838_pci.c
> ===================================================================
> --- /usr/src/sys/arm/broadcom/bcm2835/bcm2838_pci.c	(revision 365932)
> +++ /usr/src/sys/arm/broadcom/bcm2835/bcm2838_pci.c	(working copy)
> @@ -91,27 +91,22 @@
> #define REG_EP_CONFIG_CHOICE			0x9000
> #define REG_EP_CONFIG_DATA			0x8000
>
> +#define REG_VALUE_4GB_WINDOW    0x11
> +#define REG_VALUE_4GB_CONFIG    0x88003000
> +
> /*
>  * The system memory controller can address up to 16 GiB of physical memory
>  * (although at time of writing the largest memory size available for purchase
> - * is 8 GiB). However, the system DMA controller is capable of accessing only a
> - * limited portion of the address space. Worse, the PCI-e controller has further
> - * constraints for DMA, and those limitations are not wholly clear to the
> - * author. NetBSD and Linux allow DMA on the lower 3 GiB of the physical memory,
> - * but experimentation shows DMA performed above 960 MiB results in data
> - * corruption with this driver. The limit of 960 MiB is taken from OpenBSD, but
> + * is 8 GiB). However, the system DMA controller in early enough boards is
> + * capable of accessing only a limited portion of the address space (3 GiByte).
> + * Worse, the PCI-e controller has further constraints for DMA, and those
> + * limitations are not wholly clear to the author. NetBSD and Linux allow
> + * DMA on the lower 3 GiB of the physical memory. OpenBSD used 960 MiByte but
>  * apparently that value was chosen for satisfying a constraint of an unrelated
>  * peripheral.
> - *
> - * Whatever the true maximum address, 960 MiB works.
>  */
> -#define DMA_HIGH_LIMIT			0x3c000000
> -#define MAX_MEMORY_LOG2			0x21
> -#define REG_VALUE_DMA_WINDOW_LOW	(MAX_MEMORY_LOG2 - 0xf)
> +#define DMA_HIGH_LIMIT			((bus_addr_t)0xc0000000u-1)
> #define REG_VALUE_DMA_WINDOW_HIGH	0x0
> -#define DMA_WINDOW_ENABLE		0x3000
> -#define REG_VALUE_DMA_WINDOW_CONFIG	\
> -    (((MAX_MEMORY_LOG2 - 0xf) << 0x1b) | DMA_WINDOW_ENABLE)
>
> #define REG_VALUE_MSI_CONFIG	0xffe06540
>
> @@ -645,9 +640,9 @@
> 	    DMA_HIGH_LIMIT,			/* lowaddr */
> 	    BUS_SPACE_MAXADDR,			/* highaddr */
> 	    NULL, NULL,				/* filter, filterarg */
> -	    DMA_HIGH_LIMIT,			/* maxsize */
> +	    BUS_SPACE_MAXSIZE,			/* maxsize */
> 	    BUS_SPACE_UNRESTRICTED,		/* nsegments */
> -	    DMA_HIGH_LIMIT,			/* maxsegsize */
> +	    BUS_SPACE_MAXSIZE,			/* maxsegsize */
> 	    0, 					/* flags */
> 	    NULL, NULL,				/* lockfunc, lockarg */
> 	    &sc->dmat);
> @@ -674,9 +669,9 @@
> 	 * Set PCI->CPU memory window. This encodes the inbound window showing
> 	 * the system memory to the controller.
> 	 */
> -	bcm_pcib_set_reg(sc, REG_DMA_WINDOW_LOW, REG_VALUE_DMA_WINDOW_LOW);
> +	bcm_pcib_set_reg(sc, REG_DMA_WINDOW_LOW, REG_VALUE_4GB_WINDOW);
> 	bcm_pcib_set_reg(sc, REG_DMA_WINDOW_HIGH, REG_VALUE_DMA_WINDOW_HIGH);
> -	bcm_pcib_set_reg(sc, REG_DMA_CONFIG, REG_VALUE_DMA_WINDOW_CONFIG);
> +	bcm_pcib_set_reg(sc, REG_DMA_CONFIG, REG_VALUE_4GB_CONFIG);
>
> 	bcm_pcib_set_reg(sc, REG_BRIDGE_GISB_WINDOW, 0);
> 	bcm_pcib_set_reg(sc, REG_DMA_WINDOW_1, 0);
> Index: /usr/src/sys/arm/broadcom/bcm2835/bcm2838_xhci.c
> ===================================================================
> --- /usr/src/sys/arm/broadcom/bcm2835/bcm2838_xhci.c	(revision 365932)
> +++ /usr/src/sys/arm/broadcom/bcm2835/bcm2838_xhci.c	(working copy)
> @@ -189,15 +189,7 @@
> 	bcm_xhci_install_xhci_firmware(dev);
>
> 	error = xhci_pci_attach(dev);
> -	if (error)
> -		return (error);
> -
> -	/* 32 bit DMA is a limitation of the PCI-e controller, not the VL805. */
> -	sc->sc_bus.dma_bits = 32;
> -	if (bootverbose)
> -		device_printf(dev, "note: switched to 32-bit DMA.\n");
> -
> -	return (0);
> +	return (error);
> }
>
> /*
>
> I've concluded from what I've seen in the code that lowaddr
> should be based on the pcie properties and should not worry
> about the maxsize and maxsegsize figures being possibly
> smaller: that is a dma engine use worry, not a pci one. (Not
> that I could get that from the documentation that I quoted in
> the review.) Thus I put back the 2 BUS_SPACE_MAXSIZE uses.
>
> After the first huge-file duplicate-then-diff test sysctl
> reported lots of bounced transfers:
>
> # sysctl hw.busdma
> hw.busdma.zone1.alignment: 4096
> hw.busdma.zone1.lowaddr: 0x3fffffff
> hw.busdma.zone1.total_deferred: 0
> hw.busdma.zone1.total_bounced: 755770
> hw.busdma.zone1.active_bpages: 0
> hw.busdma.zone1.reserved_bpages: 0
> hw.busdma.zone1.free_bpages: 838
> hw.busdma.zone1.total_bpages: 838
> hw.busdma.zone0.alignment: 4096
> hw.busdma.zone0.lowaddr: 0xffffffff
> hw.busdma.zone0.total_deferred: 0
> hw.busdma.zone0.total_bounced: 0
> hw.busdma.zone0.active_bpages: 256
> hw.busdma.zone0.reserved_bpages: 0
> hw.busdma.zone0.free_bpages: 257
> hw.busdma.zone0.total_bpages: 513
> hw.busdma.total_bpages: 1351
>
> For the non-power-of-2 boundary (0xc0000000-1), it
> appears to use a smaller power-of-2 boundary
> (0x40000000-1), without my having to explicitly
> code both types of values specially for the RPi4B.
> (Of course, that also leaves 2 GiBytes unused that
> could potentially avoid more bouncing.)
>
> I'll note that, prior to the change, an example
> first test ended with:
>
> hw.busdma.zone2.total_bounced: 1091942
>
> and 174 in zone 1. So the bounce count has
> decreased.
>
> I'll note that "total_bounced" need not be a
> page count: it is incremented by 1 after
> the loop for a bounce, not inside the loop.
> Lots of pages of data were bounced.
>
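
A toy model of that counting behavior, for illustration only: this is
not the kernel's busdma code, and the struct, its pages_copied field,
and the function name are all hypothetical. It only shows why a
counter bumped once after the per-page loop undercounts pages.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-zone statistics for the sketch. */
struct sketch_zone_stats {
	uint64_t total_bounced;	/* bumped once per bounced transfer */
	uint64_t pages_copied;	/* bumped once per bounced page */
};

/* Model one transfer that needs npages bounce pages. */
void
sketch_bounce_transfer(struct sketch_zone_stats *zs, size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++)
		zs->pages_copied++;	/* stands in for copying one page */

	/* After the loop, not inside it: one count per transfer. */
	if (npages > 0)
		zs->total_bounced++;
}
```

In this model a single 32-page transfer adds 1 to total_bounced and 32
to pages_copied, so the reported figures are a lower bound on pages.
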
> For reference (the test as of a gpu_mem_1024=32
> context):
>
> Physical memory chunk(s):
> 0x00000000002000 - 0x00000007ef0fff, 133099520 bytes (32495 pages)
> 0x00000007f0f000 - 0x00000034bfffff, 751767552 bytes (183537 pages)
> 0x00000036052000 - 0x0000003cb2efff, 112054272 bytes (27357 pages)
> 0x0000003cb36000 - 0x0000003cb36fff, 4096 bytes (1 pages)
> 0x0000003cb38000 - 0x0000003cb39fff, 8192 bytes (2 pages)
> 0x0000003cb3b000 - 0x0000003cb3cfff, 8192 bytes (2 pages)
> 0x0000003cb40000 - 0x0000003cb40fff, 4096 bytes (1 pages)
> 0x0000003cb42000 - 0x0000003cb43fff, 8192 bytes (2 pages)
> 0x0000003cb45000 - 0x0000003df4ffff, 21016576 bytes (5131 pages)
> 0x0000003df60000 - 0x0000003dffffff, 655360 bytes (160 pages)
> 0x00000040000000 - 0x000000fbffffff, 3154116608 bytes (770048 pages)
> 0x00000100000000 - 0x000001f372afff, 4084379648 bytes (997163 pages)
>
>
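
As a rough way to see what the two candidate limits mean for this
memory layout, the sketch below sums how many bytes of the listed
physical memory chunks sit at or below a given lowaddr. The chunk
table is copied from the list above; the struct and function names are
made up for illustration, and the inclusive treatment of lowaddr is
the same guess discussed earlier in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_chunk {
	uint64_t start;	/* first byte of the chunk */
	uint64_t end;	/* last byte of the chunk (inclusive) */
};

/* Physical memory chunks as reported in the listing above. */
static const struct sketch_chunk sketch_chunks[] = {
	{ 0x000002000ULL, 0x007ef0fffULL },
	{ 0x007f0f000ULL, 0x034bfffffULL },
	{ 0x036052000ULL, 0x03cb2efffULL },
	{ 0x03cb36000ULL, 0x03cb36fffULL },
	{ 0x03cb38000ULL, 0x03cb39fffULL },
	{ 0x03cb3b000ULL, 0x03cb3cfffULL },
	{ 0x03cb40000ULL, 0x03cb40fffULL },
	{ 0x03cb42000ULL, 0x03cb43fffULL },
	{ 0x03cb45000ULL, 0x03df4ffffULL },
	{ 0x03df60000ULL, 0x03dffffffULL },
	{ 0x040000000ULL, 0x0fbffffffULL },
	{ 0x100000000ULL, 0x1f372afffULL },
};

/* Bytes of RAM whose address is <= lowaddr (no bounce needed). */
uint64_t
sketch_bytes_at_or_below(uint64_t lowaddr)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < sizeof(sketch_chunks) / sizeof(sketch_chunks[0]); i++) {
		const struct sketch_chunk *c = &sketch_chunks[i];
		uint64_t end;

		assert(c->end >= c->start);
		if (c->start > lowaddr)
			continue;	/* chunk is entirely above the limit */
		end = (c->end < lowaddr) ? c->end : lowaddr;
		total += end - c->start + 1;
	}
	return (total);
}
```

For the chunk list above this gives 1018626048 bytes at or below
0x3fffffff and 3166109696 bytes at or below 0xbfffffff, out of
8257122304 bytes in total. So under either limit most of the 8 GiByte
board's RAM would still need bouncing for this device.
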
> FYI, before the huge-file duplicate-and-diff test:
>
> # sysctl hw.busdma
> hw.busdma.zone1.alignment: 4096
> hw.busdma.zone1.lowaddr: 0x3fffffff
> hw.busdma.zone1.total_deferred: 0
> hw.busdma.zone1.total_bounced: 866
> hw.busdma.zone1.active_bpages: 2
> hw.busdma.zone1.reserved_bpages: 0
> hw.busdma.zone1.free_bpages: 836
> hw.busdma.zone1.total_bpages: 838
> hw.busdma.zone0.alignment: 4096
> hw.busdma.zone0.lowaddr: 0xffffffff
> hw.busdma.zone0.total_deferred: 0
> hw.busdma.zone0.total_bounced: 0
> hw.busdma.zone0.active_bpages: 256
> hw.busdma.zone0.reserved_bpages: 0
> hw.busdma.zone0.free_bpages: 257
> hw.busdma.zone0.total_bpages: 513
> hw.busdma.total_bpages: 1351
>
> After the duplicate but before the diff:
>
> # sysctl hw.busdma
> hw.busdma.zone1.alignment: 4096
> hw.busdma.zone1.lowaddr: 0x3fffffff
> hw.busdma.zone1.total_deferred: 0
> hw.busdma.zone1.total_bounced: 513604
> hw.busdma.zone1.active_bpages: 8
> hw.busdma.zone1.reserved_bpages: 0
> hw.busdma.zone1.free_bpages: 830
> hw.busdma.zone1.total_bpages: 838
> hw.busdma.zone0.alignment: 4096
> hw.busdma.zone0.lowaddr: 0xffffffff
> hw.busdma.zone0.total_deferred: 0
> hw.busdma.zone0.total_bounced: 0
> hw.busdma.zone0.active_bpages: 256
> hw.busdma.zone0.reserved_bpages: 0
> hw.busdma.zone0.free_bpages: 257
> hw.busdma.zone0.total_bpages: 513
> hw.busdma.total_bpages: 1351
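
For splitting the zone1 total_bounced figures by test phase, the three
snapshots (866 before the test, 513604 after the duplicate, 755770
after the diff) can simply be subtracted. A trivial sketch follows,
with a hypothetical helper name; it assumes the counter only ever
increases between snapshots.

```c
#include <assert.h>
#include <stdint.h>

/* Difference between two snapshots of a monotonically rising counter. */
uint64_t
sketch_counter_delta(uint64_t before, uint64_t after)
{
	assert(after >= before);
	return (after - before);
}
```

That gives 512738 bounced transfers during the duplicate and 242166
during the diff, 754904 for the whole test. As noted above these are
transfer counts, not page counts.
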




===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



