From: Mark Millard
To: Robert Crowston, freebsd-arm
Date: Tue, 6 Oct 2020 21:43:42 -0700
Subject: A basis for a possible update to the pcie based xhci support? It survived huge-file duplicate-then-diff testing so far.
Message-Id: <31C1F4F8-6727-4EBE-9D20-39F5B2DA89A5@yahoo.com>

Note: based on a head -r363932 context, not more recent.

First off, a note about lowaddr values. Prior to the changes
that this note is about, sysctl showed me the likes of:

. . .
hw.busdma.zone2.lowaddr: 0x3c000fff
. . .
hw.busdma.zone1.lowaddr: 0x3fffffff
. . .
hw.busdma.zone0.lowaddr: 0xffffffff
. . .

So I've guessed that lowaddr should identify the end page of the
possibly-use-it region, not the first do-not-use-it page. If that
guess is wrong, the cost is at most one page being bounced that
did not need to be. If it is right, the other choice could have
left a page unbounced that should have been bounced.

Otherwise what I've done is put back some of your old
bcm2838_pci.c code and removed the sc->sc_bus.dma_bits
adjustment from the bcm2838_xhci.c code.

Be warned that I copied the likes of REG_VALUE_4GB_WINDOW
and REG_VALUE_4GB_CONFIG without understanding the values
or encoding. (I'm not pcie knowledgeable.)
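To make the lowaddr guess above concrete, here is a minimal sketch of the inclusive reading. The helper names are hypothetical and are not kernel functions. The 4 KiB page size is an assumption taken from the 4096 alignment that sysctl reports. The rounding step is only inferred from the old 0x3c000000 limit showing up as 0x3c000fff; I have not taken it from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, matching the 4096 alignment shown by sysctl. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical helper: address of the last byte of the page holding addr. */
uint64_t
sketch_end_of_page(uint64_t addr)
{
	uint64_t mask = (uint64_t)SKETCH_PAGE_SIZE - 1;

	/* The mask trick only works for a power-of-2 page size. */
	assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);
	return ((addr & ~mask) + mask);
}

/*
 * Hypothetical helper: under the inclusive reading, lowaddr is the last
 * directly usable byte, so only addresses strictly above it need a bounce.
 */
int
sketch_needs_bounce(uint64_t paddr, uint64_t lowaddr)
{
	return (paddr > lowaddr);
}
```

With these, sketch_end_of_page(0x3c000000) gives 0x3c000fff, matching the old zone2 figure, while 0xc0000000-1 is already the last byte of its page and stays 0xbfffffff. Under this reading the page at 0xbffff000 is used directly and the page at 0xc0000000 is the first one bounced.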
# svnlite diff /usr/src/sys/arm/broadcom/
Index: /usr/src/sys/arm/broadcom/bcm2835/bcm2838_pci.c
===================================================================
--- /usr/src/sys/arm/broadcom/bcm2835/bcm2838_pci.c	(revision 365932)
+++ /usr/src/sys/arm/broadcom/bcm2835/bcm2838_pci.c	(working copy)
@@ -91,27 +91,22 @@
 #define REG_EP_CONFIG_CHOICE		0x9000
 #define REG_EP_CONFIG_DATA		0x8000
 
+#define REG_VALUE_4GB_WINDOW		0x11
+#define REG_VALUE_4GB_CONFIG		0x88003000
+
 /*
  * The system memory controller can address up to 16 GiB of physical memory
  * (although at time of writing the largest memory size available for purchase
- * is 8 GiB). However, the system DMA controller is capable of accessing only a
- * limited portion of the address space. Worse, the PCI-e controller has further
- * constraints for DMA, and those limitations are not wholly clear to the
- * author. NetBSD and Linux allow DMA on the lower 3 GiB of the physical memory,
- * but experimentation shows DMA performed above 960 MiB results in data
- * corruption with this driver. The limit of 960 MiB is taken from OpenBSD, but
+ * is 8 GiB). However, the system DMA controller in early enough boards is
+ * capable of accessing only a limited portion of the address space (3 GiByte).
+ * Worse, the PCI-e controller has further constraints for DMA, and those
+ * limitations are not wholly clear to the author. NetBSD and Linux allow
+ * DMA on the lower 3 GiB of the physical memory. OpenBSD used 960 MiByte but
  * apparently that value was chosen for satisfying a constraint of an unrelated
  * peripheral.
- *
- * Whatever the true maximum address, 960 MiB works.
  */
-#define DMA_HIGH_LIMIT			0x3c000000
-#define MAX_MEMORY_LOG2			0x21
-#define REG_VALUE_DMA_WINDOW_LOW	(MAX_MEMORY_LOG2 - 0xf)
+#define DMA_HIGH_LIMIT			((bus_addr_t)0xc0000000u-1)
 #define REG_VALUE_DMA_WINDOW_HIGH	0x0
-#define DMA_WINDOW_ENABLE		0x3000
-#define REG_VALUE_DMA_WINDOW_CONFIG	\
-	(((MAX_MEMORY_LOG2 - 0xf) << 0x1b) | DMA_WINDOW_ENABLE)
 
 #define REG_VALUE_MSI_CONFIG	0xffe06540
 
@@ -645,9 +640,9 @@
 	    DMA_HIGH_LIMIT,			/* lowaddr */
 	    BUS_SPACE_MAXADDR,			/* highaddr */
 	    NULL, NULL,				/* filter, filterarg */
-	    DMA_HIGH_LIMIT,			/* maxsize */
+	    BUS_SPACE_MAXSIZE,			/* maxsize */
 	    BUS_SPACE_UNRESTRICTED,		/* nsegments */
-	    DMA_HIGH_LIMIT,			/* maxsegsize */
+	    BUS_SPACE_MAXSIZE,			/* maxsegsize */
 	    0,					/* flags */
 	    NULL, NULL,				/* lockfunc, lockarg */
 	    &sc->dmat);
@@ -674,9 +669,9 @@
 	 * Set PCI->CPU memory window. This encodes the inbound window showing
 	 * the system memory to the controller.
 	 */
-	bcm_pcib_set_reg(sc, REG_DMA_WINDOW_LOW, REG_VALUE_DMA_WINDOW_LOW);
+	bcm_pcib_set_reg(sc, REG_DMA_WINDOW_LOW, REG_VALUE_4GB_WINDOW);
 	bcm_pcib_set_reg(sc, REG_DMA_WINDOW_HIGH, REG_VALUE_DMA_WINDOW_HIGH);
-	bcm_pcib_set_reg(sc, REG_DMA_CONFIG, REG_VALUE_DMA_WINDOW_CONFIG);
+	bcm_pcib_set_reg(sc, REG_DMA_CONFIG, REG_VALUE_4GB_CONFIG);
 
 	bcm_pcib_set_reg(sc, REG_BRIDGE_GISB_WINDOW, 0);
 	bcm_pcib_set_reg(sc, REG_DMA_WINDOW_1, 0);
Index: /usr/src/sys/arm/broadcom/bcm2835/bcm2838_xhci.c
===================================================================
--- /usr/src/sys/arm/broadcom/bcm2835/bcm2838_xhci.c	(revision 365932)
+++ /usr/src/sys/arm/broadcom/bcm2835/bcm2838_xhci.c	(working copy)
@@ -189,15 +189,7 @@
 	bcm_xhci_install_xhci_firmware(dev);
 
 	error = xhci_pci_attach(dev);
-	if (error)
-		return (error);
-
-	/* 32 bit DMA is a limitation of the PCI-e controller, not the VL805. */
-	sc->sc_bus.dma_bits = 32;
-	if (bootverbose)
-		device_printf(dev, "note: switched to 32-bit DMA.\n");
-
-	return (0);
+	return (error);
 }
 
 /*

I've concluded from what I've seen in the code that lowaddr
should be based on the pcie properties and should not worry
about the maxsize and maxsegsize figures being possibly
smaller: that is a dma engine use worry, not a pci one. (Not
that I could get that from the documentation that I quoted in
the review.) Thus I put back the 2 BUS_SPACE_MAXSIZE uses.

After the first huge-file duplicate-then-diff test, sysctl
reported lots of bounced transfers:

# sysctl hw.busdma
hw.busdma.zone1.alignment: 4096
hw.busdma.zone1.lowaddr: 0x3fffffff
hw.busdma.zone1.total_deferred: 0
hw.busdma.zone1.total_bounced: 755770
hw.busdma.zone1.active_bpages: 0
hw.busdma.zone1.reserved_bpages: 0
hw.busdma.zone1.free_bpages: 838
hw.busdma.zone1.total_bpages: 838
hw.busdma.zone0.alignment: 4096
hw.busdma.zone0.lowaddr: 0xffffffff
hw.busdma.zone0.total_deferred: 0
hw.busdma.zone0.total_bounced: 0
hw.busdma.zone0.active_bpages: 256
hw.busdma.zone0.reserved_bpages: 0
hw.busdma.zone0.free_bpages: 257
hw.busdma.zone0.total_bpages: 513
hw.busdma.total_bpages: 1351

For the non-power-of-2 boundary (0xc0000000-1), it appears
to use the next smaller power of 2 for the boundary
(0x40000000-1), without my having to explicitly code both
types of values specially for the RPi4B. (Of course, that
also leaves 2 GiBytes unused for direct DMA, memory that
could potentially have avoided more bouncing.)

I'll note that, prior to the change, an example first test
ended with:

hw.busdma.zone2.total_bounced: 1091942

and 174 in zone 1. So the bounce count has decreased.

I'll note that "total_bounced" need not be a page count:
it is incremented by 1 after the loop for a bounce, not
inside the loop. Lots of pages of data were bounced.
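To illustrate the total_bounced remark above, here is a simplified model of a counter that is bumped once per load rather than once per page. It is a sketch of the behavior as I described it, not the kernel's code: struct sketch_zone, sketch_load and the 4 KiB page size are all hypothetical names and assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, matching the 4096 alignment shown by sysctl. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for one bounce zone's state. */
struct sketch_zone {
	uint64_t lowaddr;	/* last byte usable without bouncing */
	uint64_t total_bounced;	/* bumped once per load that bounced */
};

/*
 * Model one load of [paddr, paddr + len). Returns how many pages had to
 * be bounced. The counter goes up by 1 after the loop if any page was
 * bounced, so it counts loads, not pages.
 */
uint64_t
sketch_load(struct sketch_zone *z, uint64_t paddr, uint64_t len)
{
	uint64_t mask = (uint64_t)SKETCH_PAGE_SIZE - 1;
	uint64_t end = paddr + len;
	uint64_t pages = 0;
	uint64_t cur;

	assert(len > 0);
	for (cur = paddr & ~mask; cur < end; cur += SKETCH_PAGE_SIZE) {
		if (cur > z->lowaddr)
			pages++;
	}
	if (pages > 0)
		z->total_bounced++;
	return (pages);
}
```

In this model a single 16-page load entirely above 0x3fffffff bounces 16 pages but adds only 1 to total_bounced, which is why the sysctl figure understates the amount of data copied.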
For reference (the test as of a gpu_mem_1024=32 context):

Physical memory chunk(s):
  0x00000000002000 - 0x00000007ef0fff,  133099520 bytes (32495 pages)
  0x00000007f0f000 - 0x00000034bfffff,  751767552 bytes (183537 pages)
  0x00000036052000 - 0x0000003cb2efff,  112054272 bytes (27357 pages)
  0x0000003cb36000 - 0x0000003cb36fff,       4096 bytes (1 pages)
  0x0000003cb38000 - 0x0000003cb39fff,       8192 bytes (2 pages)
  0x0000003cb3b000 - 0x0000003cb3cfff,       8192 bytes (2 pages)
  0x0000003cb40000 - 0x0000003cb40fff,       4096 bytes (1 pages)
  0x0000003cb42000 - 0x0000003cb43fff,       8192 bytes (2 pages)
  0x0000003cb45000 - 0x0000003df4ffff,   21016576 bytes (5131 pages)
  0x0000003df60000 - 0x0000003dffffff,     655360 bytes (160 pages)
  0x00000040000000 - 0x000000fbffffff, 3154116608 bytes (770048 pages)
  0x00000100000000 - 0x000001f372afff, 4084379648 bytes (997163 pages)

FYI, before the huge-file duplicate-and-diff test:

# sysctl hw.busdma
hw.busdma.zone1.alignment: 4096
hw.busdma.zone1.lowaddr: 0x3fffffff
hw.busdma.zone1.total_deferred: 0
hw.busdma.zone1.total_bounced: 866
hw.busdma.zone1.active_bpages: 2
hw.busdma.zone1.reserved_bpages: 0
hw.busdma.zone1.free_bpages: 836
hw.busdma.zone1.total_bpages: 838
hw.busdma.zone0.alignment: 4096
hw.busdma.zone0.lowaddr: 0xffffffff
hw.busdma.zone0.total_deferred: 0
hw.busdma.zone0.total_bounced: 0
hw.busdma.zone0.active_bpages: 256
hw.busdma.zone0.reserved_bpages: 0
hw.busdma.zone0.free_bpages: 257
hw.busdma.zone0.total_bpages: 513
hw.busdma.total_bpages: 1351

After the duplicate but before the diff:

# sysctl hw.busdma
hw.busdma.zone1.alignment: 4096
hw.busdma.zone1.lowaddr: 0x3fffffff
hw.busdma.zone1.total_deferred: 0
hw.busdma.zone1.total_bounced: 513604
hw.busdma.zone1.active_bpages: 8
hw.busdma.zone1.reserved_bpages: 0
hw.busdma.zone1.free_bpages: 830
hw.busdma.zone1.total_bpages: 838
hw.busdma.zone0.alignment: 4096
hw.busdma.zone0.lowaddr: 0xffffffff
hw.busdma.zone0.total_deferred: 0
hw.busdma.zone0.total_bounced: 0
hw.busdma.zone0.active_bpages: 256
hw.busdma.zone0.reserved_bpages: 0
hw.busdma.zone0.free_bpages: 257
hw.busdma.zone0.total_bpages: 513
hw.busdma.total_bpages: 1351

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)