Date: Tue, 4 Apr 2017 17:32:01 +0100
From: Andrew Turner <andrew@fubar.geek.nz>
To: Zbigniew Bodek <zbb@semihalf.com>
Cc: Marcin Wojtas <mw@semihalf.com>, Adrian Chadd <adrian.chadd@gmail.com>, "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>
Subject: Re: Coherent bus_dma for ARMv7
Message-ID: <67707E59-C2F6-4037-80A5-D9EFE7D5F52D@fubar.geek.nz>
In-Reply-To: <CAG7dG%2BxvNSwhxsy3%2BZrj8VEwP_mfdc2VZ2qppBS6ifpBd3dOqg@mail.gmail.com>
References: <CAPv3WKejupH4JG1=_XV6PknnKifxbF0qoVubtMRVtZWEoBZ7xg@mail.gmail.com>
 <CAPv3WKcG_Y=5zPk-2vGbQfCpiNcWvuUymY8EHRnWsM2FUzcG4Q@mail.gmail.com>
 <CAJ-Vmo=UC9K3e0TJGU86JZ6npzRevjpC3UEwgWrrh9mjCELKMg@mail.gmail.com>
 <CAPv3WKez8ZP=xONvPWGiyL5pQDKO4wGF_qQn=eT1S_LzrGJ74g@mail.gmail.com>
 <0EA39E6B-3460-45B9-8247-CB6CC8631C5F@fubar.geek.nz>
 <CAG7dG%2BxhB5J-sHSCg9-03776PO=SHDTuESpfy76GbeNOc8mziw@mail.gmail.com>
 <5586A20F-125D-41EC-9741-BBFFDB0A7A38@fubar.geek.nz>
 <CAG7dG%2BxvNSwhxsy3%2BZrj8VEwP_mfdc2VZ2qppBS6ifpBd3dOqg@mail.gmail.com>

> On 3 Apr 2017, at 15:58, Zbigniew Bodek <zbb@semihalf.com> wrote:
>
> 2017-04-03 16:37 GMT+02:00 Andrew Turner <andrew@fubar.geek.nz>:
>
> > On 3 Apr 2017, at 15:14, Zbigniew Bodek <zbb@semihalf.com> wrote:
> >
> > 2017-04-03 15:37 GMT+02:00 Andrew Turner <andrew@fubar.geek.nz>:
> >
> > > On 3 Apr 2017, at 14:16, Marcin Wojtas <mw@semihalf.com> wrote:
> > >
> > > Hi Adrian,
> > >
> > > Frankly we are not such experts in armv6 bus_dma, which looks more
> > > complicated than one in arm64, so we thought it's much safer not to mix
> > > the two solutions and leave for the user a single switch to decide,
> > > which one to pick. Afaik Andrew Turner did the opposite for arm64
> > > (implement not coherent solution on top of coherent bus_dma), however
> > > I'm not sure if it's possible here in an easy way - there's also
> > > pretty significant risk of regression for all platforms. Please let me
> > > know your opinion. Do you think some sort of update of armv6 is
> > > doable?
> >
> > I don’t see any reason to think it would be difficult to add support for coherent hardware to the existing armv6 busdma code. It’s mostly skipping cache operations based on a flag in the DMA tag.
> >
> > Andrew
> >
> > Hello Andrew,
> >
> > I don't think anyone uses flags related to DMA coherency in bus_dma_tag_create.
>
> The generic PCI and ThunderX PCIe PEM drivers do. The former based on the FDT dma-coherent flag.
>
> In this particular example this will work as almost all (not all) devices on ThunderX are PCIe devices. For most ARMv7-based SoCs this is not true. We would need to create a coherent DMA tag for the top level buses and ensure that this is propagated correctly down to the subordinate buses and devices.

You will already need to ensure the property is propagated to children, although DMA coherency is a property of the device, not the system, e.g. it is possible for only some devices to be coherent depending on how the vendor attached them to the internal bus.

> > Nevertheless, for coherent platforms we want bus_dma to always map DMA memory as normal WBWA regardless of the flags passed to create a bus_dma MAP.
> > For example, we don't want to perform any synchronization and we want to have the cacheable memory regardless of BUS_DMA_COHERENT flag used.
>
> That’s already the case on arm64, the only synchronisation used when the tag is created with BUS_DMA_COHERENT is a memory barrier.
>
> For PCI.

I have non-PCI devices with coherent DMA, I just haven’t had a chance to finish testing and upstreaming the patches.

> > Otherwise the performance improvement will apply only to those drivers that dare to use BUS_DMA_COHERENT flag and very few of them do that. In other words, what is the point of having coherent DMA if you do cache maintenance anyway?
>
> The drivers should be getting the parent DMA tag and passing this to bus_dma_tag_create. If this was created with BUS_DMA_COHERENT it will pass this to the child tag. This is how the above PCI drivers work.
>
> This basically makes sense to me if we do the same for all buses or once for every platform. The question is how much additional stuff is added to busdma_machdep-v6.c to make it work on all relevant platforms because it is quite different from the ARM64 implementation.

It should just be setting a flag then using it to always allocate cacheable memory and stop performing cache operations on a sync. It shouldn’t affect existing platforms as they won’t set the appropriate flag when creating the tag.

> Still we can go with the ARM64 approach and add new DMA handling, parallel to the existing one. Improve it over time to handle non-coherent DMA and replace the old one with the new one when it is proven to be correct for all.

The arm64 approach would be to handle BUS_DMA_COHERENT when creating a tag, and using this to decide how to correctly sync the DMA memory. The current code has been well tested on multiple different SoCs.

Andrew
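
To illustrate the idea discussed above (a coherent flag set at tag creation, inherited by child tags, and checked at sync time to skip cache operations), here is a minimal stand-alone sketch. It is not the FreeBSD busdma code: every name in it (`sketch_dma_tag`, `SKETCH_DMA_COHERENT`, `sketch_tag_create`, `sketch_dmamap_sync`, `sketch_sync_stats`) is hypothetical, and the cache maintenance and memory barrier are only counted, not performed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag, standing in for a coherency flag on a DMA tag. */
#define SKETCH_DMA_COHERENT 0x1

/* Hypothetical, heavily simplified DMA tag: just a parent link and flags. */
struct sketch_dma_tag {
	const struct sketch_dma_tag *parent;
	int flags;
};

/* Counters used in place of real cache maintenance and barriers. */
struct sketch_sync_stats {
	int cache_ops;
	int barriers;
};

/*
 * Initialise a tag. If the parent tag is coherent, the child inherits
 * the flag even when the caller did not ask for it explicitly.
 */
static void
sketch_tag_create(struct sketch_dma_tag *tag,
    const struct sketch_dma_tag *parent, int flags)
{
	assert(tag != NULL);
	tag->parent = parent;
	tag->flags = flags;
	if (parent != NULL)
		tag->flags |= parent->flags & SKETCH_DMA_COHERENT;
}

/*
 * Sync a mapping. A non-coherent tag needs cache maintenance; a coherent
 * tag skips it. Both still record a memory barrier.
 */
static void
sketch_dmamap_sync(const struct sketch_dma_tag *tag,
    struct sketch_sync_stats *stats)
{
	assert(tag != NULL && stats != NULL);
	if ((tag->flags & SKETCH_DMA_COHERENT) == 0)
		stats->cache_ops++;	/* placeholder for cache writeback/invalidate */
	stats->barriers++;		/* placeholder for a memory barrier */
}
```

In this sketch a device tag created under a coherent bus tag never increments `cache_ops`, while a tag created without the flag behaves as before, which mirrors the point that existing platforms would be unaffected unless they set the flag.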