Date: Mon, 29 Apr 2019 12:00:33 -0400
From: Tycho Nightingale <tychon@freebsd.org>
To: Niclas Zeising <zeising@FreeBSD.org>
Cc: Johannes Lundberg <johalun@FreeBSD.org>, "freebsd-x11@freebsd.org" <freebsd-x11@freebsd.org>
Subject: Re: dmar, dma_pool, etc
Message-ID: <594E1E71-6834-431E-B122-005E64EDB1C2@freebsd.org>
In-Reply-To: <60b447bb-81da-4c01-e164-bdf10e5560b0@freebsd.org>
References: <e0415524-3126-5ea2-c2e2-3d3dccc6832e@FreeBSD.org> <9E2356CF-6483-4C06-B4A8-0120088063FE@freebsd.org> <60b447bb-81da-4c01-e164-bdf10e5560b0@freebsd.org>
Hi,

> On Apr 29, 2019, at 11:06 AM, Niclas Zeising <zeising@FreeBSD.org> wrote:
>
> On 2019-04-29 16:41, Tycho Nightingale wrote:
>> Hi,
>>> On Apr 27, 2019, at 2:46 PM, Johannes Lundberg <johalun@FreeBSD.org> wrote:
>>>
>>> Hi
>>>
>>> Tycho, I'd like to understand what is the goal with the changes to
>>> linuxkpi and what you plan to use it for. LinuxKPI base is tightly
>>> connected to LinuxkPI GPLv2 and the DRM drivers in ports. We need to
>>> make sure that any change to base LinuxKPI is compatible with what we
>>> have in ports, or patch the ports to handle the changes in base.
>> Understood. The reports of the recent change to LinuxKPI having untoward effects are concerning. I’m trying to get more information to understand them and at the same time setup a machine in an attempt to reproduce them.
>> Stepping back a bit, my addition of bus_dma to LinuxKPI is to make the mlx4/mlx5 drivers work in an environment where the IOMMU is enabled. Before physical addresses and dma addresses were not differentiated making this impossible. The intention of my patch was to get these devices into compliance and perhaps also other devices; minimally when the IOMMU is disabled things should behave as they were before.
>
> How do I disable the IOMMU?

Removal of the DMAR ACPI table (this is what typical BIOS knob does) is the sure way to ensure the IOMMU is off, regardless of system knobs. In the BIOS ensure that ‘VT-d’ or ‘Directed I/O’ is disabled.

>>> For example, there is a CONFIG_INTEL_IOMMU options in Linux code that
>>> enables DMAR. It turns out, ttm has it's own dma_pool implementation.
>>> This is possible since in Linux dma_pool is private to dmapool.c.
>>> Enabling this option for us cause a compile error since dma_pool is
>>> public in base linuxkpi.
>>> I don't know if this is really a problem if
>>> CONFIG_INTEL_IOMMU (or CONFIG_SWIOTBL) are options that we'll never use…
>> Sounds like there is some redundancy there which can be eliminated. But with respect to your question of enabling the IOMMU outside of base; that’s more than what I intended to.
>>> Also, we do have a problem with Firefox causing GPU hangs so I'd
>>> appreciate it if Tycho could look through linuxkpi_gplv2, drm and
>>> i915kms (i915kms does not use ttm so no need to look there for problems
>>> with Intel GPU) to see if there are any places needing patching. I know
>>> there's one vtophys() call in fb_mmap() but IIRC, that is never used.
>>> I'll look into that next. There are also uses of PHYS_TO_DMAP() and
>>> VM_PAGE_TO_PHYS(). Would any of these need patching?
>>>
>>> Use the default branch at https://github.com/FreeBSDDesktop/kms-drm
>> I’m planning to have a look, but to get things working as before nothing further should need patching even if the physical addresses are treated as dma addresses.
>> For the GPU it’s important to note that enabling the IOMMU didn’t work before, was not a goal of this error, and would be expected to not work right now. However, it’s possible this change revealed some cases where some DMAR functionality was enabled in the BIOS and since the support is incomplete breakage is happening. Or something else that I am overlooking right now.
>
> Hi!
> I understand that enabling the IOMMU in the drm drivers is not your priority, which is OK.
> Since this breaks graphics drivers, however, is it possible to revert the change (and related changes if any) until we figure out what's going on? There are numerous reports of hard lockups or GPU hangs with this change, and it feels like a resolution of the issue is some time off still.
>
> As a side note, I can readily reproduce the hang on a spare laptop, please let me know if I can help in testing or diagnosing in any way.

If you can readily reproduce the hang, since there are 2 halves that comprised the fix (the DMA pool and non-pool mappings) it would be instructive to try reverting either dmapool.h or dma-mapping.h independently to see if that helps.

Tycho