Date: Tue, 30 Apr 2019 14:54:37 +0200
From: Niclas Zeising <zeising@freebsd.org>
To: Andrey Fesenko <f0andrey@gmail.com>, Tycho Nightingale <tychon@freebsd.org>
Cc: "freebsd-x11@freebsd.org" <freebsd-x11@freebsd.org>, Johannes Lundberg <johalun@freebsd.org>
Subject: Re: dmar, dma_pool, etc
Message-ID: <4e89ea00-7439-b0ea-3614-ee344d3fe074@freebsd.org>
In-Reply-To: <CA+K5SrOcqKSvyuQaC5zfT-Wum7++_XsFQwWSmwDTVH1qu7E2kg@mail.gmail.com>
References: <e0415524-3126-5ea2-c2e2-3d3dccc6832e@FreeBSD.org>
 <9E2356CF-6483-4C06-B4A8-0120088063FE@freebsd.org>
 <60b447bb-81da-4c01-e164-bdf10e5560b0@freebsd.org>
 <594E1E71-6834-431E-B122-005E64EDB1C2@freebsd.org>
 <3a07ffef-a978-2fdd-8d54-85fc0b6f3a63@freebsd.org>
 <23fe1183-d12c-b4b8-958f-34cee6e33977@freebsd.org>
 <9E61210C-4939-4D3A-8110-72023B67BBE6@freebsd.org>
 <C5E9EE94-00F7-4F9D-AFAC-48609F635C8D@freebsd.org>
 <CA+K5SrOcqKSvyuQaC5zfT-Wum7++_XsFQwWSmwDTVH1qu7E2kg@mail.gmail.com>

On 2019-04-30 14:25, Andrey Fesenko wrote:
> On Tue, Apr 30, 2019 at 2:16 AM Tycho Nightingale <tychon@freebsd.org> wrote:
>>
>>
>> Hi,
>>
>>> On Apr 29, 2019, at 4:24 PM, Tycho Nightingale <tychon@freebsd.org> wrote:
>>>
>>>
>>>> On Apr 29, 2019, at 2:34 PM, Niclas Zeising <zeising@freebsd.org> wrote:
>>>>
>>>> On 2019-04-29 20:27, Niclas Zeising wrote:
>>>>> On 2019-04-29 18:00, Tycho Nightingale wrote:
>>>>>>> On Apr 29, 2019, at 11:06 AM, Niclas Zeising <zeising@FreeBSD.org> wrote:
>>>>>>>
>>>>>>> As a side note, I can readily reproduce the hang on a spare
>>>>>>> laptop, please let me know if I can help in testing or
>>>>>>> diagnosing in any way.
>>>>>>
>>>>>>
>>>>>> If you can readily reproduce the hang, since there are 2 halves
>>>>>> that comprised the fix (the DMA pool and non-pool mappings) it
>>>>>> would be instructive to try reverting either dmapool.h or
>>>>>> dma-mapping.h independently to see if that helps.
>>>>>>
>>>>> Hi!
>>>>> I will test this and report back. Thank you!
>>>>> Regards
>>>>
>>>> Hi!
>>>> Just reverting dmapool.h or dma-mapping.h doesn't work, it won't
>>>> build. I need to revert more than that. Can you help me figure out
>>>> what to revert in either case, or provide a patch?
>>>
>>> Thanks for trying. I managed to set up my HW and reproduce this and I
>>> see your point that it doesn’t cleave in half perfectly.
>>>
>>> Since I’ve got a (non)-working test case, let me investigate a bit
>>> further.
>>
>> I believe I’ve figured this out. At least in my case I’ve been able to
>> eliminate the hangs with the patch included at the bottom of this email.
>>
>> The issue stems from the implementation of dma_map_sg(). According to
>> the Linux DMA documentation[1], entries of the scatter/gather list may
>> be coalesced:
>>
>>     int
>>     dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction direction)
>>
>>     Returns: the number of DMA address segments mapped (this may be shorter
>>     than <nents> passed in if some elements of the scatter/gather list are
>>     physically or virtually adjacent and an IOMMU maps them with a single
>>     entry).
>>
>> My implementation of dma_map_sg() does just that. As it turns out,
>> there are several consumers of dma_map_sg(), e.g.
>> i915_gem_map_dma_buf(), i915_gem_gtt_prepare_pages(), and
>> mock_map_dma_buf(), among others, that aren’t compliant with this
>> documented API. Going back to the non-coalesced version is likely the
>> only path forward, as bugs in the callers redefine this API de facto.
>>
>> If this addresses the hangs you are seeing, I will post this on
>> Phabricator.
>>
>> Tycho
>>
>> [1] https://www.kernel.org/doc/Documentation/DMA-API.txt
>>
>> Index: sys/compat/linuxkpi/common/src/linux_pci.c
>> ===================================================================
>> --- sys/compat/linuxkpi/common/src/linux_pci.c	(revision 346687)
>> +++ sys/compat/linuxkpi/common/src/linux_pci.c	(working copy)
>> @@ -565,10 +565,8 @@
>>  {
>>  	struct linux_dma_priv *priv;
>>  	struct linux_dma_obj *obj;
>> -	struct scatterlist *dma_sg, *sg;
>> -	int dma_nents, error, nseg;
>> -	size_t seg_len;
>> -	vm_paddr_t seg_phys, prev_phys_end;
>> +	struct scatterlist *sg;
>> +	int error, i, nseg;
>>  	bus_dma_segment_t seg;
>>
>>  	priv = dev->dma_priv;
>> @@ -580,25 +578,11 @@
>>  		return (0);
>>  	}
>>
>> -	sg = sgl;
>> -	dma_sg = sg;
>> -	dma_nents = 0;
>> -	while (nents > 0) {
>> -		seg_phys = sg_phys(sg);
>> -		seg_len = sg->length;
>> -		while (--nents > 0) {
>> -			prev_phys_end = sg_phys(sg) + sg->length;
>> -			sg = sg_next(sg);
>> -			if (prev_phys_end != sg_phys(sg))
>> -				break;
>> -			seg_len += sg->length;
>> -		}
>> -
>> +	for_each_sg(sgl, sg, nents, i) {
>>  		nseg = -1;
>>  		mtx_lock(&priv->dma_lock);
>>  		if (_bus_dmamap_load_phys(priv->dmat, obj->dmamap,
>> -		    seg_phys, seg_len, BUS_DMA_NOWAIT,
>> -		    &seg, &nseg) != 0) {
>> +		    sg_phys(sg), sg->length, 0, &seg, &nseg) != 0) {
>>  			bus_dmamap_unload(priv->dmat, obj->dmamap);
>>  			bus_dmamap_destroy(priv->dmat, obj->dmamap);
>>  			mtx_unlock(&priv->dma_lock);
>> @@ -607,14 +591,9 @@
>>  		}
>>  		mtx_unlock(&priv->dma_lock);
>>  		KASSERT(++nseg == 1, ("More than one segment (nseg=%d)", nseg));
>> +		sg_dma_address(sg) = seg.ds_addr;
>> +	}
>>
>> -		sg_dma_address(dma_sg) = seg.ds_addr;
>> -		sg_dma_len(dma_sg) = seg.ds_len;
>> -
>> -		dma_sg = sg_next(dma_sg);
>> -		dma_nents++;
>> -	}
>> -
>>  	obj->dma_addr = sg_dma_address(sgl);
>>
>>  	mtx_lock(&priv->ptree_lock);
>> @@ -629,7 +608,7 @@
>>  		return (0);
>>  	}
>>
>> -	return (dma_nents);
>> +	return (nents);
>>  }
>>
>>  void
>
> Thanks, this patch makes the system more stable; Firefox apparently
> works indefinitely.
>
> However, a heavy application (a 3D game in Wine) still hangs after a
> while: once it hung after about a minute, and another time (after a
> reboot) after about ten minutes. Video artifacts appear before the
> crash.
>

Hi!
Did this work before the regression? If so, this patch doesn't solve
all the issues with the dmar changes.
Which GPU are you using?
Regards
-- 
Niclas Zeising
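
For readers following the dma_map_sg() discussion quoted above, here is a minimal user-space sketch in C of the coalescing contract from the Linux DMA documentation. It is not the linuxkpi code: struct toy_sg, toy_map_sg() and toy_total_mapped() are hypothetical names made up for illustration, and the model assumes an identity mapping (DMA address equals physical address) to stay small. It only shows why a consumer has to walk the returned segment count, and read the DMA length rather than the CPU-side length, when the mapper is allowed to merge adjacent entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct scatterlist; not the kernel type. */
struct toy_sg {
	uint64_t phys;		/* CPU physical address of the chunk */
	size_t	 length;	/* CPU-side length of the chunk */
	uint64_t dma_addr;	/* filled in by toy_map_sg() */
	size_t	 dma_len;	/* filled in by toy_map_sg() */
};

/*
 * Toy model of a coalescing dma_map_sg(): physically adjacent entries are
 * merged, the merged results are written into the leading entries of the
 * list, and the number of merged segments is returned.  Assumption: the
 * DMA address is simply the physical address (no real IOMMU involved).
 */
static int
toy_map_sg(struct toy_sg *sgl, int nents)
{
	int in = 0, out = 0;

	assert(sgl != NULL && nents >= 0);
	while (in < nents) {
		uint64_t start = sgl[in].phys;
		size_t len = sgl[in].length;

		in++;
		/* Absorb every following entry that starts where we end. */
		while (in < nents && sgl[in].phys == start + len) {
			len += sgl[in].length;
			in++;
		}
		/* Only the dma_* fields are written; phys/length stay intact. */
		sgl[out].dma_addr = start;
		sgl[out].dma_len = len;
		out++;
	}
	return (out);
}

/* Compliant consumer: iterates over the returned count, reads dma_len. */
static size_t
toy_total_mapped(const struct toy_sg *sgl, int mapped)
{
	size_t total = 0;

	for (int i = 0; i < mapped; i++)
		total += sgl[i].dma_len;
	return (total);
}
```

In this model, a list with chunks at 0x1000, 0x2000 and 0x8000 (each 0x1000 bytes long) maps to two segments, because the first two chunks are adjacent. A consumer that keeps looping over the original three entries would read a third entry whose DMA fields were never filled in, which is the kind of non-compliant use described in the quoted message.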