Date: Tue, 30 Apr 2019 18:54:40 +0200 From: Niclas Zeising <zeising@freebsd.org> To: Andrey Fesenko <f0andrey@gmail.com>, Tycho Nightingale <tychon@freebsd.org> Cc: "freebsd-x11@freebsd.org" <freebsd-x11@freebsd.org>, Johannes Lundberg <johalun@freebsd.org> Subject: Re: dmar, dma_pool, etc Message-ID: <a38cb18a-4cc0-7fb9-f5fb-03e20aad2b94@freebsd.org> In-Reply-To: <4e89ea00-7439-b0ea-3614-ee344d3fe074@freebsd.org> References: <e0415524-3126-5ea2-c2e2-3d3dccc6832e@FreeBSD.org> <9E2356CF-6483-4C06-B4A8-0120088063FE@freebsd.org> <60b447bb-81da-4c01-e164-bdf10e5560b0@freebsd.org> <594E1E71-6834-431E-B122-005E64EDB1C2@freebsd.org> <3a07ffef-a978-2fdd-8d54-85fc0b6f3a63@freebsd.org> <23fe1183-d12c-b4b8-958f-34cee6e33977@freebsd.org> <9E61210C-4939-4D3A-8110-72023B67BBE6@freebsd.org> <C5E9EE94-00F7-4F9D-AFAC-48609F635C8D@freebsd.org> <CA%2BK5SrOcqKSvyuQaC5zfT-Wum7%2B%2B_XsFQwWSmwDTVH1qu7E2kg@mail.gmail.com> <4e89ea00-7439-b0ea-3614-ee344d3fe074@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On 2019-04-30 14:54, Niclas Zeising wrote: > On 2019-04-30 14:25, Andrey Fesenko wrote: >> .On Tue, Apr 30, 2019 at 2:16 AM Tycho Nightingale=20 >> <tychon@freebsd.org> wrote: >>> >>> >>> Hi, >>> >>>> On Apr 29, 2019, at 4:24 PM, Tycho Nightingale <tychon@freebsd.org>=20 >>>> wrote: >>>> >>>> >>>>> On Apr 29, 2019, at 2:34 PM, Niclas Zeising <zeising@freebsd.org>=20 >>>>> wrote: >>>>> >>>>> On 2019-04-29 20:27, Niclas Zeising wrote: >>>>>> On 2019-04-29 18:00, Tycho Nightingale wrote: >>>>>>>> On Apr 29, 2019, at 11:06 AM, Niclas Zeising=20 >>>>>>>> <zeising@FreeBSD.org> wrote: >>>>>>>> >>>>>>>> As a side note, I can readily reproduce the hang on a spare=20 >>>>>>>> laptop, please let me know if I can help in testing or=20 >>>>>>>> diagnosing in any way. >>>>>>> >>>>>>> >>>>>>> If you can readily reproduce the hang, since there are 2 halves=20 >>>>>>> that comprised the fix (the DMA pool and non-pool mappings) it=20 >>>>>>> would be instructive to try reverting either dmapool.h or=20 >>>>>>> dma-mapping.h independently to see if that helps. >>>>>>> >>>>>> Hi! >>>>>> I will test this and report back.=C2=A0 Thank you! >>>>>> Regards >>>>> >>>>> Hi! >>>>> Just reverting dmapool.h or dma-mapping.h doesn't work, it won't=20 >>>>> build. I need to revert more than that.=C2=A0 Can you help me figur= e out=20 >>>>> what to revert in either case, or provide a patch? >>>> >>>> Thanks for trying.=C2=A0 I managed to setup my HW and reproduce this= and=20 >>>> I see your point that it doesn=E2=80=99t cleave in half perfectly. >>>> >>>> Since I=E2=80=99ve got a (non)-working test case, let me investigate= a bit=20 >>>> further. >>> >>> I believe I=E2=80=99ve figured this out.=C2=A0 At least in my case I=E2= =80=99ve been able=20 >>> to eliminate the hangs with the patch included at the bottom of this=20 >>> email. >>> >>> The issue stems from the implementation of dma_map_sg().=C2=A0 Accord= ing=20 >>> to the linux DMA documentation[1], entries of the scatter/gather list= =20 >>> may be coalesced: >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 int >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 dma_map_sg(struct de= vice *dev, struct scatterlist *sg, int=20 >>> nents, enum dma_data_direction direction) >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Returns: the number = of DMA address segments mapped (this may=20 >>> be shorter >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 than <nents> passed = in if some elements of the=20 >>> scatter/gather list are >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 physically or virtua= lly adjacent and an IOMMU maps them with=20 >>> a single >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 entry). >>> >>> My implementation of dma_map_sg() does just that.=C2=A0 As it turns o= ut=20 >>> there are several consumers of dma_map_sg(), e.g.=20 >>> i915_gem_map_dma_buf() and i915_gem_gtt_prepare_pages(), and=20 >>> mock_map_dma_buf() among others that aren=E2=80=99t compliant with th= is=20 >>> documented API.=C2=A0 Going back to the non-coalesced version is like= ly=20 >>> the only path forward as bugs in the callee redefine this API de fact= o. >>> >>> If this addresses the hangs you are seeing, I will post this on=20 >>> Phabricator. >>> >>> Tycho >>> >>> [1] https://www.kernel.org/doc/Documentation/DMA-API.txt >>> >>> Index: sys/compat/linuxkpi/common/src/linux_pci.c >>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>> --- sys/compat/linuxkpi/common/src/linux_pci.c=C2=A0 (revision 346687= ) >>> +++ sys/compat/linuxkpi/common/src/linux_pci.c=C2=A0 (working copy) >>> @@ -565,10 +565,8 @@ >>> =C2=A0 { >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 struct linux_dma_pri= v *priv; >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 struct linux_dma_obj= *obj; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 struct scatterlist *dma_sg, *sg= ; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 int dma_nents, error, nseg; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 size_t seg_len; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 vm_paddr_t seg_phys, prev_phys_= end; >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 struct scatterlist *sg; >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 int error, i, nseg; >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 bus_dma_segment_t se= g; >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 priv =3D dev->dma_pr= iv; >>> @@ -580,25 +578,11 @@ >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 return (0); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >>> >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 sg =3D sgl; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 dma_sg =3D sg; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 dma_nents =3D 0; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 while (nents > 0) { >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 seg_phys =3D sg_phys(sg); >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 seg_len =3D sg->length; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 while (--nents > 0) { >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 prev_phys= _end =3D sg_phys(sg) + sg->length; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 sg =3D sg= _next(sg); >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (prev_= phys_end !=3D sg_phys(sg)) >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 break; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 seg_len += =3D sg->length; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 } >>> - >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 for_each_sg(sgl, sg, nents, i) = { >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 nseg =3D -1; >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 mtx_lock(&priv->dma_lock); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 if (_bus_dmamap_load_phys(priv->dmat, obj->dm= amap, >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 seg_phys, seg_len, BUS_DMA_NOWAIT= , >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 &seg, &nseg) !=3D 0) { >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 sg_phys(sg), sg->length, 0, &seg,= &nseg) !=3D 0) { >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= bus_dmamap_unload(priv->dmat, obj->dmamap); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= bus_dmamap_destroy(priv->dmat, obj->dmamap); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= mtx_unlock(&priv->dma_lock); >>> @@ -607,14 +591,9 @@ >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 } >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 mtx_unlock(&priv->dma_lock); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 KASSERT(++nseg =3D=3D 1, ("More than one segm= ent=20 >>> (nseg=3D%d)", nseg)); >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 sg_dma_address(sg) =3D seg.ds_addr; >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >>> >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 sg_dma_address(dma_sg) =3D seg.ds_addr; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 sg_dma_len(dma_sg) =3D seg.ds_len; >>> - >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 dma_sg =3D sg_next(dma_sg); >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 dma_nents++; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >>> - >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 obj->dma_addr =3D sg= _dma_address(sgl); >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mtx_lock(&priv->ptre= e_lock); >>> @@ -629,7 +608,7 @@ >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 return (0); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >>> >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return (dma_nents); >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return (nents); >>> =C2=A0 } >>> >>> =C2=A0 void >> >> Thanks, this patch make system more stable Farefox apparently work=20 >> infinitely. >> >> However, a heavy application (3D game in wine) still hangs after a >> while, once hung after about a minute, about another (after reboot) >> ten minutes later, video artifacts appear before the crash >> >=20 > Hi! > This used to work before the regression? > This means that this patch doesn't solve all the issues with the dmar=20 > changes.=C2=A0 Which GPU are you using? > Regards Hi! The patch has been updated and now lives as part of this review:=20 https://reviews.freebsd.org/D20097 Can you please test that version? It requires an updated source tree to=20 current from today. Thanks! Regards --=20 Niclas Zeising
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?a38cb18a-4cc0-7fb9-f5fb-03e20aad2b94>