From owner-freebsd-x11@freebsd.org Tue Apr 30 16:54:50 2019 Return-Path: Delivered-To: freebsd-x11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 00C7B1597D66 for ; Tue, 30 Apr 2019 16:54:50 +0000 (UTC) (envelope-from zeising@freebsd.org) Received: from mail.daemonic.se (mail.daemonic.se [176.58.89.161]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5D382835A2; Tue, 30 Apr 2019 16:54:49 +0000 (UTC) (envelope-from zeising@freebsd.org) Received: from cid.daemonic.se (localhost [IPv6:::1]) by mail.daemonic.se (Postfix) with ESMTP id 44tngk1CRszDjsy; Tue, 30 Apr 2019 16:54:42 +0000 (UTC) X-Virus-Scanned: amavisd-new at daemonic.se Received: from mail.daemonic.se ([IPv6:::1]) (using TLS with cipher ECDHE-RSA-AES128-GCM-SHA256) by cid.daemonic.se (mailscanner.daemonic.se [IPv6:::1]) (amavisd-new, port 10587) with ESMTPS id QQJUr5h76lWy; Tue, 30 Apr 2019 16:54:41 +0000 (UTC) Received: from garnet.daemonic.se (unknown [IPv6:2001:470:dca9:201:1142:67e7:d90:681b]) by mail.daemonic.se (Postfix) with ESMTPSA id 44tngh70xdzDjp8; Tue, 30 Apr 2019 16:54:40 +0000 (UTC) Subject: Re: dmar, dma_pool, etc From: Niclas Zeising To: Andrey Fesenko , Tycho Nightingale Cc: "freebsd-x11@freebsd.org" , Johannes Lundberg References: <9E2356CF-6483-4C06-B4A8-0120088063FE@freebsd.org> <60b447bb-81da-4c01-e164-bdf10e5560b0@freebsd.org> <594E1E71-6834-431E-B122-005E64EDB1C2@freebsd.org> <3a07ffef-a978-2fdd-8d54-85fc0b6f3a63@freebsd.org> <23fe1183-d12c-b4b8-958f-34cee6e33977@freebsd.org> <9E61210C-4939-4D3A-8110-72023B67BBE6@freebsd.org> <4e89ea00-7439-b0ea-3614-ee344d3fe074@freebsd.org> Message-ID: Date: Tue, 30 Apr 2019 18:54:40 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: <4e89ea00-7439-b0ea-3614-ee344d3fe074@freebsd.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 5D382835A2 X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [-2.97 / 15.00]; local_wl_from(0.00)[freebsd.org]; NEURAL_HAM_MEDIUM(-1.00)[-0.999,0]; NEURAL_HAM_SHORT(-0.97)[-0.970,0]; ASN(0.00)[asn:36236, ipnet:176.58.89.0/24, country:US]; NEURAL_HAM_LONG(-1.00)[-1.000,0] X-BeenThere: freebsd-x11@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: X11 on FreeBSD -- maintaining and support List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 30 Apr 2019 16:54:50 -0000 On 2019-04-30 14:54, Niclas Zeising wrote: > On 2019-04-30 14:25, Andrey Fesenko wrote: >> .On Tue, Apr 30, 2019 at 2:16 AM Tycho Nightingale=20 >> wrote: >>> >>> >>> Hi, >>> >>>> On Apr 29, 2019, at 4:24 PM, Tycho Nightingale =20 >>>> wrote: >>>> >>>> >>>>> On Apr 29, 2019, at 2:34 PM, Niclas Zeising =20 >>>>> wrote: >>>>> >>>>> On 2019-04-29 20:27, Niclas Zeising wrote: >>>>>> On 2019-04-29 18:00, Tycho Nightingale wrote: >>>>>>>> On Apr 29, 2019, at 11:06 AM, Niclas Zeising=20 >>>>>>>> wrote: >>>>>>>> >>>>>>>> As a side note, I can readily reproduce the hang on a spare=20 >>>>>>>> laptop, please let me know if I can help in testing or=20 >>>>>>>> diagnosing in any way. >>>>>>> >>>>>>> >>>>>>> If you can readily reproduce the hang, since there are 2 halves=20 >>>>>>> that comprised the fix (the DMA pool and non-pool mappings) it=20 >>>>>>> would be instructive to try reverting either dmapool.h or=20 >>>>>>> dma-mapping.h independently to see if that helps. >>>>>>> >>>>>> Hi! >>>>>> I will test this and report back.=C2=A0 Thank you! >>>>>> Regards >>>>> >>>>> Hi! >>>>> Just reverting dmapool.h or dma-mapping.h doesn't work, it won't=20 >>>>> build. I need to revert more than that.=C2=A0 Can you help me figur= e out=20 >>>>> what to revert in either case, or provide a patch? >>>> >>>> Thanks for trying.=C2=A0 I managed to setup my HW and reproduce this= and=20 >>>> I see your point that it doesn=E2=80=99t cleave in half perfectly. >>>> >>>> Since I=E2=80=99ve got a (non)-working test case, let me investigate= a bit=20 >>>> further. >>> >>> I believe I=E2=80=99ve figured this out.=C2=A0 At least in my case I=E2= =80=99ve been able=20 >>> to eliminate the hangs with the patch included at the bottom of this=20 >>> email. >>> >>> The issue stems from the implementation of dma_map_sg().=C2=A0 Accord= ing=20 >>> to the linux DMA documentation[1], entries of the scatter/gather list= =20 >>> may be coalesced: >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 int >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 dma_map_sg(struct de= vice *dev, struct scatterlist *sg, int=20 >>> nents, enum dma_data_direction direction) >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Returns: the number = of DMA address segments mapped (this may=20 >>> be shorter >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 than passed = in if some elements of the=20 >>> scatter/gather list are >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 physically or virtua= lly adjacent and an IOMMU maps them with=20 >>> a single >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 entry). >>> >>> My implementation of dma_map_sg() does just that.=C2=A0 As it turns o= ut=20 >>> there are several consumers of dma_map_sg(), e.g.=20 >>> i915_gem_map_dma_buf() and i915_gem_gtt_prepare_pages(), and=20 >>> mock_map_dma_buf() among others that aren=E2=80=99t compliant with th= is=20 >>> documented API.=C2=A0 Going back to the non-coalesced version is like= ly=20 >>> the only path forward as bugs in the callee redefine this API de fact= o. >>> >>> If this addresses the hangs you are seeing, I will post this on=20 >>> Phabricator. >>> >>> Tycho >>> >>> [1] https://www.kernel.org/doc/Documentation/DMA-API.txt >>> >>> Index: sys/compat/linuxkpi/common/src/linux_pci.c >>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>> --- sys/compat/linuxkpi/common/src/linux_pci.c=C2=A0 (revision 346687= ) >>> +++ sys/compat/linuxkpi/common/src/linux_pci.c=C2=A0 (working copy) >>> @@ -565,10 +565,8 @@ >>> =C2=A0 { >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 struct linux_dma_pri= v *priv; >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 struct linux_dma_obj= *obj; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 struct scatterlist *dma_sg, *sg= ; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 int dma_nents, error, nseg; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 size_t seg_len; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 vm_paddr_t seg_phys, prev_phys_= end; >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 struct scatterlist *sg; >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 int error, i, nseg; >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 bus_dma_segment_t se= g; >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 priv =3D dev->dma_pr= iv; >>> @@ -580,25 +578,11 @@ >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 return (0); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >>> >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 sg =3D sgl; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 dma_sg =3D sg; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 dma_nents =3D 0; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 while (nents > 0) { >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 seg_phys =3D sg_phys(sg); >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 seg_len =3D sg->length; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 while (--nents > 0) { >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 prev_phys= _end =3D sg_phys(sg) + sg->length; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 sg =3D sg= _next(sg); >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (prev_= phys_end !=3D sg_phys(sg)) >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 break; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 seg_len += =3D sg->length; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 } >>> - >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 for_each_sg(sgl, sg, nents, i) = { >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 nseg =3D -1; >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 mtx_lock(&priv->dma_lock); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 if (_bus_dmamap_load_phys(priv->dmat, obj->dm= amap, >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 seg_phys, seg_len, BUS_DMA_NOWAIT= , >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 &seg, &nseg) !=3D 0) { >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 sg_phys(sg), sg->length, 0, &seg,= &nseg) !=3D 0) { >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= bus_dmamap_unload(priv->dmat, obj->dmamap); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= bus_dmamap_destroy(priv->dmat, obj->dmamap); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= mtx_unlock(&priv->dma_lock); >>> @@ -607,14 +591,9 @@ >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 } >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 mtx_unlock(&priv->dma_lock); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 KASSERT(++nseg =3D=3D 1, ("More than one segm= ent=20 >>> (nseg=3D%d)", nseg)); >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 sg_dma_address(sg) =3D seg.ds_addr; >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >>> >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 sg_dma_address(dma_sg) =3D seg.ds_addr; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 sg_dma_len(dma_sg) =3D seg.ds_len; >>> - >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 dma_sg =3D sg_next(dma_sg); >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 dma_nents++; >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >>> - >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 obj->dma_addr =3D sg= _dma_address(sgl); >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mtx_lock(&priv->ptre= e_lock); >>> @@ -629,7 +608,7 @@ >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 return (0); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >>> >>> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return (dma_nents); >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return (nents); >>> =C2=A0 } >>> >>> =C2=A0 void >> >> Thanks, this patch make system more stable Farefox apparently work=20 >> infinitely. >> >> However, a heavy application (3D game in wine) still hangs after a >> while, once hung after about a minute, about another (after reboot) >> ten minutes later, video artifacts appear before the crash >> >=20 > Hi! > This used to work before the regression? > This means that this patch doesn't solve all the issues with the dmar=20 > changes.=C2=A0 Which GPU are you using? > Regards Hi! The patch has been updated and now lives as part of this review:=20 https://reviews.freebsd.org/D20097 Can you please test that version? It requires an updated source tree to=20 current from today. Thanks! Regards --=20 Niclas Zeising