Date: Mon, 1 May 2023 20:41:32 +0200 From: Dimitry Andric <dim@FreeBSD.org> To: John Baldwin <jhb@freebsd.org> Cc: "Jason A. Harmening" <jah@freebsd.org>, Konstantin Belousov <kostikbel@gmail.com>, src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-branches@freebsd.org Subject: Re: git: 060699e91369 - stable/13 - Merge llvm-project release/15.x llvmorg-15.0.7-0-g8dfdcc7b7bf6 Message-ID: <E0E68380-908D-4939-B009-844BA005DB64@FreeBSD.org> In-Reply-To: <df22667e-831f-deaa-3cb0-7b9797caa833@FreeBSD.org> References: <202304092135.339LZMeJ081640@gitrepo.freebsd.org> <ZE1jEcA7iL3QrbOP@corona> <76DD2CB9-986B-4349-8F46-3B7BF63EB315@FreeBSD.org> <ZE1vtvghEV_RsD2b@corona> <ZE33_AWw0DO8-EjM@kib.kiev.ua> <ZE7-AcTXtCkSzsHb@corona> <ZE8JJQizER2WrTr2@corona> <ZE8yqfOkrrMddua1@corona> <df22667e-831f-deaa-3cb0-7b9797caa833@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
--Apple-Mail=_4E63B4C0-59F7-4609-A297-F6686495A611 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii On 1 May 2023, at 18:14, John Baldwin <jhb@freebsd.org> wrote: >=20 > On 4/30/23 8:31 PM, Jason A. Harmening wrote: >> On Sun, Apr 30, 2023 at 07:34:45PM -0500, Jason A. Harmening wrote: >>> On Sun, Apr 30, 2023 at 06:47:13PM -0500, Jason A. Harmening wrote: >>>> On Sun, Apr 30, 2023 at 08:09:16AM +0300, Konstantin Belousov = wrote: >>>>> On Sat, Apr 29, 2023 at 02:27:50PM -0500, Jason A. Harmening = wrote: >>>>>> On Sat, Apr 29, 2023 at 08:49:28PM +0200, Dimitry Andric wrote: >>>>>>> On 29 Apr 2023, at 20:33, Jason A. Harmening <jah@FreeBSD.org> = wrote: >>>>>>>>=20 >>>>>>>> On Sun, Apr 09, 2023 at 09:35:22PM +0000, Dimitry Andric wrote: >>>>>>>>> The branch stable/13 has been updated by dim: >>>>>>>>>=20 >>>>>>>>> URL: = https://cgit.FreeBSD.org/src/commit/?id=3D060699e9136975d51d3f726b9785bdba= c9a62ba6 >>>>>>>>>=20 >>>>>>>>> commit 060699e9136975d51d3f726b9785bdbac9a62ba6 >>>>>>>>> Author: Dimitry Andric <dim@FreeBSD.org> >>>>>>>>> AuthorDate: 2023-01-14 16:33:24 +0000 >>>>>>>>> Commit: Dimitry Andric <dim@FreeBSD.org> >>>>>>>>> CommitDate: 2023-04-09 14:54:52 +0000 >>>>>>>>>=20 >>>>>>>>> Merge llvm-project release/15.x = llvmorg-15.0.7-0-g8dfdcc7b7bf6 >>>>>>>>>=20 >>>>>>>>> This updates llvm, clang, compiler-rt, libc++, libunwind, = lld, lldb and >>>>>>>>> openmp to llvmorg-15.0.7-0-g8dfdcc7b7bf6. >>>>>>>>>=20 >>>>>>>>> PR: 265425 >>>>>>>>> MFC after: 2 weeks >>>>>>>>=20 >>>>>>>> This MFC of llvm15 appears to have completely broken the Intel = IOMMU >>>>>>>> driver on my stable/13 machine. After this series of commits, = any >>>>>>>> downstream DMA seems to produce an IOMMU translation fault, = which >>>>>>>> renders the machine completely unusable: no nvme boot disk, no = usb >>>>>>>> keyboard, etc. >>>>>>>>=20 >>>>>>>> The faults I see look something like this: >>>>>>>>=20 >>>>>>>> DMAR4: ahci0: pci0:17:5 sid 8d fault acc 0 adt 0x0 reason 0x3 = addr 26000 >>>>>>>>=20 >>>>>>>> It's a bit surprising to see a toolchain upgrade produce = breakage like >>>>>>>> this, but that's what git bisect clearly tells me. I wonder if = some of >>>>>>>> the IOMMU control structures might be defined as C bitfields = and the new >>>>>>>> compiler is emitting them differently? Also, was any breakage = like this >>>>>>>> observed when -current was upgraded to llvm15 several months = ago? >>>>>>>=20 >>>>>>> I haven't heard anything about such breakage, no. >>>>>>>=20 >>>>>>>=20 >>>>>>>> More generally, this is the second time in as many months I've = had to >>>>>>>> deal with IOMMU breakage on -stable. I can't imagine I'm the = only >>>>>>>> person who sees value in running with DMA remapping enabled; do = we need >>>>>>>> a dedicated DMAR-enabled machine in the cluster to smoke-test = changes >>>>>>>> like this? More generally, should we avoid MFCing high-risk = changes >>>>>>>> like this? >>>>>>>=20 >>>>>>> Since there were very few bug reports, it was not deemed high = risk. >>>>>>>=20 >>>>>>> In any case, it would be good to get the bottom of what is = causing the >>>>>>> problem, so is there any way you can isolate which code seems to = be >>>>>>> going "bad"? >>>>>>>=20 >>>>>>> For example, if this problem affects code in sys/dev/iommu, is = there >>>>>>> some way you can compile that part with -O1, or with an older = version >>>>>>> of clang (from ports), to see if the problem goes away? >>>>>>=20 >>>>>> I did try removing all custom make.conf settings (previously I = just had >>>>>> CPUTYPE?=3Dicelake-server), but that didn't change the behavior. >>>>>>=20 >>>>>> Before I try further build tweaks, I'd like to ask if the IOMMU = fault >>>>>> report can provide guidance here? AFAICT all the faults I'm = getting >>>>>> show "reason 0x3". If I'm reading the VT-d spec correctly, = FR=3D0x3 >>>>>> indicates an invalid context entry, in other words there's = something the >>>>>> hardware doesn't like in the way the address width or pagetable = base is >>>>>> configured for the PCIe requestor. >>>>>=20 >>>>> I would start looking at the other direction: might be, there are = still some >>>>> left shifts for int32 values with the shift count > 30, or uint32 = with the >>>>> count > 31. >>>>>=20 >>>>> Also might be useful to dump each context entry on creation, it is = kept >>>>> constant after. >>>>=20 >>>> I did look over the constants in intel_reg.h, and didn't see = anything >>>> that looked as though it would be susceptible to sign-extension or >>>> truncation bugs. In the failing case it's much easier for me to = catch >>>> the fault messages than any initialization message, so I = instrumented >>>> the fault handler to get the context entry from the dmar_ctx object >>>> using the same logic as dmar_map_ctx_entry(), and then dump out the = ctx1 >>>> and ctx2 fields. What I see are messages like: >>>>=20 >>>> ... ctx1 0x10013b001 ctx2 0x103 >>>>=20 >>>> At first glance these "look right": the P bit is set in ctx1, and = the >>>> rest of the field looks like a valid physical address. ctx2 also >>>> doesn't have any of the reserved bits set, but in all cases it does = have >>>> AW=3D3, which would indicate 57-bit AGAW. But when I boot the last >>>> working kernel, from the revision prior to the llvm15 MFC, I see = this in >>>> dmesg: >>>>=20 >>>> ahci0: dmar4 pci0:0:17:5 rid 8d domain 1 mgaw 48 agaw 48 re-mapped >>>>=20 >>>> ...all reported devices show 48-bit MGAW/AGAW, so I would expect = ctx2 to >>>> have AW=3D2. I suspect this may be the source of the fault, but = I'm not >>>> sure how it's getting configured that way, whether it's an issue = with >>>> reading the capability register or something else. >>>>=20 >>>=20 >>> I can confirm that hacking domain_set_agaw() to always use the = settings >>> from sagaw_bits[2] eliminates the faults and at least allows the = machine >>> to boot to single-user mode. >> I see what's happening now. When I added the hack to always set >> sagaw_bits[2], I noted that the passed-in MGAW was still 57, while >> unit->hw_cap had the correct value of 0x4 (=3D> 4-level paging, = 48-bit AW) >> in bits 12:8. The problem is that sagaw_bits has agaw=3D64 in its = last >> entry. This results in dmar_maxaddr2mgaw() attempting a comparison >> against 1ULL << 64 in the final iteration of its first loop. I = suspect >> the new compiler probably determines that last iteration is = meaningless >> and simply omits it from the (probably unrolled) loop. Since the = "loop" >> terminates with i < nitems(sagaw_bits), the subsequent "allow_less = ..." >> case doesn't execute and we end up erroneously selecting a 57-bit >> address width. Just commenting out that last entry in sagaw_bits = fixes >> the problem. >> So, two questions: >> 1) Does any VT-d hardware actually support 6-level paging? The ca. = 2021 >> VT-d spec I'm looking at indicates 5-level is the greatest depth >> supported, with everything above that being reserved. >> 2) I'd expect clang to try very hard to error out in a situation like >> this, but I see that sys/conf/kern.mk sets -Wno-shift-count-overflow >> among other things, and more of them were added for clang 15. This >> seems like a really bad idea, regardless of how much of a PITA it may = be >> to fix these warnings. >=20 > FWIW, I've been working on trying to re-enable some of the warnings = that > were disabled for clang 15 in main. I'll move that one higher up on = my > todo list. In this particular case, it doesn't warn about it though. I think KASAN might be a better 'catcher' for this kind of error, or a KUBSAN, if we had one... -Dimitry --Apple-Mail=_4E63B4C0-59F7-4609-A297-F6686495A611 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.2 iF0EARECAB0WIQR6tGLSzjX8bUI5T82wXqMKLiCWowUCZFAH3AAKCRCwXqMKLiCW o8k3AJ9DkrMB167K+sSNel6qHWAMhRILBwCfYE5kwUADLOtEslYT8QQ1pzM2sYI= =/I7V -----END PGP SIGNATURE----- --Apple-Mail=_4E63B4C0-59F7-4609-A297-F6686495A611--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E0E68380-908D-4939-B009-844BA005DB64>