From owner-freebsd-arm@freebsd.org Mon Feb 29 06:14:00 2016 Return-Path: Delivered-To: freebsd-arm@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 50182AB8169 for ; Mon, 29 Feb 2016 06:14:00 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-211-154.reflexion.net [208.70.211.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 14AD31E5E for ; Mon, 29 Feb 2016 06:13:59 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 24033 invoked from network); 29 Feb 2016 06:13:58 -0000 Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 29 Feb 2016 06:13:58 -0000 Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v7.80.0) with SMTP; Mon, 29 Feb 2016 01:13:51 -0500 (EST) Received: (qmail 31036 invoked from network); 29 Feb 2016 06:13:50 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with SMTP; 29 Feb 2016 06:13:50 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-76-115-7-162.hsd1.or.comcast.net [76.115.7.162]) by iron2.pdx.net (Postfix) with ESMTPSA id E470DB1E001; Sun, 28 Feb 2016 22:13:50 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Subject: Re: clang 3.8.0 can mess up __builtin_dwarf_cfa (), at least for TARGET_ARCH=armv6, powerpc, powerpc64: a bug 207325 update From: Mark Millard In-Reply-To: <33D5358F-6783-44C1-8155-86FB93CABE6F@dsl-only.net> Date: Sun, 28 Feb 2016 22:13:56 -0800 Cc: freebsd-arm , FreeBSD PowerPC ML , FreeBSD Toolchain Content-Transfer-Encoding: quoted-printable Message-Id: <9DE18EC5-3C16-4B17-A0D0-5B5386961627@dsl-only.net> References: <83B8741C-B4C9-4EFB-A3B4-473F8F165984@dsl-only.net> <80EA4460-E842-46F5-B006-2A83FBBEE845@dsl-only.net> <366B67F9-6A14-4906-8545-1B57A3FF53B8@dsl-only.net> <20160228083934.GA60222@vlakno.cz> <22AD8E4F-B3F2-455E-8EBE-2C70E428D44A@dsl-only.net> <462637FA-6FD2-4421-8F0C-0DF565E94FA6@dsl-only.net> <8D92D76D-9FBC-45F6-A20D-0C2A8B38D291@dsl-only.net> <33D5358F-6783-44C1-8155-86FB93CABE6F@dsl-only.net> To: Roman Divacky X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-arm@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: "Porting FreeBSD to ARM processors." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 29 Feb 2016 06:14:00 -0000 Back to the "case Builtin::BI__builtin_dwarf_cfa" and = "PPCTargetLowering::LowerFRAMEADDR" context: I made the wrong change and need to retry. The detail. . . Passing a 1 through instead of zero did not do what I expected to the = code generated. Instead it added one instruction: addi r3,r3,1 resulting in (objdump -d --prefix-addresses on the .o): > Disassembly of section .text: > 00000000 <_Z1fv> mflr r0 > 00000004 <_Z1fv+0x4> stw r31,-4(r1) > 00000008 <_Z1fv+0x8> stw r0,4(r1) > 0000000c <_Z1fv+0xc> stwu r1,-16(r1) > 00000010 <_Z1fv+0x10> mr r31,r1 > 00000014 <_Z1fv+0x14> mr r3,r31 > 00000018 <_Z1fv+0x18> addi r3,r3,1 > 0000001c <_Z1fv+0x1c> bl 0000001c <_Z1fv+0x1c> > 00000020 <_Z1fv+0x20> addi r1,r1,16 > 00000024 <_Z1fv+0x24> lwz r0,4(r1) > 00000028 <_Z1fv+0x28> lwz r31,-4(r1) > 0000002c <_Z1fv+0x2c> mtlr r0 > 00000030 <_Z1fv+0x30> blr In other words: it added the 1 as a byte offset like the comments that I = thought were wrong said. Since it does not appear that PPCTargetLowering::LowerFRAMEADDR would do = that with a 1 I conclude that PPCTargetLowering::LowerFRAMEADDR is not = involved with that figure. So looking around. . . /usr/src/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp = has: > case Intrinsic::eh_dwarf_cfa: { > SDValue CfaArg =3D = DAG.getSExtOrTrunc(getValue(I.getArgOperand(0)), sdl, > = TLI.getPointerTy(DAG.getDataLayout())); > SDValue Offset =3D DAG.getNode(ISD::ADD, sdl, > CfaArg.getValueType(), > = DAG.getNode(ISD::FRAME_TO_ARGS_OFFSET, sdl, > CfaArg.getValueType()), > CfaArg); > SDValue FA =3D DAG.getNode( > ISD::FRAMEADDR, sdl, TLI.getPointerTy(DAG.getDataLayout()), > DAG.getConstant(0, sdl, = TLI.getPointerTy(DAG.getDataLayout()))); > setValue(&I, DAG.getNode(ISD::ADD, sdl, FA.getValueType(), > FA, Offset)); > return nullptr; And so sure enough the argument is an offset as used by this code. And what I call the frame depth is plugged in as 0 here via = "DAG.getConstant(0, sdl, TLI.getPointerTy(DAG.getDataLayout()))". The = offset is applied after getting the frame address. So I get to revert my change and try again changing the above call to = use a 1 instead. It does not look like this changes the time frames in my history notes: = it has been using frame depth zero since V2.7 when "case = Builtin::BI__builtin_dwarf_cfa" appeared. In general my overall questions about the target triple controlling = which value to use (in DAG.getConstant hrere) still apply: It is not = obvious that something that has been using frame depth 0 since V2.7 can = be immediately changed to frame depth 1 for all contexts. =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-28, at 8:49 PM, Mark Millard wrote: Here is what the "ABI for the ARM 32 32-bit Architecture" "DWARF for the = ARM Architecture" document says about the CFA: > 3.4 Canonical Frame Address >=20 > The term Canonical Frame Address (CFA) is defined in [GDWARF], =C2=A76.4= , Call Frame Information. This ABI adopts the typical definition of CFA = given there. > =EF=81=AF The CFA is the value of the stack pointer (r13) at the call = site in the previous frame. This, with the armv6 code I've shown via "objdump -d", indicates that = for armv6 clang++'s __builtin_dwarf_cfa() return value is not the same = value as the official ARM ABI indicates. It also indicates that what g++ = returns does match the official ARM ABI. =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-28, at 5:40 PM, Mark Millard wrote: Looking some at clang/llvm history shows releases/branches: V2.6 did not have "case Builtin::BI__builtin_dwarf_cfa". V2.7 did have "case Builtin::BI__builtin_dwarf_cfa" but = PPCTargetLowering::LowerFRAMEADDR ignored the argument. V2.8 had PPCTargetLowering::LowerFRAMEADDR using its argument (as a = frame depth, not a byte offset). The apparently incorrect (not matching g++ frame depth returned) = comments, naming, and value (when viewed as a frame depth) for "case = Builtin::BI__builtin_dwarf_cfa" started in V2.7 and continues to this = day. That is a lot of time for various dependencies on the clang = (mis)definition to accumulate across everything that uses clang. It may be that limiting any change to specific TARGET_ARCH's for FreeBSD = is appropriate. FreeBSD would likely need to list the appropriate = TARGET_ARCH's, likely including powerpc and powerpc64 since clang before = 3.8.0 was not in use for buildworld for powerpc/powerpc64. Still this may have consequences for ports that use clang and might = reference clang-compiled __builtin_dwarf_cfa() use, possibly from a = lang/clang* instead of the system clang. My guess is that the = interoperability with being able to use g++ vintages as well may lead to = (modern?) lang/clang*'s tracking the fix for FreeBSD TARGET_ARCH's that = are fixed. I can ignore all this and build a system based on using 1 as the frame = depth just to test, just as a matter of proof of concept for powerpc. = (Powerpc64 hits a system libgcc_s defect and so needs more before C++ = exceptions can be tested overall.) =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-28, at 2:20 PM, Mark Millard wrote: In /usr/src/contrib/llvm/tools/clang/lib/CodeGen/CGBuiltin.cpp there is: > case Builtin::BI__builtin_dwarf_cfa: { > // The offset in bytes from the first argument to the CFA. > // > // Why on earth is this in the frontend? Is there any reason at > // all that the backend can't reasonably determine this while > // lowering llvm.eh.dwarf.cfa()? > // > // TODO: If there's a satisfactory reason, add a target hook for > // this instead of hard-coding 0, which is correct for most targets. > int32_t Offset =3D 0; >=20 > Value *F =3D CGM.getIntrinsic(Intrinsic::eh_dwarf_cfa); > return RValue::get(Builder.CreateCall(F, > llvm::ConstantInt::get(Int32Ty, = Offset))); > } I would have guessed that the internal argument was how many frames away = on the stack to go from what 0 produces (high er address direction). = g++'s __builtin_dwarf_cfa() returns the address for the next frame = compared to clang 3.8.0 (higher address direction). I'd call that more of a frame depth than an offset. .eh_frame and its = cfa material use offset terminology as byte offsets. And the comments = above talk of an offset in bytes --but "next frame" distances in bytes = would not be constant. Looking at a use of LowerFRAMEADDR in a LowerRETURNADDR, for example, > SDValue ARMTargetLowering::LowerRETURNADDR(SDValue Op, SelectionDAG = &DAG) const{ > . . . > EVT VT =3D Op.getValueType(); > SDLoc dl(Op); > unsigned Depth =3D = cast(Op.getOperand(0))->getZExtValue(); > if (Depth) { > SDValue FrameAddr =3D LowerFRAMEADDR(Op, DAG); > SDValue Offset =3D DAG.getConstant(4, dl, MVT::i32); > return DAG.getLoad(VT, dl, DAG.getEntryNode(), > DAG.getNode(ISD::ADD, dl, VT, FrameAddr, Offset), > MachinePointerInfo(), false, false, false, 0); > } > . . . > } (PPCTargetLowering::LowerRETURNADDR is similar.)=20 This has a mix of Depth and Offset overall, with the depth going to = LowerFRAMEADDR via Op but Offset used later in GAG.getLoad via adding to = the FrameAddr. This would lead me to guess that the terminology and comments in "case = Builtin::BI__builtin_dwarf_cfa" are wrong and that the = Builder.CreateCall has been given a frame depth, not an offset. > SDValue PPCTargetLowering::LowerFRAMEADDR(SDValue Op, > SelectionDAG &DAG) const { > SDLoc dl(Op); > unsigned Depth =3D = cast(Op.getOperand(0))->getZExtValue(); >=20 > MachineFunction &MF =3D DAG.getMachineFunction(); > MachineFrameInfo *MFI =3D MF.getFrameInfo(); > MFI->setFrameAddressIsTaken(true); >=20 > EVT PtrVT =3D = DAG.getTargetLoweringInfo().getPointerTy(MF.getDataLayout()); > bool isPPC64 =3D PtrVT =3D=3D MVT::i64; >=20 > // Naked functions never have a frame pointer, and so we use r1. For = all > // other functions, this decision must be delayed until during PEI. > unsigned FrameReg; > if (MF.getFunction()->hasFnAttribute(Attribute::Naked)) > FrameReg =3D isPPC64 ? PPC::X1 : PPC::R1; > else > FrameReg =3D isPPC64 ? PPC::FP8 : PPC::FP; >=20 > SDValue FrameAddr =3D DAG.getCopyFromReg(DAG.getEntryNode(), dl, = FrameReg, > PtrVT); > while (Depth--) > FrameAddr =3D DAG.getLoad(Op.getValueType(), dl, DAG.getEntryNode(), > FrameAddr, MachinePointerInfo(), false, = false, > false, 0); > return FrameAddr; > }=20 Again Op is called Depth --and is used to get from one frame pointer = value to the next: a frame depth. To match g++ 4.2.1 the value to use is 1 for depth. Overall, at least applied to powerpc/powerpc64: > . . . > // TODO: If there's a satisfactory reason, add a target hook for > // this instead of hard-coding 0, which is correct for most targets. > int32_t Offset =3D 0; I think the comments in this area are actually talking about byte = offsets, not depths and are just wrong. A byte offset of 0 would make = sense relative to hardcoding but the value is actually a frame depth --a = very different context. I think that the naming of the variable is just wrong: it should be = called Depth. And I think that the comments should have originally talked about using = a hard coded Depth 1 to match g++ and preexisting library usage of = __builtin_dwarf_cfa() for C++ and other exception handling (.eh_frame = usage). ANd the code should avhe matched. As far as I can tell this error in the "case = Builtin::BI__builtin_dwarf_cfa:" design was not caught until now. But since the mess has been around a longtime just switching everything = to match the g++ context now likely has its own problems. (Not just a = FreeBSD issue.) For FreeBSD I expect that Depth 1 could be used for powerpc and = powerpc64: if it has been wrong for a long time (not just 3.8.0) then = powerpc/powerpc64 FreeBSD has likely been broken for C++ exception = handling when buildworld was via clang for just as long. (But only = recently has clang gotten this near working for buildworld for at least = one of powerpc/powerpc64. Currently powerpc is closer, given that = powerpc64 does not support softfloat last I knew.) For other TARGET_ARCH's: For FreeBSD armv6 it is less clear to me: it is based on clang as it is = and I do not know what C++ exception ABI it uses. If a modern gcc/g++ = buildworld had problems with C++ exception handling, does anything need = to be done about it? For FreeBSD armv6 and the like: is xtoolchain like = support important? FreeBSD may have similar questions for other TARGET_ARCH's. =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-28, at 2:46 AM, Mark Millard wrote: On 2016-Feb-28, at 12:39 AM, Roman Divacky = wrote: >=20 > Mark, >=20 > __builtin_dwarf_cfa() is lowered in clang to llvm intrinsic = eh_dwarf_cfa. > There's a depth argument (which defaults to 0, saying it's correct for = most > targets).=20 >=20 > Then the intrinsic gets lowered in SelectionDAG using > PPCTargetLowering::LowerFRAMEADDR() >=20 >=20 > Can you check that 1) the depth should be 0 for ppc64/ppc32 2) that > LowerFRAMEADDR() does something sensible? >=20 > There's a loop depth-times, so I wonder if that makes a difference. >=20 > Thanks, Roman "Lowered"? I'm not familiar with the clang code base or its terminology. = Handed off to a lower level interface, may be? As near as I can tell libgcc_s could be made to deal with clang 3.8.0's = way of __builtin_dwarf_cfa() working for powerpc/powerpc64. But then use = with g++ would be broken on powerpc/powerpc64 unless there were some = sort of live "which compiler's type of code" test also involved. Having only one libgcc_s and multiple compilers using it for a given = TARGET_ARCH=3D (for example, devel/powerpc64-xtoolchain-gcc like uses) = suggests sticking to one convention per TARGET_ARCH=3D for = __builtin_dwarf_cfa(). I would guess that g++ conventions win in this type of context for = FreeBSD, under the guideline of trying to be gcc 4.2.1 "ABI" compatible. = libgcc_s from FreeBSD works for C++ exceptions with its gcc 4.2.1 for = powerpc and powerpc64 as things are as far as I know. But for clang++ FreeBSD is one context among many and other libraries = may be based on clang 3.8.0's existing interpretation, without gcc/g++ = compatibility constraints. (I've no experience with earlier clang = vintages for the issue.) It may be that any change needs to be FreeBSD = target-triple specific for all I know. In essence: making the convention = part of the ABI chosen. I'll probably get some sleep before looking much at the code that you = reference. A quick look at part of it suggests a fair amount of = research/study for me to interpret things reliably. The loop may be somewhat analogous to _Unwind_RaiseException's loop, but = for a specific depth. I would currently guess that depth 1 would match = gcc 4.2.1's result for __builtin_dwarf_cfa(). But there was also some other "address"(?) builtin support routine = nearby that seemed to call into LowerFRAMEADDR() and I've no clue if g++ = 4.2.1 uses the same depth-figure standard for both sorts of things or = not. For all I know both types of builtins(?) might have mismatches with = gcc/g++ 4.2.1 and both might need fixes. I do vaguely remember seeing a builtin in FreeBSD code for something = that had an explicit number in the argument list, possibly = __builtin_frame_address(n)(?). But I only saw __builtin_dwarf_cfa() with = no argument in the argument list as far as I remember. If clang 3.8.0 and gcc 4.2.1 disagreed about what the numbering standard = referred to for __builtin_frame_address(n) (or whatever it was), that = would not be good and compatibility would imply conversions between the = conventions for the 2 FreeBSD contexts. I have not checked for armv6 related clang++ vs. g++ compatibility for = C++ exception-handling. If anything is not operating for that context I = expect that it would be g++ that generated buildworld code that did not = work based on the FreeBSD source code: clang/clang++ is the normal = compiler and kyua seemed to operate, unlike on the powerpc/powerpc64. I've never tried to build armv6 via an equivalent of = devel/powerpc64-gcc. I do not know if armv6 even uses the same sort of = C++ exception-handling ABI or not. But I do know that = __builtin_dwarf_cfa() is not compatible between clang++ and g++ from the = 2-line source and comparing objdump -d results. So more than powerpc/powerpc64 might be involved overall. =3D=3D=3D Mark Millard markmi at dsl-only.net