From owner-freebsd-toolchain@freebsd.org Sat Feb 20 09:03:00 2016 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2CBC5AAE4A5 for ; Sat, 20 Feb 2016 09:03:00 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-211-153.reflexion.net [208.70.211.153]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E439F1BBD for ; Sat, 20 Feb 2016 09:02:59 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 28353 invoked from network); 20 Feb 2016 09:03:13 -0000 Received: from unknown (HELO rtc-sm-01.app.dca.reflexion.local) (10.81.150.1) by 0 (rfx-qmail) with SMTP; 20 Feb 2016 09:03:13 -0000 Received: by rtc-sm-01.app.dca.reflexion.local (Reflexion email security v7.80.0) with SMTP; Sat, 20 Feb 2016 04:03:01 -0500 (EST) Received: (qmail 27690 invoked from network); 20 Feb 2016 09:03:01 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with SMTP; 20 Feb 2016 09:03:01 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-76-115-7-162.hsd1.or.comcast.net [76.115.7.162]) by iron2.pdx.net (Postfix) with ESMTPSA id E153E1C405F; Sat, 20 Feb 2016 01:02:56 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Subject: Re: I've submitted 207175 for a clang 3.8.0 va_list handling problem for powerpc [fpr use also tested] From: Mark Millard In-Reply-To: <20160220083450.GA55777@vlakno.cz> Date: Sat, 20 Feb 2016 01:02:56 -0800 Cc: FreeBSD Toolchain , FreeBSD PowerPC ML Content-Transfer-Encoding: quoted-printable Message-Id: <601B33C1-D258-4F23-9814-1B4291C57A5F@dsl-only.net> References: <20160215191100.GA17387@vlakno.cz> <3A260EC5-E69A-4980-8F74-C04395F4E5F4@dsl-only.net> <20160215201800.GA20796@vlakno.cz> <74577A87-3006-43A9-9EAB-F51D946B6245@dsl-only.net> <28FF474D-2109-4605-8B2B-C5374CBCCF42@dsl-only.net> <8EB46124-3335-4643-8C64-16DA56D481F5@dsl-only.net> <8C40A5D7-C0B8-4142-89D4-228017C446CE@dsl-only.net> <20160220083450.GA55777@vlakno.cz> To: Roman Divacky X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 20 Feb 2016 09:03:00 -0000 Thanks! llvm bugzilla's 26605 did not having anything yet for this so I've = copied over your note. But I've left the status alone. The next thing that I ran into looks nastier: c++'s exception handling = is broken. #include int main(void) { try { throw std::exception(); } catch (std::exception& e) {} // same result without & return 0; } does not work on powerpc (SEGV) or powerpc64 (unbounded loop, never = returning from _Unwind_RaiseException). (The powerpc64 context is using = devel/powerpc64-gcc or g++49 as the compiler with the system's headers = and libraries. powerpc64-gcc was used for buildworld/buildkernel as well = for this context.) [g++49 using its own headers and libraries works fine for the above = program.] =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-20, at 12:34 AM, Roman Divacky wrote: Fwiw, I've just committed the patch to clang in r261422. You might want to keep using a local modification or ask dim@ to import that patch to our copy of 3.8. Thanks for your diagnosis and testing! Roman On Thu, Feb 18, 2016 at 02:29:29PM -0800, Mark Millard wrote: > On 2016-Feb-17, at 9:23 PM, Mark Millard wrote: >>=20 >> My fpr related notes/worries about the fix were wrong. >>=20 >> I finally got some time to look at this again and I see that I = somehow missed the following code when I looked before: >>=20 >> // The calling convention either uses 1-2 GPRs or 1 FPR. >> Address NumRegsAddr =3D Address::invalid(); >> if (isInt || IsSoftFloatABI) { >> NumRegsAddr =3D Builder.CreateStructGEP(VAList, 0, = CharUnits::Zero(), "gpr"); >> } else { >> NumRegsAddr =3D Builder.CreateStructGEP(VAList, 1, = CharUnits::One(), "fpr"); >> } >>=20 >> So the >>=20 >> Builder.CreateStore(Builder.getInt8(8), NumRegsAddr); >>=20 >> in Case 2 is tracking gpr vs. fpr usage contexts as it should. Also: >>=20 >> llvm::Value *NumRegs =3D Builder.CreateLoad(NumRegsAddr, = "numUsedRegs"); >>=20 >> // "Align" the register count when TY is i64. >> if (isI64 || (isF64 && IsSoftFloatABI)) { >> NumRegs =3D Builder.CreateAdd(NumRegs, Builder.getInt8(1)); >> NumRegs =3D Builder.CreateAnd(NumRegs, Builder.getInt8((uint8_t) = ~1U)); >> } >>=20 >> llvm::Value *CC =3D >> Builder.CreateICmpULT(NumRegs, Builder.getInt8(8), "cond"); >>=20 >> is using the same bounds check figure (8) for gpr and fpr. >>=20 >> Apparently that common bound is one reason that the clang numbering = is not the same as the ABI document's numbering: clang's numbering = allows using the same figure for both contexts. (Given the prior = alignment for isI64 (or isF64 with IsSoftFloatABI).) >>=20 >> Sorry for the prior noise about fpr. >>=20 >> It is still true that DOUBLE_OR_FLOAT is untested so far. >>=20 >> =3D=3D=3D >> Mark Millard >> markmi at dsl-only.net >=20 > I finally got some time to apply to some basic testing involving = double as well (for involving fpr use). . . >=20 > No problems with exceptions. Looking at the memory contents at various = stages in gdb looks good. va_list's gpr, fpr, overflow_arg_area changes = as its va_args use progresses look good. Values extracted by va_args use = looks good. Both default and -O2. >=20 > The added >=20 >> Builder.CreateStore(Builder.getInt8(8), NumRegsAddr); >=20 >=20 > passes my checks. I've not observed any problems from buildworld = materials, unlike when that line is missing. >=20 > [Note: I run with the signal delivery modified to have a "red zone" to = deal with other aspects of clang 3.8.0 code generation that are not ABI = compliant for when the stack pointer is moved. Having a "red zone" is = still operationally correct for an ABI compliant code generation, it = just temporarily wastes more bytes. Also: the kernel was built with gcc = 4.2.1 but world was built with clang 3.8.0.] >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > . . . [bad fpr related material omitted] . . . >=20 > On 2016-Feb-16, at 2:45 AM, Mark Millard wrote: >=20 > I used: >=20 >> Index: /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> --- /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp = (revision 295601) >> +++ /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp = (working copy) >> @@ -3569,6 +3569,8 @@ >> { >> CGF.EmitBlock(UsingOverflow); >>=20 >> + Builder.CreateStore(Builder.getInt8(8), NumRegsAddr); >> + >> // Everything in the overflow area is rounded up to a size of at = least 4. >> CharUnits OverflowAreaAlign =3D CharUnits::fromQuantity(4); >=20 > as my test change. (Get evidence of operation before potential cleanup = of the duplicated 8's.) >=20 > After a full buildworld/installworld based on the updated compiler. . = . >=20 > My simple example of the problem no longer fails. >=20 > "ls -l -n /" now works. >=20 > "svnlite update -r295601 /usr/src" now works. >=20 > So whatever you want to do for the details of any submitted code, the = basics of the change do avoid the SEGVs and allow these programs to = work. >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On 2016-Feb-15, at 4:27 PM, Mark Millard wrote: >=20 > On 2016-Feb-15, at 1:20 PM, Mark Millard = wrote: >>=20 >> On 2016-Feb-15, at 12:18 PM, Roman Divacky = wrote: >>>=20 >>> On Mon, Feb 15, 2016 at 12:17:50PM -0800, Mark Millard wrote: >>>> On 2016-Feb-15, at 11:11 AM, Roman Divacky = wrote: >>>>>=20 >>>>> Mark, I believe you're right. What do you think about this patch? >>>>>=20 >>>>> Index: tools/clang/lib/CodeGen/TargetInfo.cpp >>>>> = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>>> --- tools/clang/lib/CodeGen/TargetInfo.cpp (revision = 260852) >>>>> +++ tools/clang/lib/CodeGen/TargetInfo.cpp (working copy) >>>>> @@ -3599,6 +3599,8 @@ >>>>> { >>>>> CGF.EmitBlock(UsingOverflow); >>>>>=20 >>>>> + Builder.CreateStore(Builder.getInt8(11), NumRegsAddr); >>>>> + >>>>> // Everything in the overflow area is rounded up to a size of at = least 4. >>>>> CharUnits OverflowAreaAlign =3D CharUnits::fromQuantity(4); >>>>>=20 >>>>>=20 >>>>> Can you test it? >>>>=20 >>>> It may be later today before I can start the the test process. >>>>=20 >>>> While your change is not wrong as presented, it does seem to be = based on the ABI document's numbering with the range 3 <=3D gr <12, = where 3 <=3D gr < 11 cover r3-r10 use and gr=3D11 implies overflow stack = area use. (gr being the ABI documents name.) >>>>=20 >>>> The clang code generation that I saw while analyzing the problem = and the clang source that you had me look at did not use that numbering. = Instead it seems to be based on 0 <=3D gpr < 9, where 0 <=3D gpr < 8 = cover r3-r10 use and gpr=3D8 implies overflow stack area use. (gpr being = what gdb showed me as I remember.) In other words: clang counts the = number of "parameter registers" already in use as it goes along instead = of tracking register numbers that have been used. >>>>=20 >>>> So assigning any value that appears to be positive and >=3D 8 = should work, such as: >>>>=20 >>>> Builder.CreateStore(Builder.getInt8(8), NumRegsAddr); >>>=20 >>> Can you check what number gcc uses? We want to be interoperable with = gcc. >>>=20 >>> Anyway, thanks for testing! >>>=20 >>> Roman >>=20 >> I'll do that check of gcc 4.2.1 code generation before starting the = test later today. >>=20 >> But if the clang numbering is different in gcc 4.2.1 then far more = than just adding a >>=20 >>> Builder.CreateStore(Builder.getInt8(?), NumRegsAddr) >>=20 >>=20 >> for some "?" would need to be involved in the changes in order to = reach compatibility. >>=20 >>=20 >> I'll note that for clang 3.8.0 the actual comparison instruction = generated is of the form >>=20 >>> cmplwi r?,7 >>=20 >>=20 >> for some r?, such as r5 or r4, and the conditional branch generated = is a bgt instruction. >>=20 >> =3D=3D=3D >> Mark Millard >> markmi at dsl-only.net >=20 > gcc 4.2.1 generates comparison instructions for va_arg of the form: >=20 > cmplwi cr7,r0,8 >=20 > and the conditional branch generated is a "bge cr7, . . ." = instruction. >=20 > So the same number range is in use by both compilers: They are = compatible for the bounds checks for reg vs. overflow for how they = count, equality inclusion/exclusion matching up with the specific number = (8 vs. 7) used to make things the same overall. >=20 > Other aspects of the code generation distinctions would take me time = to analyze. It will be a while before I will be looking at other points. >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 >=20 >=20 >=20 >=20 >=20 > _______________________________________________ > freebsd-toolchain@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain > To unsubscribe, send any mail to = "freebsd-toolchain-unsubscribe@freebsd.org"