Subject: Re: Question for powerpc64 lib32 (powerpc) support: what ABI is the powerpc code supposed to be using?
From: Mark Millard
Date: Sun, 29 Oct 2017 15:46:47 -0700
To: Justin Hibbits
Cc: FreeBSD PowerPC ML, freebsd-hackers

[I add notes about what I see in the SysVR4 powerpc ABI document
about r30 vs. _GLOBAL_OFFSET_TABLE_ use, and some notes about
more modern Power Architecture® 32-bit Application Binary Interface
Supplement 1.0 material that does, by contrast, specify a type of
context where r30 must be used for the purpose. Bugzilla 206123
now has this material as well.]

On 2017-Oct-29, at 6:50 AM, Mark Millard wrote:

> [Message history removed as this does not flow well with
> the prior material.]
>
> I've figured out the mismatch and, so, why/how lib32
> fails for devel/powerpc64-gcc based builds: the
> .init code generation for devel/powerpc64-gcc is tied
> to the glibc crti.S for powerpc and that does not
> match what FreeBSD has for the interface between
> the two parts.
>
> As of 6 years ago or so glibc has code like (I'm
> only dealing with the init side of things as an
> example):
>
>         .section .init,"ax",@progbits
>         stwu r1, -16(r1)
> . . .
>         bcl 29,31,.LMAGIC_LABEL
> .LMAGIC_LABEL:
>         mflr r30
>         addis r30, r30, _GLOBAL_OFFSET_TABLE_-.LMAGIC_LABEL@ha
>         addi r30, r30, _GLOBAL_OFFSET_TABLE_-.LMAGIC_LABEL@l
> . . . (some pre-init function code) . . .
>
> that comes before code that is from frame_dummy
> and __do_global_ctors_aux (that are from
> /wrkdirs/usr/ports/devel/powerpc64-gcc/work/gcc-6.3.0/libgcc/crtstuff.c ).
>
> The code generated for frame_dummy and
> __do_global_ctors_aux expects r30 to already
> be set up for _GLOBAL_OFFSET_TABLE_ related
> use by the kind of code that I showed above.
>
>
> Instead FreeBSD has for powerpc just:
> (things are configured for devel/powerpc64-gcc
> to use it)
>
> #include
> __FBSDID("$FreeBSD: head/lib/csu/powerpc/crti.S 217399 2011-01-14 11:34:58Z kib $");
>
>         .section .init,"ax",@progbits
>         .align 2
>         .globl _init
>         .type _init,@function
> _init:
>         stwu 1,-16(1)
>         mflr 0
>         stw 31,12(1)
>         stw 0,20(1)
>         mr 31,1
>
> The overall result ends up being (from
> an example .so):
>
> 0000214c <_init>       stwu   r1,-16(r1)
> 00002150 <_init+0x4>   mflr   r0
> 00002154 <_init+0x8>   stw    r31,12(r1)
> 00002158 <_init+0xc>   stw    r0,20(r1)
> 0000215c <_init+0x10>  mr     r31,r1
> (The above is the FreeBSD crti.S code.)
> (Note the lack of initialization of r30
> to the _GLOBAL_OFFSET_TABLE_ related value.)
>
> (The below is the crtstuff.c frame_dummy code
> inlined in a way that the function prolog code
> is not present here.)
> (Note the dependence on r30 having already been
> initialized.)
> 00002160 <_init+0x14>  lwz    r3,-712(r30)
> 00002164 <_init+0x18>  lwz    r9,0(r3)
> 00002168 <_init+0x1c>  cmpwi  cr7,r9,0
> 0000216c <_init+0x20>  beq-   cr7,00002184 <_init+0x38>
> 00002170 <_init+0x24>  lwz    r9,-16(r30)
> 00002174 <_init+0x28>  cmpwi  cr7,r9,0
> 00002178 <_init+0x2c>  beq-   cr7,00002184 <_init+0x38>
> 0000217c <_init+0x30>  mtctr  r9
> 00002180 <_init+0x34>  bctrl
>
> (The below is the crtstuff.c __do_global_ctors_aux
> loop code but inlined. . .)
> 00002184 <_init+0x38>  lwz    r29,-36(r30)
> 00002188 <_init+0x3c>  lwzu   r9,-4(r29)
> 0000218c <_init+0x40>  cmpwi  cr7,r9,-1
> 00002190 <_init+0x44>  beq-   cr7,000021a8 <_init+0x5c>
> 00002194 <_init+0x48>  mtctr  r9
> 00002198 <_init+0x4c>  bctrl
> 0000219c <_init+0x50>  lwzu   r9,-4(r29)
> 000021a0 <_init+0x54>  cmpwi  cr7,r9,-1
> 000021a4 <_init+0x58>  bne+   cr7,00002194 <_init+0x48>
>
> (The rest of the .init code follows.)
> 000021a8 <_init+0x5c>  lwz    r11,0(r1)
> 000021ac <_init+0x60>  lwz    r0,4(r11)
> 000021b0 <_init+0x64>  mtlr   r0
> 000021b4 <_init+0x68>  lwz    r31,-4(r11)
> 000021b8 <_init+0x6c>  mr     r1,r11
> 000021bc <_init+0x70>  blr
>
> The way the compiler's source code is structured
> and works it looks to me like the crti.S used needs
> to have the initialization code for r30.

I've looked at a copy of the 1995 Sun Microsystems PPC SYSVR4
ABI document, and its table 3-3 about processor register usage
is not explicit about a specific register being used for
_GLOBAL_OFFSET_TABLE_ handling. There are examples that say
things like:

Assumes GOT pointer in r31

but as far as I can tell no place requires that. In fact there
is wording like:

Combining the offset with the global offset table address in a general
register (for example, r31 loaded in the sample prologue in Figure 3-33)
gives the absolute address of the table entry holding the desired
address.

which explicitly indicates "in a general register".

It appears that crti.S type code for shared libraries requires
a local convention for the choice of register that the compiler
and library must agree on. So r30 looks to be a valid choice
but possibly compiler specific for a FreeBSD context.

But there may be a reason to stick with r30. . .
Power-Arch-32-bit-ABI-supp-1.0-Linux.pdf reports that:

Under the Secure-PLT ABI, when using the Position-Independent Code (PIC)
addressing model, register r30 is used (by convention between compiler &
link editor) in nonleaf functions to hold the Global Offset Table (GOT)
pointer.

. . .

Using r30 to hold the address of the _GLOBAL_OFFSET_TABLE_ symbol is the
current convention used by the compiler and link-editor and is only
required for nonleaf routines which use the PIC addressing model. Leaf
routines or code not using the PIC addressing model may use any available
unreserved general-purpose register to hold the address of the
_GLOBAL_OFFSET_TABLE_ symbol.

. . .

The PIC call stub sequence requires that the compiler ensure that the
register used to hold the _GLOBAL_OFFSET_TABLE_ pointer is set before
any calls are made from the PLT. The current convention between the
compiler and link editor is that r30 be used for this purpose. This is a
change from the BSS-PLT ABI which only required GOT addressing to access
static storage.

. . .

For non-PIC code, r30 will not hold the GOT pointer; so the stubs must
be different, as shown in the following implementation.

. . .

Note: This ABI does not require a fixed GOT register, or even one
register used throughout a binary. Non-PIC code does not set the
_GLOBAL_OFFSET_TABLE_ pointer and does not need to reserve a register
for that purpose. Code under the PIC addressing model that accesses
static storage or calls nonlocal functions will need a register to hold
the _GLOBAL_OFFSET_TABLE_ pointer. However, leaf functions or functions
that only call other functions which are static (@local) may use any
general-purpose register within the constraints for the existing ABI.

[End quotes.]

By contrast Power-Arch-32-bit-ABI-supp-1.0-Embedded.pdf does
not say much about r30 use, including not specifying the
Secure-PLT ABI material.
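
[Side note on the r30 setup sequence quoted earlier (bcl / mflr /
addis ...@ha / addi ...@l): the pair of instructions adds a 32-bit
label-to-GOT distance in two 16-bit pieces. Because addi sign-extends
its immediate, the @ha ("high adjusted") half has to be bumped by one
whenever the low half has its top bit set. The following is only a
rough Python model of that arithmetic, not toolchain code. The helper
names (ha, lo, got_pointer) and the addresses used with it are
hypothetical and chosen just for illustration.]

```python
MASK32 = 0xFFFFFFFF

def ha(x):
    """@ha: high 16 bits, adjusted for the sign extension of the low half."""
    return ((x + 0x8000) >> 16) & 0xFFFF

def lo(x):
    """@l: low 16 bits."""
    return x & 0xFFFF

def sext16(v):
    """Sign-extend a 16-bit immediate, as addi and addis do."""
    return v - 0x10000 if v & 0x8000 else v

def addis(reg, imm16):
    """Model of addis: add the immediate shifted left by 16 bits."""
    return (reg + (sext16(imm16) << 16)) & MASK32

def addi(reg, imm16):
    """Model of addi: add the sign-extended immediate."""
    return (reg + sext16(imm16)) & MASK32

def got_pointer(label_addr, got_addr):
    """Model of: mflr r30; addis r30,r30,D@ha; addi r30,r30,D@l
    where D = _GLOBAL_OFFSET_TABLE_ - label (hypothetical addresses)."""
    delta = (got_addr - label_addr) & MASK32
    r30 = label_addr              # mflr r30 after the bcl to the label
    r30 = addis(r30, ha(delta))
    r30 = addi(r30, lo(delta))
    return r30
```

[For any pair of 32-bit addresses the model returns the GOT address
itself, including when the low half of the distance is 0x8000 or
above and when the GOT sits below the label. That is the value the
frame_dummy and __do_global_ctors_aux code expects to find in r30.]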
The referenced documents are examples of:

Power Architecture® 32-bit Application Binary Interface Supplement 1.0

documents published in, for example, 2011 (Power.org copyright for
that date but various other copyrights for various earlier dates).

===
Mark Millard
markmi at dsl-only.net