From owner-freebsd-arch@FreeBSD.ORG Mon Jul 21 17:53:21 2014 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D66EE8EF; Mon, 21 Jul 2014 17:53:21 +0000 (UTC) Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au [211.29.132.249]) by mx1.freebsd.org (Postfix) with ESMTP id 8158B21F3; Mon, 21 Jul 2014 17:53:21 +0000 (UTC) Received: from c122-106-147-133.carlnfd1.nsw.optusnet.com.au (c122-106-147-133.carlnfd1.nsw.optusnet.com.au [122.106.147.133]) by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id D809D10525BD; Tue, 22 Jul 2014 03:53:11 +1000 (EST) Date: Tue, 22 Jul 2014 03:53:10 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Ian Lepore Subject: Re: [CFR] mge driver / elf reloc In-Reply-To: <1405955048.85788.74.camel@revolution.hippie.lan> Message-ID: <20140722022100.S2586@besplex.bde.org> References: <14D22EA6-B73C-47BA-9A86-A957D24F23B8@freebsd.org> <1405810447.85788.41.camel@revolution.hippie.lan> <20140720220514.GP45513@funkthat.com> <20140720231056.GQ45513@funkthat.com> <9464C309-B390-4A27-981A-E854921B1C98@bsdimp.com> <1405955048.85788.74.camel@revolution.hippie.lan> MIME-Version: 1.0 X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=eojmkOZX c=1 sm=1 tr=0 a=7NqvjVvQucbO2RlWB8PEog==:117 a=PO7r1zJSAAAA:8 a=5zrNXb6StqYA:10 a=JzwRw_2MAAAA:8 a=nlC_4_pT8q9DhB4Ho9EA:9 a=cz2ZRIgtxKwA:10 a=wJWlkF7cXJYA:10 a=4FFUHJPBAAAA:8 a=SwFuXEZpslA4kWn8nBEA:9 a=nDTWHVfmmjYssBZ0:21 a=zWtDtkundl5VsItk:21 a=45ClL6m2LaAA:10 a=WglDhLyllLQA:10 Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE X-Content-Filtered-By: Mailman/MimeDel 2.1.18 Cc: arch@freebsd.org, freebsd-arm X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jul 2014 17:53:22 -0000 On Mon, 21 Jul 2014, Ian Lepore wrote: > On Mon, 2014-07-21 at 08:46 -0600, Warner Losh wrote: >> On Jul 20, 2014, at 5:10 PM, John-Mark Gurney wrote: >> >>> Tim Kientzle wrote this message on Sun, Jul 20, 2014 at 15:25 -0700: >>>> >>>> On Jul 20, 2014, at 3:05 PM, John-Mark Gurney wrote= : >>>> >>>>> Ian Lepore wrote this message on Sat, Jul 19, 2014 at 16:54 -0600: >>>>>> Sorry to take so long to reply to this, I'm trying to get caught up.= I >>>>>> see you've already committed the mge fixes. I think the ELF alignme= nt >>>>>> fix looks good and should also be committed. >>>>> >>>>> So, re the elf alignment... >>>>> >>>>> I think we should get a set of macros that handle load/stores to/from >>>>> unaligned addresses that are transparent to the caller.... I need >>>>> these for some other code I'm writing... >>>>> >>>>> I thought Open/Net had these available, but I can't seem to find them >>>>> right now... >>>> >>>> $ man 9 byteorder >>>> >>>> is most of what you want, lacking only some aliases to pick >>>> the correct macro for native byte order. >>> >>> Um, those doesn't help if you want native endian order=85 >> >> Ummm, yes they do. enc converts from native order. dec decodes to native= byte >> order. They are more general cases than the ntoh* functions that are mor= e traditional >> since they also work on byte streams that may not be completely aligned = when >> sitting in memory. Which is what you are asking for. >> >>> Also, only the enc/dec functions are documented to work on non-aligned >>> address, so that doesn't help in most cases=85 >> >> They work on all addresses. They are even documented to work on any addr= ess: >> >> The be16enc(), be16dec(), be32enc(), be32dec(), be64enc(), be64dec(= ), >> le16enc(), le16dec(), le32enc(), le32dec(), le64enc(), and le64dec(= ) >> functions encode and decode integers to/from byte strings on any al= ign- >> ment in big/little endian format. >> >> So they are quite useful in general. Peeking under the covers at the imp= lementation >> also shows they will work for any alignment, so I=92m having trouble und= erstanding >> where this objection is really coming from. > > The functionality requested was alignment-safe copy/assign without any > endian change. The code in question was conceptually something like > > if (pointer & 0x03) > do-alignment-safe-thing > else > directly-deref-the-pointer The enc/dec functions could be pessimized like that, but are actually pessimized in other ways. Pessimizations in the above include conditional branches for the check and large code. Everything has to be inlined else it is much slower than direct dereference, but then it has scattered branches that may exhaust the branch prediction cache (if any) and large code that may exhaust other caches. The enc/dec functions don't have any branches, but they have large code like: % static __inline void % le32enc(void *pp, uint32_t u) % { % =09uint8_t *p =3D (uint8_t *)pp; %=20 % =09p[0] =3D u & 0xff; % =09p[1] =3D (u >> 8) & 0xff; % =09p[2] =3D (u >> 16) & 0xff; % =09p[3] =3D (u >> 24) & 0xff; % } Fastest on x86 (unless alignment check exceptions are enabled in CR0) is to just do the access and let the hardware combine the bytes. The accesses should probably be written someything like le32enc() above, but more carefully. Compilers should convert the above to a single 32-bit access on x86 (unless alignment check exceptions...). The pessimizations are that compilers aren't that smart, and/or the enc/dec functions are written in such a way that compilers are constrained from doing this. IIRC, clang does this for one direction only and gcc-4.2.1 is not smart enough to do this for either direction. IIRC, the problematic direction is the above one, where the value is returned indirectly. Sprinkling __restrict might help. When there is endianness conversion, there must be both an access (hopefully 32 bits) and swapping operation on a register, unless all the bytes are copied from memory to memory 1 or 2 at a time. clang is smart about converting the large expressions in __bswapNN_gen() into single bswap instructions if the original rvalue is in a register, but IIRC it is not so smart for the equivalent conversions written with bytes and indirections. (x86 and IIRC some other arches have the __bswapNN_gen() macros. These macros are just as MI as the enc/dec inlines and more carefully written but they have not been deduplicated due to namespace problems inhibiting a complete cleanup of the MD endian.h files.) Accesses are often pessimized using the __packed mistake. This bug is still uncommon in ipv4 headers, but is used with minimal damage in struct ip. struct ip is declared as __packed and __aligned(4). Here __aligned(4) tells that the struct aligned normally although it is packed. ipv6 headers ask for full pessimizations by declaring almost everything as __packed without __aligned(N). This tells the compiler that the struct might only be 1-byte aligned, so all accesses to it must be 1 byte at a time except on arches like x86 where the above optimization applies (the optimization is much easier to do when the accesses are not expressed bytewise in the code). Bytewise accesses are also less inherently atomic. I think there used to be a problem with __packed and __aligned() attributes not being inherited by function pointers, but can't fdind any problem now. ia64 code for loading an int from a __packed struct (p.x): % addl r16 =3D @ltoffx(p#), r1 % ;; % ld8.mov r16 =3D [r16], p# % ;; % mov r14 =3D r16 % ;; % ld1 r15 =3D [r14], 1 % ;; % ld1 r14 =3D [r14] % ;; % shl r14 =3D r14, 8 % ;; % or r14 =3D r15, r14 % adds r15 =3D 2, r16 % ;; % ld1 r15 =3D [r15] % ;; % shl r15 =3D r15, 16 % ;; % or r15 =3D r14, r15 % adds r16 =3D 3, r16 % ;; % ld1 r8 =3D [r16] % ;; % shl r8 =3D r8, 24 % ;; % or r8 =3D r15, r8 It seems to have 5 memory references. The enc/dec32 functions produce a similar mess on x86 when the compiler can't optimize them. This is with gcc. clang doesn't work on ia64 and/or pluto. ia64 code for loading an int from a __packed __aligned(4) struct (p.x): addl r14 =3D @ltoffx(p#), r1 ;; ld8.mov r14 =3D [r14], p# ;; ld4 r8 =3D [r14] This behaviour can probable be exploited in the enc/dec functions. When there is no endianness conversion, write 32-bit accesses as p->x where p is a pointer to a __packed __aligned(1) struct containing x. The compiler will then produce the above mess on ia64 and a single 32-bit access if possible. No ifdefs required. When there is an endianess conversion, do it on a register with value p->x. Lightly tested implementation: % struct _le32 { % =09uint32_t _x; % } __packed; %=20 % #define=09le32enc(p, u)=09(((struct _le32 *)(p))->_x =3D (u)) This is simpler than the inline version, and fixes some namespace errors. If you want avoid the mess on ia64, this can be done at compile time if the alignment is known to be more than 1 then. I think if can often be known, as for struct ip (don't generate misaligned struct ip, and if you start with one then copy to an aligned one). The enc/dec inline functions could handle this by being converted to macros that take the alignment parameter. If the alignment is not known until runtime...don't do that. To add an alignment parameter to the above, use something like: #define=09_sle32(a)=09struct _le32 { uint32_t _x; } __packed __aligned(a) #define=09le32enc(p, u, a)=09(((_sle32(a) *)(p))->_x =3D (u)) Macroizing the struct declaration to add the alignment parameter to it and avoiding backslashes made the code less readable but even shorter. This was tested on i386 and ia64 with alignments 1 and 4 and and gave the expected output in asm. Bruce From owner-freebsd-arch@FreeBSD.ORG Mon Jul 21 18:28:22 2014 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 58A3F2AB; Mon, 21 Jul 2014 18:28:22 +0000 (UTC) Received: from mho-01-ewr.mailhop.org (mho-03-ewr.mailhop.org [204.13.248.66]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 140C624C6; Mon, 21 Jul 2014 18:28:21 +0000 (UTC) Received: from c-50-155-136-3.hsd1.co.comcast.net ([50.155.136.3] helo=ilsoft.org) by mho-01-ewr.mailhop.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.72) (envelope-from ) id 1X9IJv-000FsY-QT; Mon, 21 Jul 2014 18:28:20 +0000 Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240]) by ilsoft.org (8.14.9/8.14.9) with ESMTP id s6LISIeP033020; Mon, 21 Jul 2014 12:28:18 -0600 (MDT) (envelope-from ian@FreeBSD.org) X-Mail-Handler: Dyn Standard SMTP by Dyn X-Originating-IP: 50.155.136.3 X-Report-Abuse-To: abuse@dyndns.com (see http://www.dyndns.com/services/sendlabs/outbound_abuse.html for abuse reporting information) X-MHO-User: U2FsdGVkX1/IbHu2Ip3HrQmS1MVVZC3X X-Authentication-Warning: paranoia.hippie.lan: Host revolution.hippie.lan [172.22.42.240] claimed to be [172.22.42.240] Subject: Re: [CFR] mge driver / elf reloc From: Ian Lepore To: Warner Losh In-Reply-To: References: <14D22EA6-B73C-47BA-9A86-A957D24F23B8@freebsd.org> <1405810447.85788.41.camel@revolution.hippie.lan> <20140720220514.GP45513@funkthat.com> <20140720231056.GQ45513@funkthat.com> <9464C309-B390-4A27-981A-E854921B1C98@bsdimp.com> <20140721162559.GS45513@funkthat.com> <467619B1-F530-49AF-91BF-14CA3A31908B@bsdimp.com> <20140721165619.GT45513@funkthat.com> Content-Type: text/plain; charset="windows-1251" Date: Mon, 21 Jul 2014 12:28:17 -0600 Message-ID: <1405967297.85788.78.camel@revolution.hippie.lan> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by ilsoft.org id s6LISIeP033020 Cc: arch@freebsd.org, freebsd-arm X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jul 2014 18:28:22 -0000 On Mon, 2014-07-21 at 11:16 -0600, Warner Losh wrote: > On Jul 21, 2014, at 10:56 AM, John-Mark Gurney wrote= : >=20 > > Warner Losh wrote this message on Mon, Jul 21, 2014 at 10:31 -0600: > >>=20 > >> On Jul 21, 2014, at 10:25 AM, John-Mark Gurney wr= ote: > >>=20 > >>> Warner Losh wrote this message on Mon, Jul 21, 2014 at 08:46 -0600: > >>>>=20 > >>>> On Jul 20, 2014, at 5:10 PM, John-Mark Gurney w= rote: > >>>>=20 > >>>>> Tim Kientzle wrote this message on Sun, Jul 20, 2014 at 15:25 -07= 00: > >>>>>>=20 > >>>>>> On Jul 20, 2014, at 3:05 PM, John-Mark Gurney = wrote: > >>>>>>=20 > >>>>>>> Ian Lepore wrote this message on Sat, Jul 19, 2014 at 16:54 -06= 00: > >>>>>>>> Sorry to take so long to reply to this, I'm trying to get caug= ht up. I > >>>>>>>> see you've already committed the mge fixes. I think the ELF a= lignment > >>>>>>>> fix looks good and should also be committed. > >>>>>>>=20 > >>>>>>> So, re the elf alignment... > >>>>>>>=20 > >>>>>>> I think we should get a set of macros that handle load/stores t= o/from > >>>>>>> unaligned addresses that are transparent to the caller.... I n= eed > >>>>>>> these for some other code I'm writing...=20 > >>>>>>>=20 > >>>>>>> I thought Open/Net had these available, but I can't seem to fin= d them > >>>>>>> right now... > >>>>>>=20 > >>>>>> $ man 9 byteorder > >>>>>>=20 > >>>>>> is most of what you want, lacking only some aliases to pick > >>>>>> the correct macro for native byte order. > >>>>>=20 > >>>>> Um, those doesn't help if you want native endian order? > >>>>=20 > >>>> Ummm, yes they do. enc converts from native order. dec decodes to = native byte > >>>=20 > >>> No they don't.. If you want to read a value in memory that is nativ= e > >>> endian order to native endian order (no conversion), they cannot be > >>> used w/o using something like below? > >>=20 > >> Missed the native to native. bcopy works, but is ugly, as you note b= elow. > >>=20 > >>>> order. They are more general cases than the ntoh* functions that a= re more traditional > >>>> since they also work on byte streams that may not be completely al= igned when > >>>> sitting in memory. Which is what you are asking for. > >>>=20 > >>> So, you're saying that I now need to write code like: > >>> #if LITTLE_ENDIAN /* or how ever this is spelled*/ > >>> var =3D le32enc(foo); > >>> #else > >>> var =3D be32enc(foo); > >>> #endif > >>>=20 > >>> If I want to read a arch native endian value? No thank you? > >>=20 > >> I?m not saying that at all. > >>=20 > >>>>> Also, only the enc/dec functions are documented to work on non-al= igned > >>>>> address, so that doesn't help in most cases? > >>>>=20 > >>>> They work on all addresses. They are even documented to work on an= y address: > >>>>=20 > >>>> The be16enc(), be16dec(), be32enc(), be32dec(), be64enc(), be64= dec(), > >>>> le16enc(), le16dec(), le32enc(), le32dec(), le64enc(), and le64= dec() > >>>> functions encode and decode integers to/from byte strings on an= y align- > >>>> ment in big/little endian format. > >>>>=20 > >>>> So they are quite useful in general. Peeking under the covers at t= he implementation > >>>> also shows they will work for any alignment, so I?m having trouble= understanding > >>>> where this objection is really coming from. > >>>=20 > >>> There are places where you write code such as: > >>> int i; > >>> memcpy(&i, inp, sizeof i); > >>> /* use i */ > >>>=20 > >>> In order to avoid alignment faults... None of the functions in byt= eorder > >>> do NO conversion of endian, or you have to know which endian you ar= e but > >>> that doesn't work on MI code... > >>>=20 > >>> Did you read what the commited code did? > >>=20 > >> No, I missed that bit beaded on your reply (which seemed to imply yo= u needed > >> endian conversion) which implied the enc/dec are only documented to = work on non-aligned > >> which is what I was correcting. > >=20 > > Hmm... It appears that byteorder(9) leaks to userland though isn't > > documented to be available in userland... Should we fix this and mak= e > > it an offical userland API? How to document it? In my case, the > > enc/dec version would have been enough for what I need, but not "part= of > > the userland API"... There is an xref from byteorder(3), but no > > comment on either that they will work.. >=20 > Yes. We should fix this now that it isn=92t confined to the kernel. >=20 > >> But maybe the more basic question is why do you even have packed > >> structures that are native endian that you want to access as natural= ly > >> aligned structures? How did they become native endian and why weren?= t > >> they converted to a more natural layout at that time? > >=20 > > The original message said why=85 >=20 > I personally think the original code should unconditionally call load_p= tr() and store_ptr() > and if the optimization for aligned access is actually worth doing, the= test for it should be > in those functions rather than inline throughout the code. The code wil= l be clearer, and > it would be easier to optimize those cases that actually matter. >=20 > I=92m frankly surprised that these relocations are being generated unal= igned. Perhaps that=92s > the real bug here that should be fixed. While I=92m OK with the origina= l patch (subject to the > above), I=92d be curious what other cases there are for this functional= ity. You had said that > you had additional use cases in the network stack, but I=92m having tro= uble groking the > use cases. >=20 > If this is a huge deal, then defining functions to do this is trivial. = I=92m just not sure it is common > enough to need a special macro/function call in the base. >=20 > Warner >=20 I was trying to figure out how unaligned relocs even happen, and I think one possibility would be something like this... char thing[] =3D "x"; struct { char x; char *something; } __packed foo =3D {'a', thing}; Which seems contrived enough that I wonder how it happens in real life. -- Ian