Date:      Sat, 26 Dec 2015 09:45:31 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Mark Millard <markmi@dsl-only.net>
Cc:        freebsd-arm <freebsd-arm@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>, Ian Lepore <ian@FreeBSD.org>, mat@FreeBSD.org, sbruno@FreeBSD.org
Subject:   Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error?
Message-ID:  <D38C49E3-B622-49EA-9B30-3B1B2FA2E569@bsdimp.com>
In-Reply-To: <DC9EE7C3-2763-44EF-91DA-AFE63C48E537@dsl-only.net>
References:  <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <DB75F0D6-86CB-4383-8653-6017C76729F9@dsl-only.net> <A338272B-982F-4E1F-B87D-1E33815EA212@dsl-only.net> <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com> <BBAAE33E-BD65-40A3-A0B3-F3346FC08112@dsl-only.net> <DC9EE7C3-2763-44EF-91DA-AFE63C48E537@dsl-only.net>

Thanks, it sounds like I fixed a bug, but there’s more.

What were the specific ports, so I can test them here?

And to be clear, this is a buildworld on the RPi 2 using the cross-built
world with CPUTYPE=armv7a or some such, right?

Warner

> On Dec 25, 2015, at 9:32 PM, Mark Millard <markmi@dsl-only.net> wrote:
>
> [I am again breaking off another section of older material.]
>
> Mixed news I'm afraid.
>
> The specific couple of ports that I attempted did build, the same ones
> that originally got the Bus Error in ar using (indirectly) _fseeko and
> memset that I reported. So I expect that you fixed one error.
>
> But when I tried to buildworld, clang++ 3.7 processing
> usr/src/lib/clang/libllvmtablegen/ materials quickly got a Bus Error at
> nearly the same type of instruction (it has a "!" below that the earlier
> one did not), but with r4 holding the misaligned address this time:
>
>> --- _bootstrap-tools-lib/clang/libllvmsupport ---
>> --- APFloat.o ---
>> clang++: error: unable to execute command: Bus error (core dumped)
>> . . .
>> # gdb clang++ usr/src/lib/clang/libllvmtablegen/clang++.core
>> . . .
>> Core was generated by `clang++'.
>> Program terminated with signal 10, Bus error.
>> #0  0x00c3bb9c in clang::DependentTemplateSpecializationType::DependentTemplateSpecializationType ()
>> [New Thread 22a18000 (LWP 100128/<unknown>)]
>> (gdb) x/40i 0x00c3bb60
>> . . .
>> 0xc3bb9c <_ZN5clang35DependentTemplateSpecializationTypeC2ENS_21ElaboratedTypeKeywordEPNS_19NestedNameSpecifierEPKNS_14IdentifierInfoEjPKNS_16TemplateArgumentENS_8QualTypeE+356>:
>>    vst1.64	{d16-d17}, [r4]!
>> . . .
>> (gdb) info all-registers
>> r0             0xbfbf81a8	-1077968472
>> r1             0x22f07e14	586186260
>> r2             0xc416bc	12850876
>> r3             0x2	2
>> r4             0x22f07dfc	586186236
>> . . .
>
>
> Thus it appears that there is more code around that likely generates
> pointers that are not sufficiently aligned for the code generation in
> use for what they point to.
>
> At this point I have no clue if the issue is just inside clang itself
> vs. if it is in something that clang is layered on top of. Nor if there
> is just one bad thing or many.
>
> Note: I had not yet tried buildworld/buildkernel for the context of
> the "-f" option (-fmax-type-align=4) that I was experimenting with
> earlier. So I do not have a direct compare and contrast at this point.
>
>
>
> Older material:
>=20
> On 2015-Dec-25, at 5:21 PM, Mark Millard <markmi@dsl-only.net> wrote:
>
>> On 2015-Dec-25, at 3:42 PM, Warner Losh <imp@bsdimp.com> wrote:
>>
>>
>>> On Dec 25, 2015, at 3:14 PM, Mark Millard <markmi@dsl-only.net> wrote:
>>>
>>> [I'm going to move much of the earlier "original material" text to
>>> the tail of the message.]
>>>
>>>> On 2015-Dec-25, at 11:53 AM, Warner Losh <imp@bsdimp.com> wrote:
>>>>
>>>> So what happens if we actually fix the underlying bug?
>>>>
>>>> I see two ways of doing this. In findfp.c, we allocate an array of
>>>> FILEs today like:
>>>>    g = (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * sizeof(FILE));
>>>> but that assumes that FILE just has normal pointer alignment
>>>> requirements. However, due to the mbstate having int64_t alignment
>>>> requirements, this is wrong. Maybe we need to do something like
>>>>    g = (struct glue *)malloc(sizeof(*g) + max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE));
>>>> which wouldn’t change anything on LP64 systems, but would result in
>>>> proper alignment for ILP32 systems. We’d have to fix the loop that uses
>>>> ALIGN afterwards to use roundup. Instead, we’d need to round up to the
>>>> nearest 8-byte aligned offset (or technically, the max of ALIGNBYTES
>>>> and 8, but that’s always 8 on today’s systems). If we do this, we can
>>>> make sure that each FILE is 8-byte aligned or better. We may need to
>>>> round up sizeof(FILE) to a multiple of 8 as well. I believe that since
>>>> it has the 8-byte alignment for a member, its size must be a multiple
>>>> of 8, but I’ve not chased that belief to ground. If not, we may need
>>>> another decorator (__aligned(8), I think, spelled with the ugly max
>>>> expression above). That way, the contract we’re making with the
>>>> compiler will always be true. ALIGNBYTES is 4 on Arm anyway, so that
>>>> bit is clearly wrong.
>>>>
>>>> This wouldn’t be an ABI change, since you can only get a valid FILE *
>>>> from fopen (and friends), plus stdin, stdout, and stderr. Those
>>>> addresses aren’t hard coded into binaries, so even if we have to tweak
>>>> the last three and deal with some ‘fake’ FILE abuse in libc (which I
>>>> don’t think suffers from this issue, btw, given the alignment
>>>> requirements that would naturally follow from something on the stack),
>>>> we’d still be ahead. At least for all CONFORMING implementations[*]...
>>>>
>>>> TL;DR: Why not make FILE * always 8-byte aligned? The compiler options
>>>> are a band-aid.
>>>>
>>>> Warner
>>>>
>>>> [*] There’s at least one popular package that has a copy of the FILE
>>>> structure in one of its .h files and uses that to do unnatural
>>>> optimization things, but even that’s cool, I think, since it never
>>>> allocates a new one.
>>>>=20
>>>
>>> The ARM documentation mentions cases of 16 byte alignment requirements.
>>> I've no clue if the clang code generation ever creates such code. There
>>> might be wider requirements possible in arm code as well. (I'm not an
>>> arm expert.) As an example of an implication: "The malloc() function
>>> returns a pointer to a block of at least size bytes suitably aligned for
>>> any use." In other words: aligned to some figure that is a multiple of
>>> *every* alignment requirement that the code generator can produce,
>>> possibly being the least common multiple.
>>>
>>> "-fmax-type-align=. . ." is a means of controlling/limiting the range of
>>> potential alignments to no more than a fixed, predefined value. Above
>>> that, the code generation has to work in smaller accesses and build
>>> up/split up bigger values. Using "-fmax-type-align=. . ." allows defining
>>> a figure as part of an ABI that is then not subject to code generator
>>> updates that could increase the maximum alignment figure and break
>>> things: It turns off such new capabilities. Other options need not work
>>> that way to preserve the ABI.
>>=20
>> That’s true, as far as it goes… But I’m not sure it goes far enough. The
>> premise here is that the problem is wide-spread, when in fact I think it
>> is quite narrow.
>>
>>> But in the most fundamental terms process wise as far as I can tell. . .
>>>
>>> While the FILE case that occurred is a specific example, every
>>> memory-allocation-like operation is a potential issue for all such
>>> "allocated" objects where the related code generation requires alignment
>>> to avoid Bus Error (given the SCTLR bit[1] in use).
>>
>> The problem isn’t general. The problem isn’t malloc. Malloc will generally
>> return the right thing on arm (and if it doesn’t, then we need to make
>> sure it does).
>>
>> The problem is we get a boatload of FILEs from the system all at once, and
>> those are misaligned because of a bug in the code. One that’s fixed, I
>> believe, in https://reviews.freebsd.org/D4708.
>>
>>
>>> How many other places in FreeBSD might sometimes return mis-aligned
>>> pointers for the existing code generation and ABI combination?
>>
>> It isn’t an ABI thing, just a code bug thing. The only reason it was an
>> issue was due to the optimizing nature of clang.
>>
>> We’ve had to deal with the arm alignment issues for years. I wager there
>> are very few indeed. The only reason this was brought to light was better
>> code-gen from clang.
>>
>>> How many other places are subject to breakage when "internal"
>>> structs/unions/fields involved are changed to be of a different size
>>> because the code is not fully auto-adjusting to match the code generation
>>> yet --even if right now "it works"? How fragile will things be for future
>>> work?
>>
>> If there are others, I’ll bet they could be counted on one hand since very
>> few things do the ‘slab’ allocator that FILE does.
>>
>>> What would it take to find out and deal with them all? (I do not have
>>> the background knowledge to span much.)
>>>
>>> My experiment avoided potentially changing parts of the ABI and also
>>> avoided dealing with such a "lots of code to investigate" issue. It may
>>> not be the long term 11.0-RELEASE solution. Even if not, it may be
>>> appropriate for various temporary purposes that need to avoid Bus Errors
>>> in the process. For example if Ian has a good reason to use clang 3.7
>>> instead of gcc 4.2.1.
>>
>> The review above doesn’t change the ABI either.
>>
>>> Other notes:
>>>=20
>>>> I believe that since it has the 8-byte alignment
>>>> for a member, its size must be a multiple of 8
>>>=20
>>> There are some C/C++ language rules about the address of a structure
>>> equalling the address of the first field, uniformity of the offsets, and
>>> the like. But. . .
>>>
>>> The C and C++ languages specify no specific numerical alignment figures,
>>> not even relative to specific sizeof(...) expressions. To use an old
>>> example: a 68010 only needs alignment for >= 2 byte things and even
>>> alignment is all that is then required. Some other contexts take a lot
>>> more to meet the specifications. There are some implications of the
>>> modern memory model(s) created to cover concurrency explicitly, such as
>>> avoiding interactions that can happen via, for example, separate objects
>>> (in part) sharing a cache line. (I've only looked at C++ for this, and
>>> only to a degree.)
>>>
>>> The detailed alignment rules are more "implementation defined" than
>>> "predefined by the standard". But the definition is trying to meet
>>> language criteria. It is not a fully independent choice.
>>
>> Many of them are actually defined by a combination of the standard
>> language definition, as well as the ABI standard. This is why we know that
>> mbstate_t must be 8 byte aligned.
>>
>>> Maybe some other standards that FreeBSD is tied to specify more
>>> specifics, such as a N byte integer always aligns to some multiple of N
>>> (a waste on the 68010), including the alignment for union or struct that
>>> it may be a part of tracking. But such rules force padding that may or
>>> may not be required to meet the language's more abstract criteria and
>>> such rules may not match the existing/in-use ABI.
>>
>> It is all spelled out in the ARM EABI docs.
>>
>>> So far as I can tell explicitly declared alignments may well be
>>> necessary. If that one "popular package", say, formed an array of FILE
>>> copies then the resultant alignments need not all match the ones produced
>>> by your example code unless the FILE declaration forces the compiler to
>>> match, causing sizeof(FILE) to track as well. FILE need not be the only
>>> such issue.
>>
>> Arrays of FILEs aren’t an issue (except that it encodes the size of FILE
>> into the app). It’s the specifically quirky way that libc does it that’s
>> the problem.
>>
>>> My background and reference material are mostly tied to the languages
>>> --and so my notes tend to be limited to that much context.
>>
>> Understood. While there may be issues with alignment still, tossing a big
>> hammer at the problem because they might exist will likely mean they will
>> persist far longer than fixing them one at a time. When we first ported to
>> arm, there were maybe half a dozen places that needed fixing. I doubt
>> there’s more now.
>>
>> Can you try the patch in the above code review w/o the -f switch and let
>> me know if it works for you?
>>
>> Warner
>
> buildworld/buildkernel has been started on amd64 for a rpi2 target. That
> and install kernel/world and starting up a port rebuild on the rpi2 and
> waiting for it means it will be a few hours even if I start the next thing
> just as each prior thing finishes. I may give up and go to sleep first.
>
> As for presumptions: I'll take your word on expected status of things.
> I've no clue. But absent even the hearsay status information at the time I
> did not presume that what was in front of me was all there is to worry
> about --nor did I try to go figure it all out on my own. I took a path to
> cover both possibilities for local-only vs. more-wide-spread (so long as
> that path did not force a split-up of some larger form of atomic action).
>
> In my view "-mno-unaligned-access" is an even bigger hammer than I used. I
> find no clang statement about what its ABI consequences would be, unlike
> for what I did: What mix of more padding for alignment vs. more but
> smaller accesses? But as I remember I've seen "-mno-unaligned-access" in
> use in ports and the like so its consequences may be familiar material for
> some folks.
>
> Absent any questions about ABI consequences "-mno-unaligned-access" does
> well mark the expected SCTLR bit[1] status, far better than what I did.
> Again: I was covering my ignorance while making any significant
> investigation/debugging as unlikely as I could.
>
>
>> Original material:
>>
>>> On Dec 25, 2015, at 7:24 AM, Mark Millard <markmi@dsl-only.net> wrote:
>>>
>>> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2
>>> 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=4 has so
>>> far removed the crashes during the toolchain activity: no more misaligned
>>> accesses in libc's _fseeko or elsewhere.]
>>>
>>> On 2015-Dec-25, at 12:31 AM, Mark Millard <markmi@dsl-only.net> wrote:
>>>
>>>> On 2015-Dec-24, at 10:39 PM, Mark Millard <markmi@dsl-only.net> wrote:
>>>>
>>>>> [I do not know if this partial crash analysis related to on-arm
>>>>> clang-associated activity is good enough and appropriate to submit or
>>>>> not.]
>>>>>
>>>>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below
>>>>> came from pkg install activity instead of port building. Used as-is.
>>>>>
>>>>> When I just tried my first from-rpi2b builds (ports for a rpi2b),
>>>>> /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the
>>>>> following suggests an alignment error for the type of instructions that
>>>>> the memset of 128 bytes (sizeof(mbstate_t)) was translated to in the
>>>>> code used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know
>>>>> how to check SCTLR bit[1] to be directly sure that alignment was being
>>>>> enforced.)
>>>>>
>>>>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar:
>>>>>
>>>>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a  bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>>>>>> Bus error (core dumped)
>>>>>> *** [libgnuintl.la] Error code 138
>>>>>
>>>>> It failed in _fseeko doing a memset that turned into uses of
>>>>> "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0
>>>>> that ended in 0xa4, so was not aligned to 8 byte boundaries. From what I
>>>>> read such "VSTn (multiple n-element structures)" that have .64 require
>>>>> 8 byte alignment. The evidence of the code and register value follow.
>>>>>
>>>>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>>>>>> . . .
>>>>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, whence=<value optimized out>, ltest=<value optimized out>) at /usr/src/lib/libc/stdio/fseek.c:299
>>>>>> 299		memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>>>>>> . . .
>>>>>> (gdb) x/24i 0x2033adb0
>>>>>> 0x2033adb0 <_fseeko+836>:	vmov.i32	q8, #0	; 0x00000000
>>>>>> 0x2033adb4 <_fseeko+840>:	movw	r1, #65503	; 0xffdf
>>>>>> 0x2033adb8 <_fseeko+844>:	stm	r4, {r0, r7}
>>>>>> 0x2033adbc <_fseeko+848>:	ldrh	r0, [r4, #12]
>>>>>> 0x2033adc0 <_fseeko+852>:	and	r0, r0, r1
>>>>>> 0x2033adc4 <_fseeko+856>:	strh	r0, [r4, #12]
>>>>>> 0x2033adc8 <_fseeko+860>:	add	r0, r4, #216	; 0xd8
>>>>>> 0x2033adcc <_fseeko+864>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033add0 <_fseeko+868>:	add	r0, r4, #200	; 0xc8
>>>>>> 0x2033add4 <_fseeko+872>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033add8 <_fseeko+876>:	add	r0, r4, #184	; 0xb8
>>>>>> 0x2033addc <_fseeko+880>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033ade0 <_fseeko+884>:	add	r0, r4, #168	; 0xa8
>>>>>> 0x2033ade4 <_fseeko+888>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033ade8 <_fseeko+892>:	add	r0, r4, #152	; 0x98
>>>>>> 0x2033adec <_fseeko+896>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033adf0 <_fseeko+900>:	add	r0, r4, #136	; 0x88
>>>>>> 0x2033adf4 <_fseeko+904>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033adf8 <_fseeko+908>:	add	r0, r4, #120	; 0x78
>>>>>> 0x2033adfc <_fseeko+912>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033ae00 <_fseeko+916>:	add	r0, r4, #104	; 0x68
>>>>>> 0x2033ae04 <_fseeko+920>:	vst1.64	{d16-d17}, [r0]
>>>>>> 0x2033ae08 <_fseeko+924>:	b	0x2033b070 <_fseeko+1540>
>>>>>> 0x2033ae0c <_fseeko+928>:	cmp	r5, #0	; 0x0
>>>>>> (gdb) info all-registers
>>>>>> r0             0x20651ea4	543497892
>>>>>> r1             0xffdf	65503
>>>>>> r2             0x0	0
>>>>>> r3             0x0	0
>>>>>> r4             0x20651dcc	543497676
>>>>>> r5             0x0	0
>>>>>> r6             0x0	0
>>>>>> r7             0x0	0
>>>>>> r8             0x20359df4	540384756
>>>>>> r9             0x0	0
>>>>>> r10            0x0	0
>>>>>> r11            0xbfbfb948	-1077954232
>>>>>> r12            0x2037b208	540520968
>>>>>> sp             0xbfbfb898	-1077954408
>>>>>> lr             0x2035a004	540385284
>>>>>> pc             0x2033adcc	540257740
>>>>>> f0             0	(raw 0x000000000000000000000000)
>>>>>> f1             0	(raw 0x000000000000000000000000)
>>>>>> f2             0	(raw 0x000000000000000000000000)
>>>>>> f3             0	(raw 0x000000000000000000000000)
>>>>>> f4             0	(raw 0x000000000000000000000000)
>>>>>> f5             0	(raw 0x000000000000000000000000)
>>>>>> f6             0	(raw 0x000000000000000000000000)
>>>>>> f7             0	(raw 0x000000000000000000000000)
>>>>>> fps            0x0	0
>>>>>> cpsr           0x60000010	1610612752
>>>>>=20
>>>>> The syntax in use for vst1.64 instructions does not explicitly have the
>>>>> alignment notation. Presuming that the decoding is correct then from
>>>>> what I read the following applies:
>>>>>
>>>>>> Home > NEON and VFP Programming > NEON load and store element and
>>>>>> structure instructions > Alignment restrictions in load and store,
>>>>>> element and structure instructions
>>>>>>
>>>>>> . . . When the alignment is not specified in the instruction, the
>>>>>> alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>>>>   • if the A bit is 0, there are no alignment restrictions (except
>>>>>>     for strongly ordered or device memory, where accesses must be
>>>>>>     element aligned or the result is unpredictable)
>>>>>>   • if the A bit is 1, accesses must be element aligned.
>>>>>> If an address is not correctly aligned, an alignment fault occurs.
>>>>>
>>>>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error
>>>>> would have the context to happen because of the mis-alignment.
>>>>>
>>>>> The following shows the make.conf context that explains how
>>>>> /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>>>>>
>>>>>> # more /etc/make.conf
>>>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>>>> WITH_DEBUG=
>>>>>> WITH_DEBUG_FILES=
>>>>>> MALLOC_PRODUCTION=
>>>>>> #
>>>>>> TO_TYPE=armv6
>>>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>>>> .if ${.MAKE.LEVEL} == 0
>>>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>>> .export CC
>>>>>> .export CXX
>>>>>> .export CPP
>>>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>>>> .export AS
>>>>>> .export AR
>>>>>> .export LD
>>>>>> .export NM
>>>>>> .export OBJCOPY
>>>>>> .export OBJDUMP
>>>>>> .export RANLIB
>>>>>> .export SIZE
>>>>>> .export STRINGS
>>>>>> .endif
>>>>>
>>>>>
>>>>> Other context:
>>>>>
>>>>>> # freebsd-version -ku; uname -aKU
>>>>>> 11.0-CURRENT
>>>>>> 11.0-CURRENT
>>>>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015     root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG  arm 1100091 1100091
>>>>>
>>>>>
>>>>>
>>>>> I will note that world and kernel are my own build of -r292413 (earlier
>>>>> experiment) --a build made from an amd64 host context and put in place
>>>>> via DESTDIR=. My expectation would be that the amd64 context would not
>>>>> be likely to have similar alignment restrictions involved in its ar
>>>>> activity (or other activity). That would explain how I got this far
>>>>> using such a clang 3.7 related toolchain for targeting an rpi2 before
>>>>> finding such a problem.
>>>>
>>>>
>>>> I realized re-reading all of the above that it seems to suggest that the
>>>> _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but
>>>> that was not my intent.
>>>>
>>>> libc.so.7 is from my buildworld, including the fseeko implementation:
>>>>
>>>> Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
>>>> done.
>>>> Loaded symbols for /lib/libc.so.7
>>>>
>>>>
>>>> head/sys/sys/_types.h has:
>>>>=20
>>>> /*
>>>> * mbstate_t is an opaque object to keep conversion state during multibyte
>>>> * stream conversions.
>>>> */
>>>> typedef union {
>>>>  char            __mbstate8[128];
>>>>  __int64_t       _mbstateL;      /* for alignment */
>>>> } __mbstate_t;
>>>>
>>>> suggesting an implicit alignment of the union to whatever the
>>>> implementation defines for __int64_t --which need not be 8 byte alignment
>>>> (in the abstract, general case). But 8 byte alignment is a possibility as
>>>> well (in the abstract).
>>>>=20
>>>> But printing *fp in gdb for the fp argument to _fseeko reports a
>>>> similarly not-8-byte aligned address for __mbstate8 (0x20651e34, the
>>>> lowest address in the vst1.64 sequence; r0 held 0x20651ea4 at the fault):
>>>>
>>>>> (gdb) bt
>>>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, whence=<value optimized out>, ltest=<value optimized out>) at /usr/src/lib/libc/stdio/fseek.c:299
>>>>> #1  0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
>>>>> #2  0x00016138 in ?? ()
>>>>> (gdb) print fp
>>>>> $2 = (FILE *) 0x20651dcc
>>>>> (gdb) print *fp
>>>>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
>>>>> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>>>>> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>>>>
>>>> The overall FILE struct containing the _mbstate field is also not 8-byte
>>>> aligned. But the offset from the start of the FILE struct to __mbstate8
>>>> is a multiple of 8 bytes.
>>>>=20
>>>> It is my interpretation that there is nothing here to justify the memset
>>>> implementation combination:
>>>>
>>>> SCTLR bit[1]==1
>>>>
>>>> mixed with
>>>>
>>>> vst1.64 instructions
>>>>
>>>> I.e.: one or both needs to change unless some way of forcing 8-byte
>>>> alignment is introduced.
>>>>
>>>> I have not managed to track down anything that would indicate FreeBSD's
>>>> intent for SCTLR bit[1]. I do not even know if it is required by the
>>>> design to be constant (once initialized).
>>>
>>>
>>> I have (so far) removed the build tool crashes based on adding
>>> -fmax-type-align=4 to avoid the misaligned accesses. Details follow.
>>>
>>> src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now looks
>>> like:
>>>
>>>> # more ~/src.configs/src.conf.rpi2-clang.amd64-host
>>>> TO_TYPE=armv6
>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>> FROM_TYPE=amd64
>>>> TOOLS_FROM_TYPE=x86_64
>>>> VERSION_CONTEXT=11.0
>>>> #
>>>> KERNCONF=RPI2-NODBG
>>>> TARGET=arm
>>>> .if ${.MAKE.LEVEL} == 0
>>>> TARGET_ARCH=${TO_TYPE}
>>>> .export TARGET_ARCH
>>>> .endif
>>>> #
>>>> WITHOUT_CROSS_COMPILER=
>>>> #
>>>> # For WITH_BOOT= . . .
>>>> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
>>>> WITHOUT_BOOT=
>>>> #
>>>> WITH_FAST_DEPEND=
>>>> WITH_LIBCPLUSPLUS=
>>>> WITH_CLANG=
>>>> WITH_CLANG_IS_CC=
>>>> WITH_CLANG_FULL=
>>>> WITH_LLDB=
>>>> WITH_CLANG_EXTRAS=
>>>> #
>>>> WITHOUT_LIB32=
>>>> WITHOUT_GCC=
>>>> WITHOUT_GNUCXX=
>>>> #
>>>> NO_WERROR=
>>>> MALLOC_PRODUCTION=
>>>> #CFLAGS+= -DELF_VERBOSE
>>>> #
>>>> WITH_DEBUG=
>>>> WITH_DEBUG_FILES=
>>>> #
>>>> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related binutils...
>>>> #
>>>> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
>>>> X_COMPILER_TYPE=clang
>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>> .if ${.MAKE.LEVEL} == 0
>>>> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> .export XCC
>>>> .export XCXX
>>>> .export XCPP
>>>> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>> .export XAS
>>>> .export XAR
>>>> .export XLD
>>>> .export XNM
>>>> .export XOBJCOPY
>>>> .export XOBJDUMP
>>>> .export XRANLIB
>>>> .export XSIZE
>>>> .export XSTRINGS
>>>> .endif
>>>> #
>>>> # Host compiler stuff:
>>>> .if ${.MAKE.LEVEL} == 0
>>>> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> .export CC
>>>> .export CXX
>>>> .export CPP
>>>> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
>>>> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
>>>> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
>>>> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
>>>> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
>>>> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
>>>> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
>>>> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
>>>> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
>>>> .export AS
>>>> .export AR
>>>> .export LD
>>>> .export NM
>>>> .export OBJCOPY
>>>> .export OBJDUMP
>>>> .export RANLIB
>>>> .export SIZE
>>>> .export STRINGS
>>>> .endif
>>>
>>> make.conf for the on-rpi2 port builds now looks like:
>>>
>>>> $ more /etc/make.conf
>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>> WITH_DEBUG=
>>>> WITH_DEBUG_FILES=
>>>> MALLOC_PRODUCTION=
>>>> #
>>>> TO_TYPE=armv6
>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>> .if ${.MAKE.LEVEL} == 0
>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> .export CC
>>>> .export CXX
>>>> .export CPP
>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>> .export AS
>>>> .export AR
>>>> .export LD
>>>> .export NM
>>>> .export OBJCOPY
>>>> .export OBJDUMP
>>>> .export RANLIB
>>>> .export SIZE
>>>> .export STRINGS
>>>> .endif
>>>
>>>
>>>
>>> ===
>>> Mark Millard
>>> markmi at dsl-only.net
>>>
>>>
>>>
>>> _______________________________________________
>>> freebsd-toolchain@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
>>> To unsubscribe, send any mail to "freebsd-toolchain-unsubscribe@freebsd.org"
>
>
>

