Date:      Fri, 25 Dec 2015 06:24:57 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        freebsd-arm@freebsd.org, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Subject:   Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error?
Message-ID:  <E84CD08B-253B-4EF3-B878-5246E6A6DC5F@dsl-only.net>
In-Reply-To: <DB75F0D6-86CB-4383-8653-6017C76729F9@dsl-only.net>
References:  <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <DB75F0D6-86CB-4383-8653-6017C76729F9@dsl-only.net>

[Good News Summary: Rebuilding buildworld/buildkernel for rpi2
11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=4 has
so far removed the crashes during the toolchain activity: no more
misaligned accesses in libc's _fseeko or elsewhere.]

On 2015-Dec-25, at 12:31 AM, Mark Millard <markmi@dsl-only.net> wrote:

> On 2015-Dec-24, at 10:39 PM, Mark Millard <markmi@dsl-only.net> wrote:
>
>> [I do not know if this partial crash analysis related to on-arm
>> clang-associated activity is good enough and appropriate to submit or
>> not.]
>>
>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below
>> came from pkg install activity instead of port building. Used as-is.
>>
>> When I just tried my first from-rpi2b builds (ports for a rpi2b),
>> /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the
>> following suggests an alignment error for the type of instructions that
>> memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code
>> used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to
>> check SCTLR bit[1] to be directly sure that alignment was being
>> enforced.)
>>
>>=20
>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :
>>
>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a  bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>>> Bus error (core dumped)
>>> *** [libgnuintl.la] Error code 138
>>
>> It failed in _fseeko doing a memset that turned into uses of
>> "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0
>> that ended in 0xa4, so was not aligned to 8 byte boundaries. From what I
>> read such "VSTn (multiple n-element structures)" that have .64 require
>> 8 byte alignment. The evidence of the code and register value follow.
>>
>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>>> . . .
>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, whence=<value optimized out>, ltest=<value optimized out>) at /usr/src/lib/libc/stdio/fseek.c:299
>>> 299		memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>>> . . .
>>> (gdb) x/24i 0x2033adb0
>>> 0x2033adb0 <_fseeko+836>:	vmov.i32	q8, #0	; 0x00000000
>>> 0x2033adb4 <_fseeko+840>:	movw	r1, #65503	; 0xffdf
>>> 0x2033adb8 <_fseeko+844>:	stm	r4, {r0, r7}
>>> 0x2033adbc <_fseeko+848>:	ldrh	r0, [r4, #12]
>>> 0x2033adc0 <_fseeko+852>:	and	r0, r0, r1
>>> 0x2033adc4 <_fseeko+856>:	strh	r0, [r4, #12]
>>> 0x2033adc8 <_fseeko+860>:	add	r0, r4, #216	; 0xd8
>>> 0x2033adcc <_fseeko+864>:	vst1.64	{d16-d17}, [r0]
>>> 0x2033add0 <_fseeko+868>:	add	r0, r4, #200	; 0xc8
>>> 0x2033add4 <_fseeko+872>:	vst1.64	{d16-d17}, [r0]
>>> 0x2033add8 <_fseeko+876>:	add	r0, r4, #184	; 0xb8
>>> 0x2033addc <_fseeko+880>:	vst1.64	{d16-d17}, [r0]
>>> 0x2033ade0 <_fseeko+884>:	add	r0, r4, #168	; 0xa8
>>> 0x2033ade4 <_fseeko+888>:	vst1.64	{d16-d17}, [r0]
>>> 0x2033ade8 <_fseeko+892>:	add	r0, r4, #152	; 0x98
>>> 0x2033adec <_fseeko+896>:	vst1.64	{d16-d17}, [r0]
>>> 0x2033adf0 <_fseeko+900>:	add	r0, r4, #136	; 0x88
>>> 0x2033adf4 <_fseeko+904>:	vst1.64	{d16-d17}, [r0]
>>> 0x2033adf8 <_fseeko+908>:	add	r0, r4, #120	; 0x78
>>> 0x2033adfc <_fseeko+912>:	vst1.64	{d16-d17}, [r0]
>>> 0x2033ae00 <_fseeko+916>:	add	r0, r4, #104	; 0x68
>>> 0x2033ae04 <_fseeko+920>:	vst1.64	{d16-d17}, [r0]
>>> 0x2033ae08 <_fseeko+924>:	b	0x2033b070 <_fseeko+1540>
>>> 0x2033ae0c <_fseeko+928>:	cmp	r5, #0	; 0x0
>>> (gdb) info all-registers
>>> r0             0x20651ea4	543497892
>>> r1             0xffdf	65503
>>> r2             0x0	0
>>> r3             0x0	0
>>> r4             0x20651dcc	543497676
>>> r5             0x0	0
>>> r6             0x0	0
>>> r7             0x0	0
>>> r8             0x20359df4	540384756
>>> r9             0x0	0
>>> r10            0x0	0
>>> r11            0xbfbfb948	-1077954232
>>> r12            0x2037b208	540520968
>>> sp             0xbfbfb898	-1077954408
>>> lr             0x2035a004	540385284
>>> pc             0x2033adcc	540257740
>>> f0             0	(raw 0x000000000000000000000000)
>>> f1             0	(raw 0x000000000000000000000000)
>>> f2             0	(raw 0x000000000000000000000000)
>>> f3             0	(raw 0x000000000000000000000000)
>>> f4             0	(raw 0x000000000000000000000000)
>>> f5             0	(raw 0x000000000000000000000000)
>>> f6             0	(raw 0x000000000000000000000000)
>>> f7             0	(raw 0x000000000000000000000000)
>>> fps            0x0	0
>>> cpsr           0x60000010	1610612752
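The faulting address can be cross-checked from the dump above: the instruction just before the failing store is "add r0, r4, #216", so r0 should equal r4 + 216. A minimal sketch in C of that arithmetic; the helper names are illustrative and not part of libc or gdb:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the gdb register dump. */
#define R4_VALUE UINT32_C(0x20651dcc)
#define R0_VALUE UINT32_C(0x20651ea4)

/* r0 is r4 plus the immediate of the preceding add instruction. */
static_assert(R4_VALUE + 216u == R0_VALUE, "r0 == r4 + 216");

/* Hypothetical helper: true when addr is a multiple of align (a power of two). */
static bool is_aligned(uint32_t addr, uint32_t align)
{
    return (addr & (align - 1u)) == 0;
}

/* Address formed by "add r0, rN, #imm" in the disassembly. */
static uint32_t store_address(uint32_t base, uint32_t imm)
{
    return base + imm;
}
```

Here store_address(R4_VALUE, 216) gives 0x20651ea4, and is_aligned() reports that address as 4-byte aligned but not 8-byte aligned.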
>>
>> The syntax in use for vst1.64 instructions does not explicitly have
>> the alignment notation. Presuming that the decoding is correct then from
>> what I read the following applies:
>>
>>> Home > NEON and VFP Programming > NEON load and store element and
>>> structure instructions > Alignment restrictions in load and store,
>>> element and structure instructions
>>>
>>> . . . When the alignment is not specified in the instruction, the
>>> alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>   • if the A bit is 0, there are no alignment restrictions (except
>>>     for strongly ordered or device memory, where accesses must be
>>>     element aligned or the result is unpredictable)
>>>   • if the A bit is 1, accesses must be element aligned.
>>> If an address is not correctly aligned, an alignment fault occurs.
>>
>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error
>> would have the context to happen because of the mis-alignment.
>>
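Testing the A bit amounts to checking bit 1 of an SCTLR value. The sketch below only decodes a value that has already been obtained somehow. Reading SCTLR itself (mrc p15, 0, <Rt>, c1, c0, 0) is a privileged operation, so a user-mode program such as ar cannot do it directly. The function name is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SCTLR bit[1] is the A bit (alignment check enable). */
#define SCTLR_A_BIT (UINT32_C(1) << 1)

static_assert(SCTLR_A_BIT == 0x2, "A bit is bit[1]");

/* Hypothetical helper: true when the given SCTLR value enables alignment checking. */
static bool sctlr_alignment_checking(uint32_t sctlr)
{
    return (sctlr & SCTLR_A_BIT) != 0;
}
```

How the SCTLR value would be exposed to userland (for example through kernel debugging output) is left open here; nothing in this message establishes such an interface.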
>> The following shows the make.conf context that explains how
>> /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>>
>>> # more /etc/make.conf
>>> WRKDIRPREFIX=/usr/obj/portswork
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> MALLOC_PRODUCTION=
>>> #
>>> TO_TYPE=armv6
>>> TOOLS_TO_TYPE=arm-gnueabi
>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>> .if ${.MAKE.LEVEL} == 0
>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>> .export CC
>>> .export CXX
>>> .export CPP
>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>> .export AS
>>> .export AR
>>> .export LD
>>> .export NM
>>> .export OBJCOPY
>>> .export OBJDUMP
>>> .export RANLIB
>>> .export SIZE
>>> .export STRINGS
>>> .endif
>>
>> Other context:
>>
>>> # freebsd-version -ku; uname -aKU
>>> 11.0-CURRENT
>>> 11.0-CURRENT
>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015     root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG  arm 1100091 1100091
>>
>> I will note that world and kernel are my own build of -r292413
>> (earlier experiment) --a build made from an amd64 host context and put
>> in place via DESTDIR=. My expectation would be that the amd64 context
>> would not be likely to have similar alignment restrictions involved in
>> its ar activity (or other activity). That would explain how I got this
>> far using such a clang 3.7 related toolchain for targeting an rpi2
>> before finding such a problem.
>
> I realized on re-reading all of the above that it seems to suggest that
> the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar,
> but that was not my intent.
>
> libc.so.7 is from my buildworld, including the fseeko implementation:
>
> Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
> done.
> Loaded symbols for /lib/libc.so.7
>
> head/sys/sys/_types.h has:
>
> /*
> * mbstate_t is an opaque object to keep conversion state during multibyte
> * stream conversions.
> */
> typedef union {
>        char            __mbstate8[128];
>        __int64_t       _mbstateL;      /* for alignment */
> } __mbstate_t;
>
> suggesting an implicit alignment of the union to whatever the
> implementation defines for __int64_t --which need not be 8 byte
> alignment (in the abstract, general case). But 8 byte alignment is a
> possibility as well (in the abstract).
>
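The point about the union's alignment can be checked on any given target with C11's alignof. The sketch below uses a stand-in type (mbstate_like_t, with int64_t in place of __int64_t) rather than the real __mbstate_t, so it illustrates the language rule and is not a statement about any particular FreeBSD build:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for __mbstate_t; the type and member names are illustrative. */
typedef union {
    char    mbstate8[128];
    int64_t mbstateL;      /* for alignment */
} mbstate_like_t;

/* A union is as strictly aligned as its most strictly aligned member. */
static_assert(alignof(mbstate_like_t) == alignof(int64_t),
              "union alignment follows int64_t");

/* The size is set by the 128-byte array. */
static_assert(sizeof(mbstate_like_t) == 128, "size follows the char array");

/* Alignment the compiler assumes for the union on the target being compiled for. */
static size_t mbstate_like_alignment(void)
{
    return alignof(mbstate_like_t);
}
```

Whatever mbstate_like_alignment() returns for a target is what the compiler is entitled to assume when it expands a memset of such an object inline.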
> But printing *fp in gdb for the fp argument to _fseeko reports a
> not-8-byte-aligned address for __mbstate8, with the same misalignment as
> the address in r0 (r0 = 0x20651ea4 points 112 bytes into that 128-byte
> field):
>
>> (gdb) bt
>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, whence=<value optimized out>, ltest=<value optimized out>) at /usr/src/lib/libc/stdio/fseek.c:299
>> #1  0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
>> #2  0x00016138 in ?? ()
>> (gdb) print fp
>> $2 = (FILE *) 0x20651dcc
>> (gdb) print *fp
>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
>>  _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>>    _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>
> The overall FILE struct containing the _mbstate field is also not
> 8-byte aligned. But the offset from the start of the FILE struct to
> __mbstate8 is a multiple of 8 bytes.
>
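The arithmetic behind that observation, using the two addresses printed by gdb. This is a minimal sketch; the macro and function names are made up for the illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Addresses copied from the gdb output. */
#define FP_ADDR       UINT32_C(0x20651dcc)  /* fp, start of the FILE struct */
#define MBSTATE8_ADDR UINT32_C(0x20651e34)  /* fp->_mbstate.__mbstate8 */

/* The member offset is a multiple of 8, so the struct base decides the alignment. */
static_assert((MBSTATE8_ADDR - FP_ADDR) % 8u == 0, "offset is a multiple of 8");

/* Byte offset of a member from the start of its containing object. */
static uint32_t member_offset(uint32_t base, uint32_t member)
{
    return member - base;
}

/* Remainder of addr modulo 8; zero means 8-byte aligned. */
static uint32_t misalignment8(uint32_t addr)
{
    return addr % 8u;
}
```

member_offset(FP_ADDR, MBSTATE8_ADDR) is 104 (0x68), matching the "add r0, r4, #104" in the disassembly, while misalignment8() is 4 for both addresses: the field simply inherits the 4-byte skew of the FILE object it lives in.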
> It is my interpretation that there is nothing here to justify the
> memset implementation combination:
>
> SCTLR bit[1]==1
>
> mixed with
>
> vst1.64 instructions
>
> I.e.: one or both needs to change unless some way for forcing 8-byte
> alignment is introduced.
>
> I have not managed to track down anything that would indicate
> FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required
> by the design to be constant (once initialized).


I have (so far) removed the build tool crashes based on adding
-fmax-type-align=4 to avoid the misaligned accesses. Details follow.

src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now
looks like:

> # more ~/src.configs/src.conf.rpi2-clang.amd64-host
> TO_TYPE=armv6
> TOOLS_TO_TYPE=arm-gnueabi
> FROM_TYPE=amd64
> TOOLS_FROM_TYPE=x86_64
> VERSION_CONTEXT=11.0
> #
> KERNCONF=RPI2-NODBG
> TARGET=arm
> .if ${.MAKE.LEVEL} == 0
> TARGET_ARCH=${TO_TYPE}
> .export TARGET_ARCH
> .endif
> #
> WITHOUT_CROSS_COMPILER=
> #
> # For WITH_BOOT= . . .
> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
> WITHOUT_BOOT=
> #
> WITH_FAST_DEPEND=
> WITH_LIBCPLUSPLUS=
> WITH_CLANG=
> WITH_CLANG_IS_CC=
> WITH_CLANG_FULL=
> WITH_LLDB=
> WITH_CLANG_EXTRAS=
> #
> WITHOUT_LIB32=
> WITHOUT_GCC=
> WITHOUT_GNUCXX=
> #
> NO_WERROR=
> MALLOC_PRODUCTION=
> #CFLAGS+= -DELF_VERBOSE
> #
> WITH_DEBUG=
> WITH_DEBUG_FILES=
> #
> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related bintutils...
> #
> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
> X_COMPILER_TYPE=clang
> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
> .if ${.MAKE.LEVEL} == 0
> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> .export XCC
> .export XCXX
> .export XCPP
> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
> .export XAS
> .export XAR
> .export XLD
> .export XNM
> .export XOBJCOPY
> .export XOBJDUMP
> .export XRANLIB
> .export XSIZE
> .export XSTRINGS
> .endif
> #
> # Host compiler stuff:
> .if ${.MAKE.LEVEL} == 0
> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
> .export CC
> .export CXX
> .export CPP
> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
> .export AS
> .export AR
> .export LD
> .export NM
> .export OBJCOPY
> .export OBJDUMP
> .export RANLIB
> .export SIZE
> .export STRINGS
> .endif

make.conf for the on-rpi2 port builds now looks like:

> $ more /etc/make.conf
> WRKDIRPREFIX=/usr/obj/portswork
> WITH_DEBUG=
> WITH_DEBUG_FILES=
> MALLOC_PRODUCTION=
> #
> TO_TYPE=armv6
> TOOLS_TO_TYPE=arm-gnueabi
> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
> .if ${.MAKE.LEVEL} == 0
> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
> .export CC
> .export CXX
> .export CPP
> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
> .export AS
> .export AR
> .export LD
> .export NM
> .export OBJCOPY
> .export OBJDUMP
> .export RANLIB
> .export SIZE
> .export STRINGS
> .endif



===
Mark Millard
markmi at dsl-only.net





