Date: Fri, 25 Dec 2015 00:31:41 -0800
From: Mark Millard <markmi@dsl-only.net>
To: freebsd-arm@freebsd.org, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error?
Message-ID: <DB75F0D6-86CB-4383-8653-6017C76729F9@dsl-only.net>
In-Reply-To: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net>
References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net>

On 2015-Dec-24, at 10:39 PM, Mark Millard <markmi@dsl-only.net> wrote:

> [I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]
>
> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came from pkg install activity instead of port building. Used as-is.
>
> When I just tried my first from-rpi2b builds (ports for a rpi2b), /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the following suggests an alignment error for the type of instructions that memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to check SCTLR bit[1] to be directly sure that alignment was being enforced.)
>
> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :
>
>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>> Bus error (core dumped)
>> *** [libgnuintl.la] Error code 138
>
> It failed in _fseeko doing a memset that turned into uses of "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. From what I read such "VSTn (multiple n-element structures)" that have .64 require 8 byte alignment. The evidence of the code and register value follow.
>
>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>> . . .
>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, whence=<value optimized out>, ltest=<value optimized out>) at /usr/src/lib/libc/stdio/fseek.c:299
>> 299             memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>> . . .
>> (gdb) x/24i 0x2033adb0
>> 0x2033adb0 <_fseeko+836>:   vmov.i32  q8, #0  ; 0x00000000
>> 0x2033adb4 <_fseeko+840>:   movw      r1, #65503  ; 0xffdf
>> 0x2033adb8 <_fseeko+844>:   stm       r4, {r0, r7}
>> 0x2033adbc <_fseeko+848>:   ldrh      r0, [r4, #12]
>> 0x2033adc0 <_fseeko+852>:   and       r0, r0, r1
>> 0x2033adc4 <_fseeko+856>:   strh      r0, [r4, #12]
>> 0x2033adc8 <_fseeko+860>:   add       r0, r4, #216  ; 0xd8
>> 0x2033adcc <_fseeko+864>:   vst1.64   {d16-d17}, [r0]
>> 0x2033add0 <_fseeko+868>:   add       r0, r4, #200  ; 0xc8
>> 0x2033add4 <_fseeko+872>:   vst1.64   {d16-d17}, [r0]
>> 0x2033add8 <_fseeko+876>:   add       r0, r4, #184  ; 0xb8
>> 0x2033addc <_fseeko+880>:   vst1.64   {d16-d17}, [r0]
>> 0x2033ade0 <_fseeko+884>:   add       r0, r4, #168  ; 0xa8
>> 0x2033ade4 <_fseeko+888>:   vst1.64   {d16-d17}, [r0]
>> 0x2033ade8 <_fseeko+892>:   add       r0, r4, #152  ; 0x98
>> 0x2033adec <_fseeko+896>:   vst1.64   {d16-d17}, [r0]
>> 0x2033adf0 <_fseeko+900>:   add       r0, r4, #136  ; 0x88
>> 0x2033adf4 <_fseeko+904>:   vst1.64   {d16-d17}, [r0]
>> 0x2033adf8 <_fseeko+908>:   add       r0, r4, #120  ; 0x78
>> 0x2033adfc <_fseeko+912>:   vst1.64   {d16-d17}, [r0]
>> 0x2033ae00 <_fseeko+916>:   add       r0, r4, #104  ; 0x68
>> 0x2033ae04 <_fseeko+920>:   vst1.64   {d16-d17}, [r0]
>> 0x2033ae08 <_fseeko+924>:   b         0x2033b070 <_fseeko+1540>
>> 0x2033ae0c <_fseeko+928>:   cmp       r5, #0  ; 0x0
>> (gdb) info all-registers
>> r0             0x20651ea4   543497892
>> r1             0xffdf       65503
>> r2             0x0          0
>> r3             0x0          0
>> r4             0x20651dcc   543497676
>> r5             0x0          0
>> r6             0x0          0
>> r7             0x0          0
>> r8             0x20359df4   540384756
>> r9             0x0          0
>> r10            0x0          0
>> r11            0xbfbfb948   -1077954232
>> r12            0x2037b208   540520968
>> sp             0xbfbfb898   -1077954408
>> lr             0x2035a004   540385284
>> pc             0x2033adcc   540257740
>> f0             0            (raw 0x000000000000000000000000)
>> f1             0            (raw 0x000000000000000000000000)
>> f2             0            (raw 0x000000000000000000000000)
>> f3             0            (raw 0x000000000000000000000000)
>> f4             0            (raw 0x000000000000000000000000)
>> f5             0            (raw 0x000000000000000000000000)
>> f6             0            (raw 0x000000000000000000000000)
>> f7             0            (raw 0x000000000000000000000000)
>> fps            0x0          0
>> cpsr           0x60000010   1610612752
>
> The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:
>
>> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>>
>> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>   • if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
>>   • if the A bit is 1, accesses must be element aligned.
>> If an address is not correctly aligned, an alignment fault occurs.
>
> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would have the context to happen because of the mis-alignment.
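
The alignment claim in the quoted analysis can be checked by plain arithmetic on the values in the gdb output. The following is only a minimal sketch: the helper names (misalignment, store_address, aligned_store_count) are made up for illustration, and the only inputs taken from the dump are r4 = 0x20651dcc and the eight "add r0, r4, #offset" immediates.

```c
#include <assert.h>
#include <stdint.h>

/* Remainder of addr modulo align; align must be a power of two. */
static uint32_t misalignment(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1u)) == 0);
    return addr & (align - 1u);
}

/* Address formed by "add r0, r4, #offset" in the disassembly. */
static uint32_t store_address(uint32_t r4, uint32_t offset)
{
    return r4 + offset;
}

/* Count how many of the eight vst1.64 targets would be 8-byte aligned
 * for a given r4 (the offsets are the immediates from the x/24i listing). */
static int aligned_store_count(uint32_t r4)
{
    static const uint32_t offsets[] = {216, 200, 184, 168, 152, 136, 120, 104};
    int count = 0;
    for (unsigned i = 0; i < sizeof offsets / sizeof offsets[0]; i++) {
        if (misalignment(store_address(r4, offsets[i]), 8u) == 0)
            count++;
    }
    return count;
}
```

With r4 = 0x20651dcc, store_address(r4, 216) gives 0x20651ea4 (the r0 value at the fault) and misalignment(0x20651ea4, 8) is 4. Every offset is a multiple of 8 while r4 itself is 4 past an 8-byte boundary, so aligned_store_count(0x20651dcc) is 0: none of the eight stores would be 8-byte aligned, not just the first one executed.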
>
> The following shows the make.conf context that explains how /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>
>> # more /etc/make.conf
>> WRKDIRPREFIX=/usr/obj/portswork
>> WITH_DEBUG=
>> WITH_DEBUG_FILES=
>> MALLOC_PRODUCTION=
>> #
>> TO_TYPE=armv6
>> TOOLS_TO_TYPE=arm-gnueabi
>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>> .if ${.MAKE.LEVEL} == 0
>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>> .export CC
>> .export CXX
>> .export CPP
>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>> .export AS
>> .export AR
>> .export LD
>> .export NM
>> .export OBJCOPY
>> .export OBJDUMP
>> .export RANLIB
>> .export SIZE
>> .export STRINGS
>> .endif
>
>
> Other context:
>
>> # freebsd-version -ku; uname -aKU
>> 11.0-CURRENT
>> 11.0-CURRENT
>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015 root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm 1100091 1100091
>
>
>
> I will note that world and kernel are my own build of -r292413 (earlier experiment) --a build made from an amd64 host context and put in place via DESTDIR=.
> My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.

I realized on re-reading all of the above that it seems to suggest that the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar, but that was not my intent. libc.so.7 is from my buildworld, including the fseeko implementation:

Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
done.
Loaded symbols for /lib/libc.so.7

head/sys/sys/_types.h has:

/*
 * mbstate_t is an opaque object to keep conversion state during multibyte
 * stream conversions.
 */
typedef union {
        char            __mbstate8[128];
        __int64_t       _mbstateL;      /* for alignment */
} __mbstate_t;

suggesting an implicit alignment of the union to whatever the implementation defines for __int64_t --which need not be 8 byte alignment (in the abstract, general case). But 8 byte alignment is a possibility as well (in the abstract).

But printing *fp in gdb for the fp argument to _fseeko reports an address for __mbstate8 that is not 8-byte aligned either, 0x20651e34, which is consistent with r0 (r0 = 0x20651ea4 is __mbstate8 + 112, the 16-byte chunk that the faulting store targets):

> (gdb) bt
> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, whence=<value optimized out>, ltest=<value optimized out>) at /usr/src/lib/libc/stdio/fseek.c:299
> #1  0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
> #2  0x00016138 in ?? ()
> (gdb) print fp
> $2 = (FILE *) 0x20651dcc
> (gdb) print *fp
> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
>   _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>     _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}

The overall FILE struct containing the _mbstate field is also not 8-byte aligned. But the offset from the start of the FILE struct to __mbstate8 is a multiple of 8 bytes.

It is my interpretation that there is nothing here to justify the memset implementation combination:

SCTLR bit[1]==1 mixed with vst1.64 instructions

I.e.: one or both need to change unless some way of forcing 8-byte alignment is introduced.

I have not managed to track down anything that would indicate FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required by the design to be constant (once initialized).

===
Mark Millard
markmi at dsl-only.net
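
The two observations about the union and the FILE layout can also be checked mechanically. This is a sketch under stated assumptions, not FreeBSD code: mbstate_like_t is a hypothetical stand-in that copies the shape of __mbstate_t from the excerpt above (with int64_t in place of __int64_t), and the helper names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with the same shape as __mbstate_t in _types.h. */
typedef union {
    char    mbstate8[128];
    int64_t mbstateL;       /* for alignment */
} mbstate_like_t;

/* Alignment that the compiler in use assigns to the union. It equals the
 * alignment of int64_t on that target, which the C standard does not fix. */
static size_t mbstate_alignment(void)
{
    return _Alignof(mbstate_like_t);
}

/* Offset of a member, computed from two addresses that gdb reported. */
static uint32_t member_offset(uint32_t base, uint32_t member)
{
    assert(member >= base);
    return member - base;
}
```

Using the addresses from "print fp" and "print *fp", member_offset(0x20651dcc, 0x20651e34) is 104, a multiple of 8, which matches the statement that the offset is fine and that the misalignment comes from the FILE object's own address. Printing mbstate_alignment() when compiling for the target in question would show what alignment that compiler expects for the union.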