Date: Mon, 28 Dec 2015 00:01:02 -0800
From: Mark Millard <markmi@dsl-only.net>
To: Warner Losh <imp@bsdimp.com>
Cc: freebsd-arm <freebsd-arm@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>, Ian Lepore <ian@FreeBSD.org>, mat@FreeBSD.org, sbruno@FreeBSD.org
Subject: Re: 11.0-CURRENT (r292413) on a rpi2b: arm-gnueabi-freebsd/bin/ar, _fseeko, and memset vs memory alignment (SCTRL bit[1]=1?): Explains the Bus error?
Message-ID: <118D2970-4799-46B1-81A1-0101B907C1BE@dsl-only.net>
In-Reply-To: <D38C49E3-B622-49EA-9B30-3B1B2FA2E569@bsdimp.com>
References: <4CC6220D-72FB-45AD-AE70-5EB4EF0BCF5C@dsl-only.net> <DB75F0D6-86CB-4383-8653-6017C76729F9@dsl-only.net> <A338272B-982F-4E1F-B87D-1E33815EA212@dsl-only.net> <0D81C2CA-BF1C-4C14-B816-A8C5F68715B5@bsdimp.com> <51EB4AAB-BC81-4282-BA4D-D329C41D660B@dsl-only.net> <8B52074F-FDEF-4119-BB04-630F9BE9E6DB@bsdimp.com> <BBAAE33E-BD65-40A3-A0B3-F3346FC08112@dsl-only.net> <DC9EE7C3-2763-44EF-91DA-AFE63C48E537@dsl-only.net> <D38C49E3-B622-49EA-9B30-3B1B2FA2E569@bsdimp.com>
On 2015-Dec-26, at 8:45 AM, Warner Losh <imp@bsdimp.com> wrote:

> Thanks, it sounds like I fixed a bug, but there’s more.
>
> What were the specific ports so I can test it here?
>
> And to be clear, this is a buildworld on the RPi 2 using the cross-built world with CPUTYPE=armv7a or some such, right?
>
> Warner
>
>> On Dec 25, 2015, at 9:32 PM, Mark Millard <markmi@dsl-only.net> wrote:
>>
>> [I am again breaking off another section of older material.]
>>
>> Mixed news I'm afraid.
>>
>> The specific couple of ports that I attempted did build, the same ones that originally got the Bus Error in ar using (indirectly) _fseeko and memset that I reported. So I expect that you fixed one error.
>>
>> But when I tried to buildworld, clang++ 3.7 processing usr/src/lib/clang/libllvmtablegen/ materials quickly got a Bus Error at nearly the same type of instruction (it has a "!" below that the earlier one did not), but with r4 holding the misaligned address this time:
>>
>>> --- _bootstrap-tools-lib/clang/libllvmsupport ---
>>> --- APFloat.o ---
>>> clang++: error: unable to execute command: Bus error (core dumped)
>>> . . .
>>> # gdb clang++ usr/src/lib/clang/libllvmtablegen/clang++.core
>>> . . .
>>> Core was generated by `clang++'.
>>> Program terminated with signal 10, Bus error.
>>> #0 0x00c3bb9c in clang::DependentTemplateSpecializationType::DependentTemplateSpecializationType ()
>>> [New Thread 22a18000 (LWP 100128/<unknown>)]
>>> (gdb) x/40i 0x00c3bb60
>>> . . .
>>> 0xc3bb9c <_ZN5clang35DependentTemplateSpecializationTypeC2ENS_21ElaboratedTypeKeywordEPNS_19NestedNameSpecifierEPKNS_14IdentifierInfoEjPKNS_16TemplateArgumentENS_8QualTypeE+356>:
>>> vst1.64 {d16-d17}, [r4]!
>>> . . .
>>> (gdb) info all-registers
>>> r0 0xbfbf81a8 -1077968472
>>> r1 0x22f07e14 586186260
>>> r2 0xc416bc 12850876
>>> r3 0x2 2
>>> r4 0x22f07dfc 586186236
>>> . . .
>>
>>
>> Thus it appears that there is more code around that likely generates pointers not aligned so as to allow the code generation that is in use for what is pointed to.
>>
>> At this point I have no clue if the issue is just inside clang itself vs. if it is in something that clang is layered on top of. Nor if there is just one bad thing or many.
>>
>> Note: I had not yet tried buildworld/buildkernel for the context of the "-f" option that I was experimenting with earlier. So I do not have a direct compare and contrast at this point.

Somehow I did not notice your E-mail at the time. Meanwhile I've more evidence. . .

[Initial context for notes: Before updating to 11.0-CURRENT -r292756 and its clang/clang++ 3.7.1.]

Example c++ program that clang++ got an internal Bus Error for:

> # more main.cc
> #include <iostream>
> int
> main ()
> {
> std::ostream *o; return 0;
> }

Of course the include makes the source being processed non-trivial.

Going in a different direction. . . dmesg -a | grep "core dumped" on the rpi2 showed:

> pid 22238 (msgfmt), uid 0: exited on signal 11 (core dumped)
> pid 22250 (xgettext), uid 0: exited on signal 11 (core dumped)
> pid 22259 (msgmerge), uid 0: exited on signal 11 (core dumped)
> pid 26149 (msgfmt), uid 0: exited on signal 11 (core dumped)
> pid 26161 (xgettext), uid 0: exited on signal 11 (core dumped)
> pid 26170 (msgmerge), uid 0: exited on signal 11 (core dumped)
> pid 28826 (c++), uid 0: exited on signal 10 (core dumped)
> pid 29202 (c++), uid 0: exited on signal 10 (core dumped)
> pid 29282 (c++), uid 0: exited on signal 10 (core dumped)
> pid 29292 (clang++), uid 0: exited on signal 10 (core dumped)

Only the c++/clang++ contexts (same but for name) seemed to be leaving .core files behind.
The older log files also showed examples like the following from ports building activity:

> /var/log/dmesg.today:pid 18763 (conftest), uid 0: exited on signal 11 (core dumped)
> /var/log/dmesg.today:pid 18916 (conftest), uid 0: exited on signal 11 (core dumped)

(The original ar that I started with showed as well, the records went back that far at the time.)

[New -r292756 context. . .]

After the above I updated to:

> $ freebsd-version -ku; uname -aKU
> 11.0-CURRENT
> 11.0-CURRENT
> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #4 r292756M: Sun Dec 27 02:55:57 PST 2015 root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm 1100092 1100092

in order to pick up clang 3.7.1. I used -fmax-type-align=4 -mno-unaligned-access in the src.conf file for the buildworld buildkernel amd64->rpi2 cross build before installing both parts on the rpi2 media.

On the rpi2 itself the resulting c++/clang++ still gets Bus Error during buildworld despite the use of -fmax-type-align=4 -mno-unaligned-access in the amd64 hosted cross build (and in the rpi2 attempted rebuild). An example crash report is:

> /usr/bin/clang++ -B/usr/local/arm-gnueabi-freebsd/bin -march=armv7a -fmax-type-align=4 -mno-unaligned-access -O -pipe -mfloat-abi=softfp -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/tools/clang/include -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support -I.
> -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/../../lib/clang/include -DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -fno-strict-aliasing -DLLVM_DEFAULT_TARGET_TRIPLE=\"armv6-gnueabi-freebsd11.0\" -DLLVM_HOST_TRIPLE=\"armv6-unknown-freebsd11.0\" -DDEFAULT_SYSROOT=\"\" -MD -MP -MF.depend.APFloat.o -MTAPFloat.o -Qunused-arguments -I/usr/obj/clang/arm.armv6/usr/src/tmp/legacy/usr/include -std=c++11 -fno-exceptions -fno-rtti -stdlib=libc++ -Wno-c++11-extensions -c /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/APFloat.cpp -o APFloat.o
> clang++: error: unable to execute command: Bus error (core dumped)
> clang++: error: clang frontend command failed due to signal (use -v to see invocation)
> FreeBSD clang version 3.7.1 (tags/RELEASE_371/final 255217) 20151225
> Target: armv6--freebsd11.0-gnueabi
> Thread model: posix
> clang++: note: diagnostic msg: PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and include the crash backtrace, preprocessed source, and associated run script.
> clang++: note: diagnostic msg:
> ********************
>
> PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
> Preprocessed source(s) and associated run script(s) are located at:
> clang++: note: diagnostic msg: /tmp/APFloat-04544c.cpp
> clang++: note: diagnostic msg: /tmp/APFloat-04544c.sh
> clang++: note: diagnostic msg:
>
> ********************
> *** Error code 254
>
> Stop.
> make[3]: stopped in /usr/src/lib/clang/libllvmsupport
> *** Error code 1

An earlier -j 6 buildworld had failures for ARMBuildAttrs, APSInt, APInt, and Error before stopping, in addition to the APFloat indicated above (no -j makes for easier reading above):

> # ls -lt /tmp
> total 41516
> -rw-r--r-- 1 root wheel 4057 Dec 28 03:05 APFloat-04544c.sh
> -rw-r--r-- 1 root wheel 2155452 Dec 28 03:05 APFloat-04544c.cpp
> -rw-r--r-- 1 root wheel 4081 Dec 28 02:53 ARMBuildAttrs-432569.sh
> -rw-r--r-- 1 root wheel 1276912 Dec 28 02:53 ARMBuildAttrs-432569.cpp
> -rw-r--r-- 1 root wheel 3997 Dec 28 02:53 APSInt-a2927e.sh
> -rw-r--r-- 1 root wheel 1943445 Dec 28 02:53 APSInt-a2927e.cpp
> -rw-r--r-- 1 root wheel 3985 Dec 28 02:53 APInt-d0389a.sh
> -rw-r--r-- 1 root wheel 2115595 Dec 28 02:53 APInt-d0389a.cpp
> -rw-r--r-- 1 root wheel 4009 Dec 28 02:53 APFloat-33be1b.sh
> -rw-r--r-- 1 root wheel 2155452 Dec 28 02:53 APFloat-33be1b.cpp
> -rw-r--r-- 1 root wheel 4001 Dec 28 02:53 Error-777068.sh
> -rw-r--r-- 1 root wheel 1925065 Dec 28 02:53 Error-777068.cpp

The earlier "iostream" program example also still gets its Bus Error during its clang++ based compilation in this new -r292756 context.

The above -r292756 material avoids involving ports software with its own set of additional questions, compilers, tools, etc.: it sticks to buildworld/buildkernel material (and never gets to buildkernel).

When I tried -j 5 buildkernel by itself on the rpi2 there were no Bus Errors, no Segmentation Faults, and no core dumps. The buildkernel took about 51 minutes. (I've not tried installing what it built.)

(I have a SSD on a USB hub in use for world/root on the rpi2. The /etc/fstab on the micro-SD lists / as mounting from the SSD instead. I installkernel and installworld via the amd64 context to both the micro-SD and the SSD so that they track. I can boot from just the micro-SD if I want to but normally involve the SSD.)
Another kind of experiment would be to omit -fmax-type-align=4 but use -mno-unaligned-access (for handling any packed data structures) and see if buildkernel can still finish on the rpi2 (if enough of the amd64->rpi2 buildworld still operates on the rpi2 to allow the test).

A potential experiment for buildworld would be to use -fmax-type-align=1 with -mno-unaligned-access as the amd64->rpi2 cross build context. A misalignment Bus Error from that context might well be a clang++ code generation error of not paying attention to the alignment rules where clang++ should.

A potentially interesting (but independent) set of warnings during the buildkernel was:

> WARNING: hwpmc_mod.c: enum pmc_event has too many values: 2588 > 1023
> WARNING: hwpmc_logging.c: enum pmc_event has too many values: 2588 > 1023
> WARNING: hwpmc_soft.c: enum pmc_event has too many values: 2588 > 1023
> WARNING: hwpmc_arm.c: enum pmc_event has too many values: 2588 > 1023

(I've not investigated.)

Before this -r292756 update the following ports seemed to have built without generating core dumps or Bus Error reports or other such in the process:

devel/gettext-tools
devel/gmake-lite
devel/p5-Locale-gettext
lang/perl5.22
security/sudo

Note that I did not use make.conf to force -f. . . and -m. . . for these. But the test was if they could build, not if they operated correctly when built.

My guess is that they are primarily C instead of C++ and/or happen to avoid the parts of C++ where clang++ is having internal data structure alignment problems vs. SCTLR bit[1]==1.

Generally the pkg installs on the rpi2 seemed to have been operating okay. But they do not test compiling/linking with the system clang/clang++ involved.

In general building ports can have other issues that block completion so I had not tried much in that direction and happened to pick on a few things that worked (see above).
Getting through a self-hosting rpi2 buildworld buildkernel first likely is a better path before involving ports. But my way of working has involved using devel/arm-gnueabi-binutils, which seemed to build and work fine.

One thing of note from all my rpi2 builds: I've learned to avoid doing a "svnlite status /usr/src/" and similar commands. Fairly frequently they do not complete and each existing ssh connection to the rpi2 quits responding once some new program is attempted from the connection. The same for directly at the rpi2 (via USB devices).

Unfortunately /var/log/messages only shows the following boot, no messages from the hang-up context. I'd guess that USB (and other such?) communication stopped operating.

The src.conf on the rpi2 has (the amd64->rpi2 cross compile was very similar but the amd64-host-targets-self clang/clang++ commands do not need the -f. . . and -m. . . ):

> TO_TYPE=armv6
> TOOLS_TO_TYPE=arm-gnueabi
> FROM_TYPE=${TO_TYPE}
> TOOLS_FROM_TYPE=${TOOLS_TO_TYPE}
> VERSION_CONTEXT=11.0
> #
> KERNCONF=RPI2-NODBG
> TARGET=arm
> .if ${.MAKE.LEVEL} == 0
> TARGET_ARCH=${TO_TYPE}
> .export TARGET_ARCH
> .endif
> #
> WITHOUT_CROSS_COMPILER=
> #
> # For WITH_BOOT= . . . (amd64 cross compile context)
> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
> WITHOUT_BOOT=
> #
> WITH_FAST_DEPEND=
> WITH_LIBCPLUSPLUS=
> WITH_CLANG=
> WITH_CLANG_IS_CC=
> WITH_CLANG_FULL=
> WITH_LLDB=
> WITH_CLANG_EXTRAS=
> #
> WITHOUT_LIB32=
> WITHOUT_GCC=
> WITHOUT_GNUCXX=
> #
> NO_WERROR=
> MALLOC_PRODUCTION=
> #CFLAGS+= -DELF_VERBOSE
> #
> WITH_DEBUG=
> WITH_DEBUG_FILES=
> #
> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related bintutils...
> #
> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
> X_COMPILER_TYPE=clang
> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
> .if ${.MAKE.LEVEL} == 0
> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> .export XCC
> .export XCXX
> .export XCPP
> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
> .export XAS
> .export XAR
> .export XLD
> .export XNM
> .export XOBJCOPY
> .export XOBJDUMP
> .export XRANLIB
> .export XSIZE
> .export XSTRINGS
> .endif
> #
> # From clang (via system)...
> #
> .if ${.MAKE.LEVEL} == 0
> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin -march=armv7a -fmax-type-align=4 -mno-unaligned-access
> .export CC
> .export CXX
> .export CPP
> .endif
> #
> #
> # TOOLS_FROM_TYPE binutils from xtoolchain like context...
> #
> .if ${.MAKE.LEVEL} == 0
> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
> .export AS
> .export AR
> .export LD
> .export NM
> .export OBJCOPY
> .export OBJDUMP
> .export RANLIB
> .export SIZE
> .export STRINGS
> .endif

This technique does require devel/arm-gnueabi-binutils to have been built and operating okay on amd64 and later on the rpi2. So far I've no hints of any problems in that area.

The RPI2-NODBG config is shown below:

> # more /usr/src/sys/arm/conf/RPI2-NODBG
> ident RPI2-NODBG
>
> include "RPI2"
>
> makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
> options ALT_BREAK_TO_DEBUGGER
> #options VERBOSE_SYSINIT # Enable verbose sysinit messages
>
> options KDB # Enable kernel debugger support
>
> # For minimum debugger support (stable branch) use:
> #options KDB_TRACE # Print a stack trace for a panic
> options DDB # Enable the kernel debugger
>
> nooptions INVARIANTS # Enable calls of extra sanity checking
> nooptions INVARIANT_SUPPORT # Extra sanity checks of internal structures, required by INVARIANTS
> nooptions WITNESS # Enable checks to detect deadlocks and cycles
> nooptions WITNESS_SKIPSPIN # Don't run witness on spinlocks for speed
> nooptions DIAGNOSTIC

Most of my /usr/src/ tailoring is tied to powerpc and powerpc64 issues:

> # svnlite status /usr/src/
> ? /usr/src/.snap
> M /usr/src/contrib/libcxxrt/guard.cc
> M /usr/src/lib/csu/powerpc64/Makefile
> M /usr/src/lib/libc/stdio/findfp.c
> ? /usr/src/lib/libc/stdio/findfp.c.orig
> ? /usr/src/restoresymtable
> ? /usr/src/sys/arm/conf/RPI2-NODBG
> M /usr/src/sys/boot/ofw/Makefile.inc
> M /usr/src/sys/boot/powerpc/Makefile.inc
> M /usr/src/sys/boot/uboot/Makefile.inc
> ? /usr/src/sys/powerpc/conf/GENERIC64vtsc
> ? /usr/src/sys/powerpc/conf/GENERIC64vtsc-NODEBUG
> ? /usr/src/sys/powerpc/conf/GENERICvtsc
> ? /usr/src/sys/powerpc/conf/GENERICvtsc-NODEBUG
> M /usr/src/sys/powerpc/ofw/ofw_machdep.c

lib/libc/stdio/findfp.c has the patch I was asked to test.

contrib/libcxxrt/guard.cc is to avoid bad C++ source code (use of C11-specific notation in C++ that is reported as syntax errors in powerpc64-xtoolchain-gcc/powerpc64-gcc compilation contexts):

> # svnlite diff /usr/src/contrib/libcxxrt/guard.cc
> Index: /usr/src/contrib/libcxxrt/guard.cc
> ===================================================================
> --- /usr/src/contrib/libcxxrt/guard.cc (revision 292756)
> +++ /usr/src/contrib/libcxxrt/guard.cc (working copy)
> @@ -101,7 +101,7 @@
> uint32_t init_half;
> uint32_t lock_half;
> } guard_t;
> -_Static_assert(sizeof(guard_t) == sizeof(uint64_t), "");
> +//_Static_assert(sizeof(guard_t) == sizeof(uint64_t), "");
> static const uint32_t LOCKED = 1;
> static const uint32_t INITIALISED = static_cast<guard_lock_t>(1) << 24;
> # endif

The sys/boot/. . . examples are just use of -Wl, notation in LDFLAGS where the original notation was rejected, such as:

> # svnlite diff /usr/src/sys/boot/uboot/Makefile.inc
> Index: /usr/src/sys/boot/uboot/Makefile.inc
> ===================================================================
> --- /usr/src/sys/boot/uboot/Makefile.inc (revision 292756)
> +++ /usr/src/sys/boot/uboot/Makefile.inc (working copy)
> @@ -2,7 +2,7 @@
>
> .if ${MACHINE_ARCH} == "powerpc64"
> CFLAGS+= -m32 -mcpu=powerpc
> -LDFLAGS+= -m elf32ppc_fbsd
> +LDFLAGS+= -Wl,-m -Wl,elf32ppc_fbsd
> .endif
>
> .include "../Makefile.inc"

All 3 are powerpc64 specific changes.

===
Mark Millard
markmi at dsl-only.net

>
> Older material:
>
> On 2015-Dec-25, at 5:21 PM, Mark Millard <markmi@dsl-only.net> wrote:
>
>> On 2015-Dec-25, at 3:42 PM, Warner Losh <imp@bsdimp.com> wrote:
>>
>>
>>> On Dec 25, 2015, at 3:14 PM, Mark Millard <markmi@dsl-only.net> wrote:
>>>
>>> [I'm going to break much of the earlier "original material" text to tail of the message.]
>>>
>>>> On 2015-Dec-25, at 11:53 AM, Warner Losh <imp@bsdimp.com> wrote:
>>>>
>>>> So what happens if we actually fix the underlying bug?
>>>>
>>>> I see two ways of doing this. In findfp.c, we allocate an array of FILE * today like:
>>>> g = (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * sizeof(FILE));
>>>> but that assumes that FILE just has normal pointer alignment requirements. However,
>>>> due to the mbstate having int64_t alignment requirements, this is wrong. Maybe we
>>>> need to do something like
>>>> g = (struct glue *)malloc(sizeof(*g) + max(sizeof(int64_t),ALIGNBYTES) + n * sizeof(FILE));
>>>> which wouldn’t change anything on LP64 systems, but would result in proper alignment
>>>> for ILP32 systems.
>>>> We’d have to fix the loop that uses ALIGN afterwards to use
>>>> roundup. Instead, we’d need to round up to the nearest 8-byte aligned offset (or technically,
>>>> the max of ALIGNBYTES and 8, but that’s always 8 on today’s systems). If we do this,
>>>> we can make sure that each file is 8-byte aligned or better. We may need to round up
>>>> sizeof(FILE) to a multiple of 8 as well. I believe that since it has the 8-byte alignment
>>>> for a member, its size must be a multiple of 8, but I’ve not chased that belief to ground.
>>>> If not, we may need another decorator (__aligned(8), I think, spelled with the ugly
>>>> max expression above). That way, the contract we’re making with the compiler will
>>>> always be true. ALIGN BYTES is 4 on Arm anyway, so that bit is clearly wrong.
>>>>
>>>> This wouldn’t be an ABI change, since you can only get a valid FILE * from fopen (and
>>>> friends), plus stdin, stdout, and stderr. Those addresses aren’t hard coded into binaries,
>>>> so even if we have to tweak the last three and deal with some ‘fake’ FILE abuse in libc
>>>> (which I don’t think suffers from this issue, btw, given the alignment requirements that would
>>>> naturally follow from something on the stack), we’d still be ahead. At least for all CONFORMING
>>>> implementations[*]...
>>>>
>>>> TL;DR: Why not make FILE * always 8-byte aligned? The compiler options are a band-aid.
>>>>
>>>> Warner
>>>>
>>>> [*] There’s at least one popular package that has a copy of the FILE structure in one of its
>>>> .h files and uses that to do unnatural optimization things, but even that’s cool, I think,
>>>> since it never allocates a new one.
>>>>
>>>
>>> The ARM documentation mentions cases of 16 byte alignment requirements. I've no clue if the clang code generation ever creates such code.
>>> There might be wider requirements possible in arm code as well. (I'm not an arm expert.) As an example of an implication: "The malloc() function returns a pointer to a block of at least size bytes suitably aligned for any use." In other words: aligned to some figure that is a multiple of *every* alignment requirement that the code generator can produce, possibly being the least common multiple.
>>>
>>> "-fmax-type-align=. . ." is a means of controlling/limiting the range of potential alignments to no more than a fixed, predefined value. Above that and the code generation has to work in small size accesses and build-up/split-up bigger values. Using "-fmax-type-align=. . ." allows defining a figure as part of an ABI that is then not subject to code generator updates that could increase the maximum alignment figure and break things: It turns off such new capabilities. Other options need not work that way to preserve the ABI.
>>
>> That’s true, as far as it goes… But I’m not sure it goes far enough. The premise here is that the problem is wide-spread, when in fact I think it is quite narrow.
>>
>>> But in the most fundamental terms process wise as far as I can tell. . .
>>>
>>> While the FILE case that occurred is a specific example, every memory-allocation-like operation is at a potential issue for all such "allocated" objects where the related code generation requires alignment to avoid Bus Error (given the SCTLR bit[1] in use).
>>
>> The problem isn’t general. The problem isn’t malloc. Malloc will generally return the right thing on arm (and if it doesn’t,
>> then we need to make sure it does).
>>
>> The problem is we get a boatload of FILEs from the system all at once, and those are misaligned because of a bug in the code. One that’s fixed, I believe, in https://reviews.freebsd.org/D4708.
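
A minimal sketch of the arithmetic behind that misalignment, not the libc code and not the D4708 patch. The address is the fp value gdb reports for _fseeko in the original material quoted at the end of this message; round_up, is_aligned and report are hypothetical helpers introduced only for illustration, and the 4-byte and 8-byte granularities are the two cases discussed above (what the thread says the existing rounding gives on Arm versus what an int64_t member needs):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: round addr up to the next multiple of align.
 * align must be a power of two. */
uintptr_t
round_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr + align - 1) & ~(align - 1));
}

/* Hypothetical helper: nonzero when addr is a multiple of align
 * (align a power of two). */
int
is_aligned(uintptr_t addr, uintptr_t align)
{
	return ((addr & (align - 1)) == 0);
}

/* Print how one FILE address fares under both rounding granularities. */
void
report(uintptr_t fp)
{
	printf("fp           = 0x%lx (8-byte aligned: %d)\n",
	    (unsigned long)fp, is_aligned(fp, 8));
	printf("rounded to 4 = 0x%lx\n", (unsigned long)round_up(fp, 4));
	printf("rounded to 8 = 0x%lx\n", (unsigned long)round_up(fp, 8));
}
```

Calling report(0x20651dcc) shows that rounding to 4 leaves the address unchanged while rounding to 8 moves it to 0x20651dd0. The same 4-byte residue carries through to fp + 216, which is the 0x20651ea4 held in r0 at the faulting vst1.64.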
>>
>>
>>> How many other places in FreeBSD might sometimes return mis-aligned pointers for the existing code generation and ABI combination?
>>
>> It isn’t an ABI thing, just a code bug thing. The only reason it was an issue was due to the optimizing nature of clang.
>>
>> We’ve had to deal with the arm alignment issues for years. I wager there are very few indeed. The only reason this was brought to light was better code-gen from clang.
>>
>>> How many other places are subject to breakage when "internal" structs/unions/fields involved are changed to be of a different size because the code is not fully auto-adjusting to match the code generation yet --even if right now "it works"? How fragile will things be for future work?
>>
>> If there are others, I’ll bet they could be counted on one hand since very few things do the ‘slab’ allocator that FILE does.
>>
>>> What would it take to find out and deal with them all? (I do not have the background knowledge to span much.)
>>>
>>> My experiment avoided potentially changing parts of the ABI and also avoided dealing with such a "lots of code to investigate" issue. It may not be the long term 11.0-RELEASE solution. Even if not, it may be appropriate for various temporary purposes that need to avoid Bus Errors in the process. For example if Ian has a good reason to use clang 3.7 instead of gcc 4.2.1.
>>
>> The review above doesn’t change the ABI either.
>>
>>> Other notes:
>>>
>>>> I believe that since it has the 8-byte alignment
>>>> for a member, its size must be a multiple of 8
>>>
>>> There are some C/C++ language rules about the address of a structure equalling the address of the first field, uniformity of the offsets, and the like. But. . .
>>>
>>> The C and C++ languages specify no specific numerical alignment figures, not even relative to specific sizeof(...) expressions.
>>> To use an old example: a 68010 only needs alignment for >= 2 byte things and even alignment is all that is then required. Some other contexts take a lot more to meet the specifications. There are some implications of the modern memory model(s) created to cover concurrency explicitly, such as avoiding interactions that can happen via, for example, separate objects (in part) sharing a cache line. (I've only looked at C++ for this, and only to a degree.)
>>>
>>> The detailed alignment rules are more "implementation defined" than "predefined by the standard". But the definition is trying to meet language criteria. It is not a fully independent choice.
>>
>> Many of them are actually defined by a combination of the standard language definition, as well as the ABI standard. This is why we know that mbstate_t must be 8 byte aligned.
>>
>>> Maybe some other standards that FreeBSD is tied to specify more specifics, such as a N byte integer always aligns to some multiple of N (a waste on the 68010), including the alignment for union or struct that it may be a part of tracking. But such rules force padding that may or may not be required to meet the language's more abstract criteria and such rules may not match the existing/in-use ABI.
>>
>> It is all spelled out in the ARM EABI docs.
>>
>>> So far as I can tell explicitly declared alignments may well be necessary. If that one "popular package", say, formed an array of FILE copies then the resultant alignments need not all match the ones produced by your example code unless the FILE declaration forces the compiler to match, causing sizeof(FILE) to track as well. FILE need not be the only such issue.
>>
>> Arrays of FILEs aren’t an issue (except that it encodes the size of FILE into the app). It’s the specifically quirky way that libc does it that’s the problem.
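
The "size must be a multiple of 8" belief from earlier in the thread can be checked mechanically. A minimal sketch, assuming a C11 compiler: demo_mbstate and demo_file are hypothetical stand-ins (a 128-byte union with an int64_t member, as the thread describes mbstate_t, placed after a few narrower members), not the real libc definitions, and demo_mbstate_offset is likewise invented for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mbstate_t: 128 bytes, with an int64_t
 * member that raises the union's alignment requirement. */
union demo_mbstate {
	char	bytes[128];
	int64_t	wide;
};

/* Hypothetical stand-in for FILE: narrow members followed by the
 * wide-aligned one. */
struct demo_file {
	unsigned char		*p;
	short			 flags;
	union demo_mbstate	 mbstate;
};

/* C requires array elements to be contiguous and each one properly
 * aligned, so sizeof(T) is always a multiple of _Alignof(T). */
_Static_assert(sizeof(struct demo_file) % _Alignof(struct demo_file) == 0,
    "struct size is a multiple of its alignment");

/* A struct is at least as strictly aligned as each of its members. */
_Static_assert(_Alignof(struct demo_file) >= _Alignof(int64_t),
    "struct is at least as aligned as its int64_t member");

/* Offset of the wide member inside the struct; the compiler pads
 * so that it is a multiple of the member's alignment. */
size_t
demo_mbstate_offset(void)
{
	size_t off = offsetof(struct demo_file, mbstate);

	assert(off % _Alignof(union demo_mbstate) == 0);
	return (off);
}
```

If the real FILE behaves like this stand-in, then on an ABI where int64_t is 8-byte aligned its size is already a multiple of 8, and only the start of the array needs correcting, not the stride between elements.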
>>
>>> My background and reference material are mostly tied to the languages --and so my notes tend to be limited to that much context.
>>
>> Understood. While there may be issues with alignment still, tossing a big hammer at the problem because they might exist will likely mean they will persist far longer than fixing them one at a time. When we first ported to arm, there were maybe half a dozen places that needed fixing. I doubt there’s more now.
>>
>> Can you try the patch in the above code review w/o the -f switch and let me know if it works for you?
>>
>> Warner
>
> buildworld/buildkernel has been started on amd64 for a rpi2 target. That and install kernel/world and starting up a port rebuild on the rpi2 and waiting for it means it will be a few hours even if I start the next thing just as each prior thing finishes. I may give up and go to sleep first.
>
> As for presumptions: I'll take your word on expected status of things. I've no clue. But absent even the hearsay status information at the time I did not presume that what was in front of me was all there is to worry about --nor did I try to go figure it all out on my own. I took a path to cover both possibilities for local-only vs. more-wide-spread (so long as that path did not force a split-up of some larger form of atomic action).
>
> In my view "-mno-unaligned-access" is an even bigger hammer than I used. I find no clang statement about what its ABI consequences would be, unlike for what I did: What mix of more padding for alignment vs. more but smaller accesses? But as I remember I've seen "-mno-unaligned-access" in use in ports and the like so its consequences may be familiar material for some folks.
>
> Absent any questions about ABI consequences "-mno-unaligned-access" does well mark the expected SCTLR bit[1] status, far better than what I did.
> Again: I was covering my ignorance while making any significant investigation/debugging as unlikely as I could.
>
>
>> Original material:
>>
>>> On Dec 25, 2015, at 7:24 AM, Mark Millard <markmi@dsl-only.net> wrote:
>>>
>>> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 11.0-CURRENT 292413 from amd64 based on adding -fmax-type-align=4 has so far removed the crashes during the toolchain activity: no more misaligned accesses in libc's _fseeko or elsewhere.]
>>>
>>> On 2015-Dec-25, at 12:31 AM, Mark Millard <markmi@dsl-only.net> wrote:
>>>
>>>> On 2015-Dec-24, at 10:39 PM, Mark Millard <markmi@dsl-only.net> wrote:
>>>>
>>>>> [I do not know if this partial crash analysis related to on-arm clang-associated activity is good enough and appropriate to submit or not.]
>>>>>
>>>>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came from pkg install activity instead of port building. Used as-is.
>>>>>
>>>>> When I just tried my first from-rpi2b builds (ports for a rpi2b), /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the following suggests an alignment error for the type of instructions that memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to check SCTLR bit[1] to be directly sure that alignment was being enforced.)
>>>>>
>>>>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar:
>>>>>
>>>>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>>>>>> Bus error (core dumped)
>>>>>> *** [libgnuintl.la] Error code 138
>>>>>
>>>>> It failed in _fseeko doing a memset that turned into uses of "vst1.64 {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 0xa4, so was not aligned to 8 byte boundaries. From what I read such "VSTn (multiple n-element structures)" that have .64 require 8 byte alignment. The evidence of the code and register value follow.
>>>>>
>>>>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>>>>>> . . .
>>>>>> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, whence=<value optimized out>, ltest=<value optimized out>) at /usr/src/lib/libc/stdio/fseek.c:299
>>>>>> 299 memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>>>>>> . . .
>>>>>> (gdb) x/24i 0x2033adb0
>>>>>> 0x2033adb0 <_fseeko+836>: vmov.i32 q8, #0 ; 0x00000000
>>>>>> 0x2033adb4 <_fseeko+840>: movw r1, #65503 ; 0xffdf
>>>>>> 0x2033adb8 <_fseeko+844>: stm r4, {r0, r7}
>>>>>> 0x2033adbc <_fseeko+848>: ldrh r0, [r4, #12]
>>>>>> 0x2033adc0 <_fseeko+852>: and r0, r0, r1
>>>>>> 0x2033adc4 <_fseeko+856>: strh r0, [r4, #12]
>>>>>> 0x2033adc8 <_fseeko+860>: add r0, r4, #216 ; 0xd8
>>>>>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033add0 <_fseeko+868>: add r0, r4, #200 ; 0xc8
>>>>>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033add8 <_fseeko+876>: add r0, r4, #184 ; 0xb8
>>>>>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033ade0 <_fseeko+884>: add r0, r4, #168 ; 0xa8
>>>>>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033ade8 <_fseeko+892>: add r0, r4, #152 ; 0x98
>>>>>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033adf0 <_fseeko+900>: add r0, r4, #136 ; 0x88
>>>>>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033adf8 <_fseeko+908>: add r0, r4, #120 ; 0x78
>>>>>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033ae00 <_fseeko+916>: add r0, r4, #104 ; 0x68
>>>>>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0]
>>>>>> 0x2033ae08 <_fseeko+924>: b 0x2033b070 <_fseeko+1540>
>>>>>> 0x2033ae0c <_fseeko+928>: cmp r5, #0 ; 0x0
>>>>>> (gdb) info all-registers
>>>>>> r0 0x20651ea4 543497892
>>>>>> r1 0xffdf 65503
>>>>>> r2 0x0 0
>>>>>> r3 0x0 0
>>>>>> r4 0x20651dcc 543497676
>>>>>> r5 0x0 0
>>>>>> r6 0x0 0
>>>>>> r7 0x0 0
>>>>>> r8 0x20359df4 540384756
>>>>>> r9 0x0 0
>>>>>> r10 0x0 0
>>>>>> r11 0xbfbfb948 -1077954232
>>>>>> r12 0x2037b208 540520968
>>>>>> sp 0xbfbfb898 -1077954408
>>>>>> lr 0x2035a004 540385284
>>>>>> pc 0x2033adcc 540257740
>>>>>> f0 0 (raw 0x000000000000000000000000)
>>>>>> f1 0 (raw 0x000000000000000000000000)
>>>>>> f2 0 (raw 0x000000000000000000000000)
>>>>>> f3 0 (raw 0x000000000000000000000000)
>>>>>> f4 0 (raw
0x000000000000000000000000)
>>>>>> f5 0 (raw 0x000000000000000000000000)
>>>>>> f6 0 (raw 0x000000000000000000000000)
>>>>>> f7 0 (raw 0x000000000000000000000000)
>>>>>> fps 0x0 0
>>>>>> cpsr 0x60000010 1610612752
>>>>>
>>>>> The syntax in use for vst1.64 instructions does not explicitly have the alignment notation. Presuming that the decoding is correct then from what I read the following applies:
>>>>>
>>>>>> Home > NEON and VFP Programming > NEON load and store element and structure instructions > Alignment restrictions in load and store, element and structure instructions
>>>>>>
>>>>>> . . . When the alignment is not specified in the instruction, the alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>>>> • if the A bit is 0, there are no alignment restrictions (except for strongly ordered or device memory, where accesses must be element aligned or the result is unpredictable)
>>>>>> • if the A bit is 1, accesses must be element aligned.
>>>>>> If an address is not correctly aligned, an alignment fault occurs.
>>>>>
>>>>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would have the context to happen because of the mis-alignment.
>>>>>
>>>>> The following shows the make.conf context that explains how /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>>>>>
>>>>>> # more /etc/make.conf
>>>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>>>> WITH_DEBUG=
>>>>>> WITH_DEBUG_FILES=
>>>>>> MALLOC_PRODUCTION=
>>>>>> #
>>>>>> TO_TYPE=armv6
>>>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>>>> .if ${.MAKE.LEVEL} == 0
>>>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>>> .export CC
>>>>>> .export CXX
>>>>>> .export CPP
>>>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>>>> .export AS
>>>>>> .export AR
>>>>>> .export LD
>>>>>> .export NM
>>>>>> .export OBJCOPY
>>>>>> .export OBJDUMP
>>>>>> .export RANLIB
>>>>>> .export SIZE
>>>>>> .export STRINGS
>>>>>> .endif
>>>>>
>>>>>
>>>>> Other context:
>>>>>
>>>>>> # freebsd-version -ku; uname -aKU
>>>>>> 11.0-CURRENT
>>>>>> 11.0-CURRENT
>>>>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 22:02:21 PST 2015 root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG arm 1100091 1100091
>>>>>
>>>>>
>>>>>
>>>>> I will note that world and kernel are my own build of
-r292413 (earlier experiment) --a build made from an amd64 host context and put in place via DESTDIR=. My expectation would be that the amd64 context would not be likely to have similar alignment restrictions involved in its ar activity (or other activity). That would explain how I got this far using such a clang 3.7 related toolchain for targeting an rpi2 before finding such a problem.
>>>>
>>>>
>>>> I realized re-reading all the above that it seems to suggest that the _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but that was not my intent.
>>>>
>>>> libc.so.7 is from my buildworld, including the fseeko implementation:
>>>>
>>>> Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
>>>> done.
>>>> Loaded symbols for /lib/libc.so.7
>>>>
>>>>
>>>> head/sys/sys/_types.h has:
>>>>
>>>> /*
>>>>  * mbstate_t is an opaque object to keep conversion state during multibyte
>>>>  * stream conversions.
>>>>  */
>>>> typedef union {
>>>>         char __mbstate8[128];
>>>>         __int64_t _mbstateL; /* for alignment */
>>>> } __mbstate_t;
>>>>
>>>> suggesting an implicit alignment of the union to whatever the implementation defines for __int64_t --which need not be 8 byte alignment (in the abstract, general case). But 8 byte alignment is a possibility as well (in the abstract).
>>>>
>>>> But printing *fp in gdb for the fp argument to _fseeko reports the same not-8-byte aligned address for __mbstate8 that was in r0:
>>>>
>>>>> (gdb) bt
>>>>> #0 0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, whence=<value optimized out>, ltest=<value optimized out>) at /usr/src/lib/libc/stdio/fseek.c:299
>>>>> #1 0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at /usr/src/lib/libc/stdio/fseek.c:82
>>>>> #2 0x00016138 in ??
()
>>>>> (gdb) print fp
>>>>> $2 = (FILE *) 0x20651dcc
>>>>> (gdb) print *fp
>>>>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, _close = 0x20359dfc <__sclose>,
>>>>> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>>>>> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>>>>
>>>> The overall FILE struct containing the _mbstate field is also not 8-byte aligned. But the offset from the start of the FILE struct to __mbstate8 is a multiple of 8 bytes.
>>>>
>>>> It is my interpretation that there is nothing here to justify the memset implementation combination:
>>>>
>>>> SCTLR bit[1]==1
>>>>
>>>> mixed with
>>>>
>>>> vst1.64 instructions
>>>>
>>>> I.e.: one or both needs to change unless some way for forcing 8-byte alignment is introduced.
>>>>
>>>> I have not managed to track down anything that would indicate FreeBSD's intent for SCTLR bit[1]. I do not even know if it is required by the design to be constant (once initialized).
>>>
>>>
>>> I have (so far) removed the build tool crashes based on adding -fmax-type-align=4 to avoid the misaligned accesses. Details follow.
>>>
>>> src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now looks like:
>>>
>>>> # more ~/src.configs/src.conf.rpi2-clang.amd64-host
>>>> TO_TYPE=armv6
>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>> FROM_TYPE=amd64
>>>> TOOLS_FROM_TYPE=x86_64
>>>> VERSION_CONTEXT=11.0
>>>> #
>>>> KERNCONF=RPI2-NODBG
>>>> TARGET=arm
>>>> .if ${.MAKE.LEVEL} == 0
>>>> TARGET_ARCH=${TO_TYPE}
>>>> .export TARGET_ARCH
>>>> .endif
>>>> #
>>>> WITHOUT_CROSS_COMPILER=
>>>> #
>>>> # For WITH_BOOT= . . .
>>>> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
>>>> WITHOUT_BOOT=
>>>> #
>>>> WITH_FAST_DEPEND=
>>>> WITH_LIBCPLUSPLUS=
>>>> WITH_CLANG=
>>>> WITH_CLANG_IS_CC=
>>>> WITH_CLANG_FULL=
>>>> WITH_LLDB=
>>>> WITH_CLANG_EXTRAS=
>>>> #
>>>> WITHOUT_LIB32=
>>>> WITHOUT_GCC=
>>>> WITHOUT_GNUCXX=
>>>> #
>>>> NO_WERROR=
>>>> MALLOC_PRODUCTION=
>>>> #CFLAGS+= -DELF_VERBOSE
>>>> #
>>>> WITH_DEBUG=
>>>> WITH_DEBUG_FILES=
>>>> #
>>>> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related bintutils...
>>>> #
>>>> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
>>>> X_COMPILER_TYPE=clang
>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>> .if ${.MAKE.LEVEL} == 0
>>>> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> .export XCC
>>>> .export XCXX
>>>> .export XCPP
>>>> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>> .export XAS
>>>> .export XAR
>>>> .export XLD
>>>> .export XNM
>>>> .export XOBJCOPY
>>>> .export XOBJDUMP
>>>> .export XRANLIB
>>>> .export XSIZE
>>>> .export XSTRINGS
>>>> .endif
>>>> #
>>>> # Host compiler stuff:
>>>> .if ${.MAKE.LEVEL} == 0
>>>> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>>> .export CC
>>>> .export CXX
>>>> .export CPP
>>>> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
>>>> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
>>>> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
>>>> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
>>>> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
>>>>
OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
>>>> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
>>>> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
>>>> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
>>>> .export AS
>>>> .export AR
>>>> .export LD
>>>> .export NM
>>>> .export OBJCOPY
>>>> .export OBJDUMP
>>>> .export RANLIB
>>>> .export SIZE
>>>> .export STRINGS
>>>> .endif
>>>
>>> make.conf for during the on-rpi2 port builds now looks like:
>>>
>>>> $ more /etc/make.conf
>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>> WITH_DEBUG=
>>>> WITH_DEBUG_FILES=
>>>> MALLOC_PRODUCTION=
>>>> #
>>>> TO_TYPE=armv6
>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>> .if ${.MAKE.LEVEL} == 0
>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a -fmax-type-align=4
>>>> .export CC
>>>> .export CXX
>>>> .export CPP
>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>> .export AS
>>>> .export AR
>>>> .export LD
>>>> .export NM
>>>> .export OBJCOPY
>>>> .export OBJDUMP
>>>> .export RANLIB
>>>> .export SIZE
>>>> .export
STRINGS
>>>> .endif
>>>
>>>
>>>
>>> ===
>>> Mark Millard
>>> markmi at dsl-only.net
>>>
>>>
>>>
>>> _______________________________________________
>>> freebsd-toolchain@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
>>> To unsubscribe, send any mail to "freebsd-toolchain-unsubscribe@freebsd.org"
>
>
>